Tidy Tuesday - Fiscal Sponsors

12th Mar 2024

Author

Arun Koundinya Parasa

Published

March 21, 2024

0. Loading Libraries

import pandas as pd
from dfply import *
from plotnine import *
import ssl

1. Data Loading and Data Manipulation

## Workaround: reading the CSV raised an SSL certificate error on my local machine,
## so the line below disables certificate verification
ssl._create_default_https_context = ssl._create_unverified_context

data = pd.read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-03-12/fiscal_sponsor_directory.csv')

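data = ( data >> 
        mutate( 
               # years as a fiscal sponsor, measured against 2022
               fiscal_sponsor_since = 2022 - X.year_fiscal_sponsor,
               # first percentage-like token in the description, e.g. "5-10%" or "7.5%"
               fee = X['fiscal_sponsorship_fee_description'].str.extract(r'(\d[-+]?%?.?\d?%)')
               )
        ) >>  ( 
               mutate(
    # keep only the leading number of that token; the original possessive
    # quantifiers (?+) are accepted only by Python 3.11+
    fee = X['fee'].str.extract(r'(^\d+\.?\d*)'))
               )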

data['fee'] = data['fee'].astype(float)
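
To make the fee extraction easier to follow, here is a minimal pandas-only sketch of the same idea. The description strings are made up for illustration and are not rows from the dataset, and the single pattern used here is a simplified stand-in for the two-step extraction above, not the exact regex used in this post.

```python
import pandas as pd

# Hypothetical fee descriptions (assumed examples, not real dataset rows)
descriptions = pd.Series([
    "10% of all revenue",
    "Between 5% and 15% depending on services",
    "7.5% of gross income",
    "Negotiated case by case",
])

# Capture the first number that is followed by a percent sign;
# rows without any percentage become NaN
fee = descriptions.str.extract(r'(\d+(?:\.\d+)?)\s*%', expand=False).astype(float)

print(fee.tolist())
```

When a description gives a range, only the first (lowest) percentage is kept, so the extracted value is best read as the starting fee.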

2. Plotting

2.1. Total Sponsorships Duration and Projects

Total Sponsorships Duration and Projects Plot Code
(data >> 
 ggplot(aes(x= "fiscal_sponsor_since",y="n_sponsored",color="year_fiscal_sponsor")) +
 geom_point() + 
 theme_minimal() + 
 labs(
     x = "Years since Fiscal Sponsorship",
     y = "Number of Sponsored Projects",
     title = "Relationship between Total Sponsorships and Duration of Sponsorship",
     subtitle = "Color shows the year fiscal sponsorship began",
     caption = "Axes are zoomed in for better readability"
 ) +
  guides(color=guide_legend(title="")) +  # Use an empty string to remove the legend title
  theme(
      legend_position=(0.5, 0.85),
      panel_grid_minor_x = element_blank(),
      panel_grid_minor_y = element_blank()
  ) + 
   coord_cartesian( 
    ylim = (0, 400),
    xlim = (0,75)
    ) + 
   annotate(
       "text",
       x = 60,
       y = 200,
       label = "Older sponsors don't necessarily have \n more sponsored projects ",
       color = "brown",
       size = 8
       )
 )

Sponsors who began sponsoring in recent years tend to have fewer sponsored projects, which matches intuition. However, sponsors who started fifty years ago have a number of projects comparable to that of more recent sponsors. Looking only at sponsors from the last twenty years, the number of projects appears to grow roughly linearly with the number of years since sponsorship began.
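
One way to check the "roughly linear" impression numerically is to compute a correlation on the recent sponsors only. The sketch below assumes the column names created earlier (`fiscal_sponsor_since`, `n_sponsored`). The helper `recent_correlation` and the toy rows are hypothetical and exist only to show the mechanics.

```python
import pandas as pd

def recent_correlation(df, max_years=20):
    """Pearson correlation between sponsorship duration and project count,
    restricted to sponsors that started within the last `max_years` years."""
    recent = df[df["fiscal_sponsor_since"].between(0, max_years)]
    return recent["fiscal_sponsor_since"].corr(recent["n_sponsored"])

# Made-up rows for illustration only (not taken from the dataset)
toy = pd.DataFrame({
    "fiscal_sponsor_since": [1, 5, 10, 15, 20, 50],
    "n_sponsored": [2, 10, 20, 30, 40, 5],
})

r = recent_correlation(toy)
print(round(r, 3))
```

On the real data, calling `recent_correlation(data)` would give a single number to set beside the scatter plot. A value close to 1 would support the linear reading, while a weak value would suggest the pattern is mostly visual.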

2.2. Histogram of Total Sponsored Projects

Histogram of Total Sponsored Projects Plot Code
(data >> 
 ggplot(aes(x="n_sponsored")) +
 geom_histogram(position="dodge",fill = "#4169E1") + 
 theme_minimal() +
 labs (
     x = "Number of Sponsored Projects",
     y = "",
     title = "Histogram of total Sponsored projects"
 ) + 
 theme(
      panel_grid_major_y = element_blank(),
      panel_grid_minor_x = element_blank(),
      panel_grid_minor_y = element_blank()
  ) + 
   annotate(
       "text",
       x = 200,
       y = 80,
       label = "Most sponsors have sponsored \n only one or two projects ",
       color = "brown",
       size = 8
       )
 )

Most sponsors have one or two projects, while a few have a significantly higher number, indicating a long tail of active involvement.
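
The long-tail claim can also be expressed as numbers rather than read off the histogram. This is a small sketch on made-up counts. The helper name `share_at_most` is hypothetical, and the same calls could be applied to `data["n_sponsored"]`.

```python
import pandas as pd

def share_at_most(counts, k):
    """Fraction of non-missing values that are less than or equal to k."""
    counts = counts.dropna()
    return (counts <= k).mean()

# Made-up project counts for illustration (not taken from the dataset)
toy_counts = pd.Series([1, 1, 2, 2, 1, 50, 300, None])

share = share_at_most(toy_counts, 2)
print(f"Share with at most two projects: {share:.0%}")

# A mean far above the median is a typical sign of a long right tail
print("median:", toy_counts.median(), "mean:", toy_counts.mean())
```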

2.3. Histogram of Fiscal Sponsorship Fee

Histogram of Fiscal Sponsorship Fee Plot Code
(data >> 
 ggplot(aes(x="fee")) +
 geom_histogram(position="dodge", fill = "#708090") +
 theme_minimal() +
 labs (
     x = "Sponsorship Fee (%)",
     y = "",
     title = "Histogram of fees associated with fiscal sponsorship",
     caption = "Axis is zoomed in for better readability"
 ) +  
 theme(
      panel_grid_major_y = element_blank(),
      panel_grid_minor_x = element_blank(),
      panel_grid_minor_y = element_blank()
  ) + 
   coord_cartesian( 
    xlim = (0, 15),
    ) + 
   annotate(
       "text",
       x = 13,
       y = 50,
       label = "Most sponsorship fees \n start between 5% and 10% ",
       color = "brown",
       size = 8
       )
 )

As annotated in the plot, most fiscal sponsorship fees start between 5% and 10%.
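
The 5% to 10% range can be checked with quartiles instead of by eye. The values below are invented for illustration. On the real data the same call would be `data["fee"].quantile([0.25, 0.5, 0.75])`, assuming the `fee` column built earlier.

```python
import pandas as pd

# Made-up fee percentages for illustration (not taken from the dataset)
toy_fees = pd.Series([3, 5, 5, 7, 10, 10, 15, None], dtype=float)

# Quartiles ignore missing values by default
quartiles = toy_fees.quantile([0.25, 0.5, 0.75])
print(quartiles)

# Share of non-missing fees inside the 5-10% band (bounds inclusive)
in_band = toy_fees.dropna().between(5, 10).mean()
print(f"Share between 5% and 10%: {in_band:.0%}")
```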