Tidy Tuesday - Trash Wheel Collection Data

5th March 2024

Author

Arun Koundinya Parasa

Published

March 8, 2024

0. Loading Libraries

import pandas as pd
import ssl
from dfply import *
from plotnine import *
import numpy as np

from sklearn.preprocessing import StandardScaler
import seaborn as sns
import matplotlib.pyplot as plt

1. Data Loading

# Workaround for an SSL certificate error on my local machine: the line below
# disables HTTPS certificate verification, so drop it if the download works without it
ssl._create_default_https_context = ssl._create_unverified_context

trashwheels = pd.read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-03-05/trashwheel.csv')

2. Plotting

2.1. Trash Weight & Volume Over Years

Findings:

  • Consistent Growth: Both the weight and the volume of collected trash trend upward year on year.
  • Post-COVID Surge: The rise in trash weight is especially steep after the COVID-19 pandemic, pointing to a substantial increase in the waste collected during this period.
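
One way to check the yearly trend described above is sketched below. It is a minimal sketch, not the code behind the original chart: the helper names `yearly_totals` and `plot_yearly_totals` are hypothetical, the toy rows are made up for illustration, and it assumes the `Year`, `Weight`, and `Volume` columns of the dataset. Plain pandas is used for the aggregation, and plotnine is only imported if the plotting helper is called.

```python
import pandas as pd


def yearly_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Sum Weight and Volume per Year and add the year-over-year change in weight."""
    totals = (
        df.groupby("Year", as_index=False)[["Weight", "Volume"]]
        .sum()
        .sort_values("Year")
        .reset_index(drop=True)
    )
    # pct_change compares each row with the previous row, i.e. the previous
    # year that is present in the data
    totals["WeightChangePct"] = totals["Weight"].pct_change() * 100
    return totals


def plot_yearly_totals(totals: pd.DataFrame):
    """Line chart of the yearly totals, one panel per measure (requires plotnine)."""
    from plotnine import aes, facet_wrap, geom_line, ggplot, theme_minimal

    # Reshape to long format so both measures share one plot specification
    long = totals.melt(
        id_vars="Year",
        value_vars=["Weight", "Volume"],
        var_name="Measure",
        value_name="Total",
    )
    return (
        ggplot(long, aes(x="Year", y="Total"))
        + geom_line(color="brown")
        + facet_wrap("Measure", scales="free_y")
        + theme_minimal()
    )


# Made-up rows for illustration only; pass `trashwheels` to use the real data.
toy = pd.DataFrame(
    {
        "Year": [2020, 2020, 2021, 2021],
        "Weight": [1.0, 2.0, 3.0, 3.0],
        "Volume": [10, 15, 20, 30],
    }
)
totals = yearly_totals(toy)
print(totals)
```

Calling `plot_yearly_totals(yearly_totals(trashwheels))` would then draw both trends. The `WeightChangePct` column makes it easier to judge how steep the increase is in a given year than reading it off the chart.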

2.2. Equivalent Homes Powered by Trash

# Total homes powered per year, drawn as a line chart
plot = (
    trashwheels
    >> group_by(X.Year)
    >> summarize(HomesPowered=np.sum(X.HomesPowered))
    >> ggplot(aes(x="Year", y="HomesPowered"))
    + geom_line(color="brown")
    + theme_minimal()
)

plot.save("HomePowered.png", width=8, height=6)

Findings:

  • Expected Trend: As one would expect given the growth in collected trash, the number of homes powered by it also increases over time.

2.4. Correlation Heat Map Between Trash Weight & Content

# Weight plus the per-item counts to compare
selected_columns = ["Weight", "PlasticBottles", "Polystyrene", "CigaretteButts", "GlassBottles", "PlasticBags", "Wrappers", "SportsBalls"]
selected_data = trashwheels[selected_columns]

# Standardize each column to zero mean and unit variance
# (Pearson correlation is scale-invariant, so this step does not change the result)
scaler = StandardScaler()
scaled_data = scaler.fit_transform(selected_data)

scaled_df = pd.DataFrame(scaled_data, columns=selected_columns)

# Pairwise Pearson correlations between the columns
correlation_matrix_scaled = scaled_df.corr()

# Annotated heat map of the correlation matrix
plt.figure(figsize=(6, 4))
sns.heatmap(correlation_matrix_scaled, annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5)
plt.title('Correlation Heatmap (Scaled Data)')
plt.savefig('correlation_heatmap.png')

Findings:

  • Limited Association with Weight: I expected trash weight to correlate strongly with the item counts, but no such association stands out. There is, however, a noticeable correlation between plastic bottles and related items such as plastic bags, wrappers, and polystyrene.
  • Material Association: Plastic bottles also correlate with sports balls, which may point to an association in the manufacturing process, possibly shared synthetic raw materials, although a correlation alone cannot confirm this.
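
A side note on the scaling step used for the heat map: Pearson correlation does not change when a column is shifted or rescaled by a positive factor, so standardizing before calling `corr()` should give the same matrix as the raw data. The sketch below illustrates this on synthetic data (the values are made up, and the z-score is computed by hand rather than with `StandardScaler`, purely to keep the example dependency-free).

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only, not the trash wheel data.
rng = np.random.default_rng(0)
bottles = rng.normal(2000, 500, size=200)
toy = pd.DataFrame(
    {
        "PlasticBottles": bottles,
        "PlasticBags": 0.5 * bottles + rng.normal(0, 200, size=200),
        "Weight": rng.normal(3, 0.5, size=200),
    }
)

# Standardize each column to zero mean and unit variance
# (population standard deviation, as StandardScaler uses).
scaled = (toy - toy.mean()) / toy.std(ddof=0)

raw_corr = toy.corr()
scaled_corr = scaled.corr()

# Compare the two correlation matrices up to floating-point error.
same = np.allclose(raw_corr, scaled_corr)
print(same)
```

On the real data, `trashwheels[selected_columns].corr()["Weight"].drop("Weight").sort_values()` would list how strongly each item count correlates with weight, which is a quick way to check the first finding without reading values off the heat map.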