0. Loading Libraries
import ssl

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import StandardScaler
from dfply import *
from plotnine import *
1. Data Loading
# Workaround for an SSL certificate error on my local machine:
# disable HTTPS certificate verification before downloading the data.
ssl._create_default_https_context = ssl._create_unverified_context
trashwheels = pd.read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-03-05/trashwheel.csv')
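Replacing the default HTTPS context turns off certificate verification for every request in the session. As a hedged alternative, the sketch below keeps verification on and passes an explicit context to the download. The helper names `make_verified_context` and `read_csv_verified` are hypothetical, and the use of the optional `certifi` package as a CA bundle is an assumption about what fixes the local error; it may not apply to every machine.

```python
import io
import ssl
import urllib.request

import pandas as pd

try:
    import certifi  # optional third-party CA bundle (assumption: may not be installed)
    CA_FILE = certifi.where()
except ImportError:
    CA_FILE = None  # fall back to the operating system's trust store


def make_verified_context() -> ssl.SSLContext:
    """Build an SSL context that still verifies certificates and hostnames."""
    return ssl.create_default_context(cafile=CA_FILE)


def read_csv_verified(url: str, timeout: float = 30.0) -> pd.DataFrame:
    """Download a CSV over HTTPS with verification enabled, then parse it."""
    context = make_verified_context()
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return pd.read_csv(io.BytesIO(response.read()))
```

Calling `read_csv_verified(...)` with the same URL as above would then stand in for the `pd.read_csv` call, without changing the global SSL default.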
2. Plotting
2.1. Trash Weight & Volume Over Years
Weight Volume Trending Plot Code
# Sum weight and volume per year, then draw a bar chart of yearly weight,
# with the fill color showing yearly volume.
plot = ( trashwheels >>
  group_by(X.Year) >>
  summarize(Weights=np.sum(X.Weight), Volumes=np.sum(X.Volume)) >>
  ggplot(aes(x="Year", y="Weights", fill="Volumes")) +
  geom_col() +
  theme_minimal()
)
plot.save("TrashWeightVolume.png", width=8, height=6)
Findings:
- Consistent Growth: Trash weight and the corresponding volume both trend upward, increasing year over year.
- Post-COVID Surge: Trash weight rises sharply after the COVID-19 pandemic, pointing to a substantial increase in waste generation during this period.
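The growth claims above can be checked numerically rather than read off the chart. The sketch below aggregates per year with plain pandas (an alternative to the dfply pipeline, not the code used for the plot) and computes the year-over-year percentage change. Truly exponential growth would show a roughly constant percentage, while a one-off surge shows up as a single spike. The `records` values are made up for illustration and are not the Trash Wheel numbers.

```python
import pandas as pd

# Made-up records for illustration only (not the real Trash Wheel data).
records = pd.DataFrame({
    "Year":   [2020, 2020, 2021, 2021, 2022, 2022],
    "Weight": [40.0, 60.0, 50.0, 60.0, 70.0, 95.0],
    "Volume": [10.0, 12.0, 11.0, 13.0, 15.0, 20.0],
})

# Yearly totals, mirroring group_by(Year) >> summarize(...).
yearly = (
    records.groupby("Year", as_index=False)
    .agg(Weights=("Weight", "sum"), Volumes=("Volume", "sum"))
    .sort_values("Year")
    .reset_index(drop=True)
)

# Year-over-year growth in percent; the first year has no predecessor (NaN).
yearly["YoY_pct"] = yearly["Weights"].pct_change() * 100

# True only if every year's total is at least as large as the previous one.
always_increasing = yearly["Weights"].is_monotonic_increasing

print(yearly)
print("Weights never decrease:", always_increasing)
```

Running the same two steps on `trashwheels` would show whether the increase is steady, concentrated in a few years, or interrupted by declines.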
2.2. Equivalent Homes Powered by Collected Trash
Equivalent Homes Powered by Trash Plot Code
# Sum the homes-powered estimate per year and draw it as a line.
plot = ( trashwheels >>
  group_by(X.Year) >>
  summarize(HomesPowered=np.sum(X.HomesPowered)) >>
  ggplot(aes(x="Year", y="HomesPowered")) +
  geom_line(color="brown") +
  theme_minimal()
)
plot.save("HomePowered.png", width=8, height=6)
Findings:
- Upward Trend: The number of homes that could be powered by the collected trash increases over time, which is consistent with the growth in trash weight shown in 2.1.
2.3. Trends in Trash by Category
Trash Volume by Category Plot Code
# Reshape the item counts to long format, total them per year and category,
# then standardize within each category so the facets share a comparable scale.
plot = (trashwheels >>
  gather("variable", "value", ["PlasticBottles", "Polystyrene", "CigaretteButts", "GlassBottles", "PlasticBags", "Wrappers", "SportsBalls"]) >>
  group_by(X.Year, X.variable) >>
  summarize(Values=np.sum(X.value)) >>
  group_by(X.variable) >>
  mutate(Standardized_Values=(X.Values - X.Values.mean()) / X.Values.std()) >>
  ggplot(aes(x="Year", y="Standardized_Values")) +
  geom_line(color="brown") +
  facet_wrap('~variable') +
  theme_minimal()
)
plot.save("TrashVolumeCat.png", width=8, height=6)
Findings:
- Increasing Usage and Disposal: The consumption and disposal of glass bottles, plastic bottles, sports balls, and plastic wrappers have increased steadily in recent years.
- Processed Food Consumption: Plastic, wrappers, and glass are common packaging materials for processed food, so their elevated levels suggest a possible upward trend in processed food consumption.
- Quality Impact on Dumping: The higher number of sports balls may be linked to quality factors, such as a shorter lifespan than expected, which in turn could contribute to an increase in waste disposal.
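The category plot in this section relies on standardizing each category separately, so that a high-count item such as plastic bottles and a low-count item such as sports balls can be compared on the same axis. A minimal pandas-only sketch of that reshape and per-group z-score is shown below. The `wide` data is made up for illustration, and `melt` with `groupby(...).transform` is used here as a stand-in for dfply's `gather` and grouped `mutate`, not as the code behind the figure.

```python
import pandas as pd

# Made-up counts for two categories, for illustration only.
wide = pd.DataFrame({
    "Year":           [2021, 2021, 2022, 2022, 2023, 2023],
    "PlasticBottles": [100, 200, 300, 300, 400, 500],
    "SportsBalls":    [1, 1, 2, 2, 3, 3],
})

# Wide -> long: one row per (Year, category, count).
long = wide.melt(id_vars="Year", var_name="variable", value_name="value")

# Total count per year and category.
yearly = (
    long.groupby(["Year", "variable"], as_index=False)["value"]
    .sum()
    .rename(columns={"value": "Values"})
)

# Z-score within each category: subtract the category mean and divide by
# the category standard deviation (sample std, ddof=1, the pandas default).
by_category = yearly.groupby("variable")["Values"]
yearly["Standardized_Values"] = (
    yearly["Values"] - by_category.transform("mean")
) / by_category.transform("std")

print(yearly.sort_values(["variable", "Year"]))
```

After this step every category has mean 0 and standard deviation 1, so the facets show the shape of each trend rather than its absolute size. A rising line therefore means "above this category's own average", not "more items than another category".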
2.4. Correlation Heat Map Between Trash Weight & Content
Trash Correlation Plot Code
# Keep trash weight and the per-item counts.
selected_columns = ["Weight", "PlasticBottles", "Polystyrene", "CigaretteButts", "GlassBottles", "PlasticBags", "Wrappers", "SportsBalls"]
selected_data = trashwheels[selected_columns]

# Standardize every column to zero mean and unit variance.
scaler = StandardScaler()
scaled_data = scaler.fit_transform(selected_data)

scaled_df = pd.DataFrame(scaled_data, columns=selected_columns)

# Pairwise Pearson correlations between the scaled columns.
correlation_matrix_scaled = scaled_df.corr()

plt.figure(figsize=(6, 4))
sns.heatmap(correlation_matrix_scaled, annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5)
plt.title('Correlation Heatmap (Scaled Data)')
plt.savefig('correlation_heatmap.png')
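One detail worth noting about the scaling step: Pearson correlation does not change when a column is shifted or rescaled by a positive constant, so standardizing before calling `corr()` should not alter the heat map values. The sketch below illustrates this on synthetic data (randomly generated, not the Trash Wheel data), using a manual z-score with the population standard deviation, which is the convention `StandardScaler` follows.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
weight = rng.normal(3.0, 1.0, size=200)
df = pd.DataFrame({
    "Weight": weight,
    "PlasticBottles": 500 * weight + rng.normal(0, 300, size=200),
    "SportsBalls": rng.normal(10, 3, size=200),
})

# Manual z-score with population std (ddof=0).
scaled = (df - df.mean()) / df.std(ddof=0)

raw_corr = df.corr()
scaled_corr = scaled.corr()

# The two correlation matrices agree up to floating-point error.
same = np.allclose(raw_corr, scaled_corr)
print("Correlation unchanged by scaling:", same)
```

Under that property, calling `corr()` directly on the selected columns would be expected to give the same heat map. The scaling is harmless, but it is not what determines the correlations.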
Findings:
- Limited Observations: I did not observe the significant associations I had expected. However, there is a noticeable correlation between plastic bottles and related items such as plastic bags, wrappers, and polystyrene.
- Material Association: The correlation between plastic bottles and sports balls may point to an association in the manufacturing process, possibly through shared synthetic raw materials.