In this blog, I will share insights gained from analyzing a simulated dataset of 10,000 rows. Specifically, I will run a comparative analysis between simple models and complex models.
As a foundation, it is widely acknowledged in the field that for handcrafted datasets of small to modest size, simpler models outperform neural network models.
Let us validate this claim through practical experimentation.
0. Loading Libraries

import pandas as pd
import numpy as np
import tensorflow as tf
import tensorflow.keras.layers as tfl
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
1. Generating Simulated Dataset

x1 = np.random.normal(size=10000)
x2 = np.random.normal(size=10000) * 2
x3 = np.power(x1, 2)
y = x1 + x2 + x3
y1 = (x1 * np.exp(x2) + np.exp(-x3) * x2) * x1 + y
finaldf = pd.DataFrame({
    "X1": x1,
    "X2": x2,
    "X3": x3,
    "Y": y
})
finaldf1 = pd.DataFrame({
    "X1": x1,
    "X2": x2,
    "X3": x3,
    "Y": y1
})
print(finaldf.head())
print(finaldf1.head())
X1 X2 X3 Y
0 1.524132 -0.934670 2.322977 2.912439
1 -0.813685 3.410324 0.662084 3.258723
2 0.183997 -3.398534 0.033855 -3.180682
3 0.161710 -4.845459 0.026150 -4.657599
4 -0.567619 3.020272 0.322191 2.774844
X1 X2 X3 Y
0 1.524132 -0.934670 2.322977 3.685127
1 -0.813685 3.410324 0.662084 21.872114
2 0.183997 -3.398534 0.033855 -3.784055
3 0.161710 -4.845459 0.026150 -5.420728
4 -0.567619 3.020272 0.322191 8.136591
Here we have created two dataframes:
- One with a simple linear relationship
- The other with a complex relationship involving exponential functions
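Before modeling, a quick look at the spread of the two targets shows how much wilder the nonlinear one is (a small sketch, not part of the original notebook output):

# Sketch: compare summary statistics of the two targets. Because y1 contains
# x1^2 * exp(x2) terms, it is expected to show far larger variance and
# heavier tails than the linear target y.
print(finaldf["Y"].describe())
print(finaldf1["Y"].describe())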
2. Linear Regression with Scikit-Learn
2.1 Linear Relationship Model
X = finaldf[["X1", "X2", "X3"]]
Y = finaldf[["Y"]]

lmmodel = LinearRegression()
lmmodel.fit(X, Y)
LinearRegression()
lmmodel.coef_
array([[1., 1., 1.]])
y_pred = lmmodel.predict(X)
mean_squared_error(Y, y_pred)
2.3287394852055727e-30
Because of the linear relationship between the dependent and independent variables, the MSE is essentially zero.
2.2 Complex Relationship Linear Model
X1 = finaldf1[["X1", "X2", "X3"]]
Y1 = finaldf1[["Y"]]

lmmodel1 = LinearRegression()
lmmodel1.fit(X1, Y1)
LinearRegression()
lmmodel1.coef_
array([[2.69112711, 8.39388391, 9.43535975]])
y_pred = lmmodel1.predict(X1)
mean_squared_error(Y1, y_pred)
4385.825525517447
Because of the complex relationship between the dependent and independent variables, the MSE is very high compared to the earlier model.
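A quick look at the residuals makes the failure concrete (a hedged sketch, not part of the original run):

# Sketch: the linear fit leaves enormous residuals on the nonlinear target.
resid = Y1.values.flatten() - lmmodel1.predict(X1).flatten()
print(resid.std(), resid.min(), resid.max())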
Model Type | Dataset Type | MSE |
---|---|---|
Scikit-Learn LM | Linear Relationship | 2.3287394852055727e-30 |
Scikit-Learn LM | Complex Relationship | 4385.825525517447 |
Now let us explore Neural Network Models.
3. TensorFlow NN Models
3.1 Shallow NN - Linear Relationship
modetf = tf.keras.Sequential([
    tfl.Dense(1,
              input_shape=(3,),
              activation="linear")
])
modetf.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 1) 4
=================================================================
Total params: 4 (16.00 Byte)
Trainable params: 4 (16.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
modetf.compile(optimizer="sgd",
               loss="mean_squared_error",
               metrics=["mse"])

modetf.fit(X, Y, epochs=100, verbose=0)
<keras.src.callbacks.History at 0x7ec5a55b0f40>
modetf.layers[0].get_weights()
[array([[1.0000002 ],
[0.99999994],
[1. ]], dtype=float32),
array([1.8922645e-08], dtype=float32)]
As we saw in the previous blog, the learned weights are close to, but not exactly equal to, the OLS coefficients.
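A side-by-side comparison makes this easy to see (a small sketch, assuming both models have been fit as above):

# Sketch: compare the NN's learned weights with the OLS coefficients.
nn_w, nn_b = modetf.layers[0].get_weights()
print("NN weights:", nn_w.flatten())           # approximately [1, 1, 1]
print("NN bias:   ", nn_b)                     # approximately 0
print("OLS coefs: ", lmmodel.coef_.flatten())  # exactly [1., 1., 1.]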
mean_squared_error(Y, modetf.predict(X).flatten())
313/313 [==============================] - 1s 1ms/step
8.450062478432384e-14
The MSE is very low.
3.2 Shallow NN - Complex Relationship
modetf.fit(X1, Y1, epochs=100, verbose=0)
<keras.src.callbacks.History at 0x7ec59cd10760>
modetf.layers[0].get_weights()
[array([[ 7.9459767],
[10.005791 ],
[19.434795 ]], dtype=float32),
array([-2.91453], dtype=float32)]
mean_squared_error(Y1, modetf.predict(X1).flatten())
313/313 [==============================] - 0s 1ms/step
4279.216985484737
The MSE is high compared to the earlier (linear) case.
Model Type | Dataset Type | MSE |
---|---|---|
Scikit-Learn LM | Linear Relationship | 2.3287394852055727e-30 |
Scikit-Learn LM | Complex Relationship | 4385.825525517447 |
Shallow NN | Linear Relationship | 8.450062478432384e-14 |
Shallow NN | Complex Relationship | 4279.216985484737 |
Now let us explore Multi-Layer Neural Network Models.
3.3 Multi-Layer NN - Linear Relationship
modetf1 = tf.keras.Sequential([
    tfl.Dense(100,
              input_shape=(3,),
              activation="linear"),
    tfl.Dense(1,
              activation="linear")
])
modetf1.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 100) 400
dense_2 (Dense) (None, 1) 101
=================================================================
Total params: 501 (1.96 KB)
Trainable params: 501 (1.96 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
In theory this is still a shallow linear model: since every activation is the identity function, stacking Dense layers just composes linear maps, and a composition of linear maps is itself a single linear map. So we should not see any change in MSE.
modetf1.compile(optimizer="sgd",
                loss="mean_squared_error",
                metrics=["mse"])

modetf1.fit(X, Y, epochs=100, verbose=0)
<keras.src.callbacks.History at 0x7ec5902b9360>
mean_squared_error(Y, modetf1.predict(X).flatten())
313/313 [==============================] - 1s 2ms/step
3.834542424936751e-14
The MSE remained almost the same for this model as for the shallow NN model.
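We can verify the collapse directly by multiplying the two weight matrices into a single effective linear map (a sketch, assuming modetf1 has been fit as above):

# Sketch: (xW1 + b1)W2 + b2 = x(W1 @ W2) + (b1 @ W2 + b2), so the two
# linear layers reduce to one effective weight matrix and bias.
W1, b1 = modetf1.layers[0].get_weights()  # shapes (3, 100) and (100,)
W2, b2 = modetf1.layers[1].get_weights()  # shapes (100, 1) and (1,)

W_eff = W1 @ W2       # effective (3, 1) weight matrix
b_eff = b1 @ W2 + b2  # effective bias

print(W_eff.flatten())  # expected to be close to [1, 1, 1]
print(b_eff)            # expected to be close to 0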
3.4 Multi-Layer NN - Complex Relationship

modetf1.fit(X1, Y1, epochs=100, verbose=0)

# nan_to_num replaces any NaN/inf predictions with finite values before scoring.
mean_squared_error(Y1, np.nan_to_num(modetf1.predict(X1)).flatten())
313/313 [==============================] - 0s 1ms/step
4506.472209255081
The MSE is in the same range for this model as well.
Model Type | Dataset Type | MSE |
---|---|---|
Scikit-Learn LM | Linear Relationship | 2.3287394852055727e-30 |
Scikit-Learn LM | Complex Relationship | 4385.825525517447 |
Shallow NN | Linear Relationship | 8.450062478432384e-14 |
Shallow NN | Complex Relationship | 4279.216985484737 |
Multi-Layer NN | Linear Relationship | 3.834542424936751e-14 |
Multi-Layer NN | Complex Relationship | 4506.472209255081 |
3.5 Multi-Layer NN with Sigmoid - Linear Relationship
modetf2 = tf.keras.Sequential([
    tfl.Dense(4,
              input_shape=(3,),
              activation="linear"),
    tfl.Dense(3,
              activation="sigmoid"),
    tfl.Dense(1,
              activation="linear")
])
modetf2.compile(optimizer="sgd",
                loss="mean_squared_error",
                metrics=["mse"])

modetf2.fit(X, Y, epochs=100, verbose=0)
<keras.src.callbacks.History at 0x7ec5901c7b80>
modetf2_pred = np.nan_to_num(modetf2.predict(X))
mean_squared_error(Y, modetf2_pred.flatten())
313/313 [==============================] - 0s 1ms/step
0.013774647608438807
Upon introducing a sigmoid activation on the simpler dataset, the mean squared error (MSE) jumped from roughly 1e-14 to about 0.014, many orders of magnitude higher. This sharp increase suggests that such a model is not well suited to this dataset: the nonlinearity adds nothing and only makes the linear mapping harder to represent.
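One way to probe whether this gap is merely an optimization artifact would be to refit the same architecture with a different optimizer and a longer training budget (a hedged sketch of an extra experiment, not part of the original run):

# Sketch: same sigmoid architecture, refit with Adam and more epochs to see
# whether the gap to the linear models shrinks.
modetf2b = tf.keras.Sequential([
    tfl.Dense(4, input_shape=(3,), activation="linear"),
    tfl.Dense(3, activation="sigmoid"),
    tfl.Dense(1, activation="linear")
])
modetf2b.compile(optimizer="adam", loss="mean_squared_error")
modetf2b.fit(X, Y, epochs=500, verbose=0)
mean_squared_error(Y, modetf2b.predict(X).flatten())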
3.6 Multi-Layer NN with Sigmoid - Complex Relationship
modetf2.fit(X1, Y1, epochs=100, verbose=0)

mean_squared_error(Y1, np.nan_to_num(modetf2.predict(X1)).flatten())
313/313 [==============================] - 0s 1ms/step
4385.579539532295
Since the dataset has an intricate, highly nonlinear structure, the MSE for this model remained in the same range as before.
Model Type | Dataset Type | MSE |
---|---|---|
Scikit-Learn LM | Linear Relationship | 2.3287394852055727e-30 |
Scikit-Learn LM | Complex Relationship | 4385.825525517447 |
Shallow NN | Linear Relationship | 8.450062478432384e-14 |
Shallow NN | Complex Relationship | 4279.216985484737 |
Multi-Layer NN | Linear Relationship | 3.834542424936751e-14 |
Multi-Layer NN | Complex Relationship | 4506.472209255081 |
Multi-Layer with Sigmoid NN | Linear Relationship | 0.013774647608438807 |
Multi-Layer with Sigmoid NN | Complex Relationship | 4385.579539532295 |
3.7 Multi-Layer NN with Sigmoid & ReLU - Complex Relationship
modetf3 = tf.keras.Sequential([
    tfl.Dense(3,
              input_shape=(3,),
              activation="relu"),
    tfl.Dense(6,
              activation="sigmoid"),
    tfl.Dense(12,
              activation="sigmoid"),
    tfl.Dense(6,
              activation="sigmoid"),
    tfl.Dense(1,
              activation="linear")
])
modetf3.compile(optimizer="sgd",
                loss="mean_squared_error",
                metrics=["mse"])

modetf3.fit(X1, Y1, epochs=1000, verbose=0)
mean_squared_error(Y1, modetf3.predict(X1).flatten())
313/313 [==============================] - 1s 2ms/step
4433.14019525845
Even with a deeper network, mixed activations, and 1,000 training epochs, the MSE stayed in the same range.
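It is worth noting that the generating process builds y1 from x1²·exp(x2) and x1·x2·exp(−x3) terms, so handcrafting exactly those features should let even a plain linear model fit the "complex" target (near-)exactly, since y1 is then a linear combination of five known columns. A hedged sketch of this extra experiment (not part of the original notebook; column names are made up for illustration):

# Sketch: engineer the two nonlinear terms from the generating equation,
# then fit OLS on the expanded feature set.
X1_fe = X1.copy()
X1_fe["X1sqExpX2"] = X1["X1"] ** 2 * np.exp(X1["X2"])             # x1^2 * e^(x2)
X1_fe["X1X2ExpNegX3"] = X1["X1"] * X1["X2"] * np.exp(-X1["X3"])   # x1*x2*e^(-x3)

lm_fe = LinearRegression().fit(X1_fe, Y1)
print(lm_fe.coef_)                                   # expected: all ones
print(mean_squared_error(Y1, lm_fe.predict(X1_fe)))  # expected: near zero

This supports the blog's thesis from the other direction: with the right handcrafted features, the simple model wins outright.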
Model Type | Dataset Type | MSE |
---|---|---|
Scikit-Learn LM | Linear Relationship | 2.3287394852055727e-30 |
Scikit-Learn LM | Complex Relationship | 4385.825525517447 |
Shallow NN | Linear Relationship | 8.450062478432384e-14 |
Shallow NN | Complex Relationship | 4279.216985484737 |
Multi-Layer NN | Linear Relationship | 3.834542424936751e-14 |
Multi-Layer NN | Complex Relationship | 4506.472209255081 |
Multi-Layer with Sigmoid NN | Linear Relationship | 0.013774647608438807 |
Multi-Layer with Sigmoid NN | Complex Relationship | 4385.579539532295 |
Multi-Layer with Sigmoid & ReLU NN | Complex Relationship | 4433.14019525845 |
This blog is a straightforward demonstration that when a dataset lacks intricate complexity, consisting of handcrafted features drawn from well-behaved random normal distributions, simpler models outperform their more complex counterparts.
It is also important to note that while simpler models excel in such scenarios, neural networks incur significantly longer training times, so every level of complexity comes with a cost. We should therefore clearly define the ultimate objective of a modeling exercise before venturing into additional complexity.
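As a rough illustration of that cost argument, one could time both fits on the same data (a hedged sketch with a simple timing harness; absolute numbers will vary by machine):

# Sketch: wall-clock comparison of OLS vs. a fresh shallow NN on the linear data.
import time

t0 = time.perf_counter()
LinearRegression().fit(X, Y)
t_ols = time.perf_counter() - t0

t0 = time.perf_counter()
nn = tf.keras.Sequential([tfl.Dense(1, input_shape=(3,), activation="linear")])
nn.compile(optimizer="sgd", loss="mean_squared_error")
nn.fit(X, Y, epochs=100, verbose=0)
t_nn = time.perf_counter() - t0

print(f"OLS fit: {t_ols:.3f}s | shallow NN fit (100 epochs): {t_nn:.3f}s")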