In the lectures, Andrew mentions that neural networks are generally low-bias machines and that increasing a network's complexity will almost always improve its performance.
However, in the Week 3 programming assignment (lab), in the high-variance case, a simple neural network with the following architecture:
Dense layer with 6 units, relu activation
Dense layer with 6 units and a linear activation
outperforms a complex neural network with the following architecture:
Dense layer with 120 units, relu activation
Dense layer with 40 units, relu activation
Dense layer with 6 units and a linear activation (not softmax)
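For concreteness, here is a sketch of the two architectures in tf.keras. This assumes 2 input features (matching the lab's dataset shape mentioned below); variable names are illustrative, not the lab's own:

```python
import tensorflow as tf

# Simple network: 6 -> 6, linear output (logits)
simple_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),          # assumed 2 input features
    tf.keras.layers.Dense(6, activation="relu"),
    tf.keras.layers.Dense(6, activation="linear"),
])

# Complex network: 120 -> 40 -> 6, linear output (logits)
complex_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),          # assumed 2 input features
    tf.keras.layers.Dense(120, activation="relu"),
    tf.keras.layers.Dense(40, activation="relu"),
    tf.keras.layers.Dense(6, activation="linear"),
])
```

The linear output layer produces logits, which would then be paired with a softmax-from-logits loss during training.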
Please note that the “complex” neural network here is the unregularized version. Still, since it is a neural network, I would expect the complex version to perform better than the simple one.
I’m having trouble understanding this. Please help me understand the result.
Sure, it's in the lecture “Bias/variance and neural networks”, at timestamp 1:41–1:55.
Another thing I missed mentioning above: his exact statement was,
“Large neural networks, when trained on small to moderate sized datasets, are low bias machines.”
Since low bias and high variance are essentially the same thing, and we know that collecting more data addresses high variance, I don't think the statement's validity should change when the dataset is huge.
I do not think Andrew really said that “increasing the complexity of a neural network will almost always improve its performance.” There must be some context. If the dataset is huge, increasing the network's complexity might give you better performance, but that won't always be true if you only have a small dataset.
In general, if the dataset is too small and the number of parameters to train is huge, e.g. in a complex neural network (or any other machine learning model), there isn't enough data to train the model well, and we will most likely get an overfitted model.
But for your specific assignment: what metrics did you use to compare model performance? How big is the dataset? What “complex neural network” did you compare against?
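This small-data/many-parameters failure mode can be illustrated with a toy example (not the lab's data): a degree-9 polynomial has just enough parameters to pass through 10 noisy training points exactly, so training error is near zero while error on held-out points is much larger:

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 noisy training samples of a sine curve -- a tiny dataset
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + 0.3 * rng.standard_normal(10)

# Held-out points from the noise-free underlying function
x_test = np.linspace(0.05, 0.95, 10)
y_test = np.sin(2 * np.pi * x_test)

# Degree-9 polynomial: 10 coefficients for 10 points -> exact interpolation
coeffs = np.polyfit(x_train, y_train, deg=9)

train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)

# Training error is essentially zero, held-out error is not: overfitting
print(train_mse, test_mse)
```

The same effect is what the lab demonstrates: the 5,446-parameter network has far more capacity than 400 training examples can constrain, unless it is regularized.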
In my case, the X dataset has shape (800, 2), which is split into a training set (400, 2), a cross-validation set (320, 2), and a test set (80, 2).
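For reference, those splits correspond to a 50% / 40% / 10% partition of the 800 examples. A minimal sketch of such a split (using random placeholder data, not the lab's actual X, and a plain slice rather than whatever splitting utility the lab uses):

```python
import numpy as np

# Placeholder for the lab's dataset: 800 examples, 2 features
X = np.random.default_rng(0).standard_normal((800, 2))

# 400 train / 320 cross-validation / 80 test, as described above
X_train, X_cv, X_test = X[:400], X[400:720], X[720:]
```

In practice the data would be shuffled before slicing so each split is representative.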
The “Complex Neural Network” has the following structure, with 5,446 parameters in total:
Dense layer with 120 units, relu activation
Dense layer with 40 units, relu activation
Dense layer with 6 units and a linear activation (not softmax)
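The 5,446 figure can be checked by hand: a Dense layer with n_in inputs and n_out units has n_in × n_out weights plus n_out biases. With the 2 input features from the dataset above:

```python
def dense_params(n_in, n_out):
    # weight matrix (n_in * n_out) plus one bias per unit
    return n_in * n_out + n_out

# Complex network: 2 inputs -> 120 -> 40 -> 6
complex_total = (dense_params(2, 120)    # 360
                 + dense_params(120, 40) # 4840
                 + dense_params(40, 6))  # 246

# Simple network: 2 inputs -> 6 -> 6
simple_total = dense_params(2, 6) + dense_params(6, 6)  # 18 + 42
```

So the complex network has roughly 90 times as many parameters as the simple one, while both are trained on the same 400 examples.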
The “Simple Neural Network” has the following structure, with a total of 60 parameters: