Initializing parameters in feedforward neural network

MustafaaShebl · September 7, 2024, 4:27pm

In a feedforward neural network, I understand initializing the parameter W randomly and multiplying it by a small constant, as explained in the course. But when I implemented the whole FFNN model myself and ran it on the cat classifier from week 4 of the Neural Networks and Deep Learning course, the cost decreased only by an order of 10^-2 and the accuracy was very bad. When I instead divided W by np.sqrt(number of nodes in the previous layer), as is done in the initialization in the imported functions, it worked really well. Can anyone help clarify this misunderstanding?
Thanks in advance
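For readers comparing the two scalings described above, here is a minimal sketch that pushes a batch through ReLU hidden layers and reports the spread of the activations at each layer. The layer sizes `[12288, 20, 7, 5, 1]`, the synthetic standard-normal inputs, and the function names are all assumptions for illustration, not the course's code. The point it tries to show: 0.01 is close to 1/sqrt(12288) for the first layer, but far smaller than 1/sqrt(20) or 1/sqrt(7), so with the constant factor the signal shrinks at every deeper layer.

```python
import numpy as np

def init_params(layer_dims, scheme, seed=0):
    """Return a list of (W, b) pairs; `scheme` picks the scaling of W."""
    rng = np.random.default_rng(seed)
    params = []
    for l in range(1, len(layer_dims)):
        n_prev, n = layer_dims[l - 1], layer_dims[l]
        W = rng.standard_normal((n, n_prev))
        if scheme == "small_constant":
            W = W * 0.01                # fixed factor, ignores layer width
        elif scheme == "inv_sqrt":
            W = W / np.sqrt(n_prev)     # scale by 1/sqrt(fan-in)
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        params.append((W, np.zeros((n, 1))))
    return params

def hidden_activation_std(params, X):
    """Forward through the hidden layers (ReLU) and record std of each A."""
    A, stds = X, []
    for W, b in params[:-1]:            # skip the output layer
        A = np.maximum(0, W @ A + b)
        stds.append(float(A.std()))
    return stds

# Hypothetical setup: layer sizes and random inputs are stand-ins.
dims = [12288, 20, 7, 5, 1]
X = np.random.default_rng(1).standard_normal((12288, 50))

small = hidden_activation_std(init_params(dims, "small_constant"), X)
scaled = hidden_activation_std(init_params(dims, "inv_sqrt"), X)
print("W * 0.01         :", small)
print("W / sqrt(n_prev) :", scaled)
```

When activations shrink like this, the gradients flowing back through those layers shrink too, which is one plausible reason the cost barely moves with the constant factor.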

TMosh · September 7, 2024, 7:17pm

Depending on the statistics of the dataset and the depth of the model, you may have to adjust the weight initialization.

It’s a bit of a trial-and-error process.

MustafaaShebl · September 7, 2024, 7:46pm

Isn’t the idea of a convex loss function that the cost converges for any initialization of the parameters (as long as the randomness is maintained in the case of a neural network), or have I got something wrong?
From what I understood, initialization can affect learning speed, since it determines where you start, but in my case the learning was very bad.

Nevermnd · September 7, 2024, 7:53pm

Well, in DLS we learn the He / Xavier inits.
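As a rough sketch of what those two schemes look like in NumPy: He initialization scales standard-normal weights by sqrt(2 / n_in), and a common simple form of Xavier initialization scales them by sqrt(1 / n_in). The function names and the layer shape below are made up for illustration, and other Xavier variants (for example ones that also use n_out) exist.

```python
import numpy as np

def he_init(n_out, n_in, rng):
    """He init: std = sqrt(2 / n_in), commonly paired with ReLU."""
    return rng.standard_normal((n_out, n_in)) * np.sqrt(2.0 / n_in)

def xavier_init(n_out, n_in, rng):
    """Simple Xavier variant: std = sqrt(1 / n_in), commonly paired with tanh."""
    return rng.standard_normal((n_out, n_in)) * np.sqrt(1.0 / n_in)

rng = np.random.default_rng(0)
W_he = he_init(200, 500, rng)        # hypothetical layer: 500 inputs, 200 units
W_xavier = xavier_init(200, 500, rng)
print(W_he.std(), W_xavier.std())
```

Both tie the weight scale to the width of the previous layer, which a fixed constant like 0.01 does not.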

My (simple) interpretation of the inits is that, since we are in the end trying to do an optimization, you want the weights to be slightly excited (though not biased), just enough that we can get a little ‘action’ to happen.

In contrast, imagine we were running an optimization from a starting point of all zeros: the weights would have no idea where to go.

*Or, from zero, no loss to be solved.

TMosh · September 7, 2024, 8:07pm

The neural network cost function is not convex.

But that is not why random initialization of the hidden layer weights is required. It is for “symmetry breaking”. This is a specific requirement of NN hidden layers.
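To make the symmetry-breaking point concrete, here is a small sketch with a toy 3-4-1 network (tanh hidden layer, sigmoid output, biases omitted). Everything in it (the data, the sizes, the learning rate, the names) is invented for illustration. If every hidden unit starts with the same weights, every hidden unit receives the same gradient, so the rows of W1 stay identical after training; with random starting weights they do not.

```python
import numpy as np

def train_step(W1, W2, X, Y, lr=0.1):
    """One gradient-descent step on the cross-entropy cost (no biases)."""
    m = X.shape[1]
    A1 = np.tanh(W1 @ X)                      # hidden activations
    A2 = 1.0 / (1.0 + np.exp(-(W2 @ A1)))     # sigmoid output
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)      # tanh derivative
    dW1 = dZ1 @ X.T / m
    return W1 - lr * dW1, W2 - lr * dW2

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 20))              # toy inputs
Y = (X[0:1] > 0).astype(float)                # toy labels

# Same value everywhere vs. small random values.
W1_const, W2_const = np.full((4, 3), 0.5), np.full((1, 4), 0.5)
W1_rand = rng.standard_normal((4, 3)) * 0.5
W2_rand = rng.standard_normal((1, 4)) * 0.5

for _ in range(50):
    W1_const, W2_const = train_step(W1_const, W2_const, X, Y)
    W1_rand, W2_rand = train_step(W1_rand, W2_rand, X, Y)

# True means every hidden unit still has the same weights as unit 0.
rows_identical_const = bool(np.allclose(W1_const, W1_const[0]))
rows_identical_rand = bool(np.allclose(W1_rand, W1_rand[0]))
print(rows_identical_const, rows_identical_rand)
```

With the constant start, the four hidden units behave like a single unit no matter how long training runs, which is the symmetry that random initialization is meant to break.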

MustafaaShebl · September 7, 2024, 8:18pm

Thank you all so much for the clarification.
