Hello,
In the Initialization Assignment, it is mentioned: "As you can see with the prediction being 0.5 whether the actual (y) value is 1 or 0 you get the same loss value for both, so none of the weights get adjusted and you are stuck with the same old value of the weights."
I am unable to understand how the first part of that statement ("with the prediction being 0.5 whether the actual (y) value is 1 or 0 you get the same loss value for both") implies the second part ("so none of the weights get adjusted and you are stuck with the same old value of the weights"). Please help me understand the reasoning behind this conclusion.
More generally: if, for a binary classification problem (X, y), we get the same non-zero loss value for training examples of both categories of y (0 and 1), does this always mean that the weights will stop getting updated during gradient descent (even if the gradients of the cost function are non-zero)?
The real point is that if you initialize with all zeros, then the gradients are zero. That is why no learning can take place. Here’s a thread which goes through the math behind that.
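Roughly, the math goes like this (just a sketch, assuming the assignment's network is a feed-forward net with ReLU hidden layers and a sigmoid output, which is how I read it). With every parameter at zero, $Z^{[1]} = W^{[1]}X + b^{[1]} = 0$ and $A^{[1]} = \mathrm{ReLU}(0) = 0$, so every hidden activation is zero and the output is $A^{[L]} = \sigma(0) = 0.5$. In backprop,

$$dW^{[L]} = \tfrac{1}{m}\, dZ^{[L]} A^{[L-1]T} = 0 \ \ (\text{since } A^{[L-1]} = 0), \qquad dZ^{[l]} = W^{[l+1]T} dZ^{[l+1]} \ast g'(Z^{[l]}) = 0 \ \ (\text{since } W^{[l+1]} = 0),$$

so every weight gradient vanishes and the weights never move. The only parameter that can receive a non-zero gradient is the output-layer bias, with $db^{[L]} = \tfrac{1}{m}\sum_i (0.5 - y^{(i)})$, which is where the class balance of the dataset comes in.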
The answer to your last question is generally no - having the same non-zero loss value for training examples of both categories does not imply zero gradients. The statement you cite from the assignment happens to hold only because the specific dataset used in the assignment has exactly the same number of positive and negative examples, so the gradients from the positive and negative examples cancel each other. If you remove just one example from the dataset, the gradients are no longer zero.
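Here is a minimal numpy sketch of that point (the 2-layer ReLU-then-sigmoid architecture, the hidden size, and the toy data are my own choices, not the assignment's exact network or dataset). With all-zero parameters the weight gradients come out exactly zero regardless of the data, and the output-bias gradient is zero only because the labels are balanced; drop one example and it is no longer zero:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def zero_init_gradients(X, Y):
    """One forward + backward pass for a 2-layer net (ReLU -> sigmoid)
    with all parameters initialized to zero."""
    n_x, m = X.shape
    n_h = 4                                        # hypothetical hidden size
    W1 = np.zeros((n_h, n_x)); b1 = np.zeros((n_h, 1))
    W2 = np.zeros((1, n_h));   b2 = np.zeros((1, 1))

    # Forward: every hidden activation is ReLU(0) = 0, output is sigmoid(0) = 0.5
    Z1 = W1 @ X + b1; A1 = relu(Z1)
    Z2 = W2 @ A1 + b2; A2 = sigmoid(Z2)

    # Backward (binary cross-entropy with sigmoid output)
    dZ2 = A2 - Y                                   # = 0.5 - y for every example
    dW2 = dZ2 @ A1.T / m                           # zero, because A1 is all zeros
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m   # = mean(0.5 - y)
    dZ1 = (W2.T @ dZ2) * (Z1 > 0)                  # zero, because W2 is all zeros
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

# Made-up data: 2 features, 4 examples, perfectly balanced labels
X = np.array([[1.0, -2.0, 0.5, 3.0],
              [0.3,  1.2, -1.0, 2.0]])
Y = np.array([[1, 1, 0, 0]])

for name, g in zip(["dW1", "db1", "dW2", "db2"], zero_init_gradients(X, Y)):
    print(name, g.ravel())
# dW1, db1, dW2 are exactly zero no matter what the data is;
# db2 = mean(0.5 - y) is zero only because the labels are balanced.

# Remove one example: db2 is no longer zero, so that parameter would start moving.
print("db2 (one example removed):",
      zero_init_gradients(X[:, :3], Y[:, :3])[3].ravel())
```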
The main issue with zero initialization is therefore not the zero gradients, but the symmetry between different neurons in the same layer: neurons that start identical receive identical gradients, so they stay identical and the whole layer behaves like a single neuron.
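To see the symmetry problem concretely, here is a rough sketch (the constant initialization value, the tanh hidden layer, and the made-up data are all hypothetical, not from the assignment). Every hidden unit starts identical, receives identical gradients at every step, and therefore never differentiates from its siblings, no matter how long you train:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical setup: 2 features, 3 hidden units (tanh), sigmoid output,
# every weight initialized to the same nonzero constant.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 50))
Y = (X[0:1] * X[1:2] > 0).astype(float)   # some made-up labels
m = X.shape[1]

c = 0.5
W1 = np.full((3, 2), c); b1 = np.zeros((3, 1))
W2 = np.full((1, 3), c); b2 = np.zeros((1, 1))
lr = 0.1

for step in range(100):
    # Forward pass
    Z1 = W1 @ X + b1; A1 = np.tanh(Z1)
    Z2 = W2 @ A1 + b2; A2 = sigmoid(Z2)
    # Backward pass (cross-entropy loss)
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m; db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
    dW1 = dZ1 @ X.T / m;  db1 = dZ1.sum(axis=1, keepdims=True) / m
    # Gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(W1)   # all three rows are still identical after 100 steps:
print(W2)   # the hidden units never learn different features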