DLS Course 2 week 2 RMSprop

Iheb_Khalfallah · October 7, 2022, 6:35pm

Can anyone explain to me why dW has a small value and db a big value?

paulinpaloalto · October 7, 2022, 7:19pm

Please give us a bit more context for your question here. Which lecture, and what time offset into that lecture, are you asking about?

One thing to note is that we apply Gradient Descent individually to each dW and db value: there is no case in which the gradients of those two different quantities interact. If you are looking at the shapes of the ellipses in the graphs Prof Ng shows, note that the axes of the ellipse are different elements of W or b. Also note that these graphs are very unrealistic since they are in 3 dimensions: the actual solution spaces we are dealing with here have (typically) hundreds or thousands of dimensions.
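To make the "no interaction" point concrete, here is a minimal sketch of a plain gradient descent step. The function name `gd_step` and all the numbers are made up for illustration and are not from the course notebooks. It shows that the update to W comes out the same no matter how large or small db is.

```python
import numpy as np

def gd_step(W, b, dW, db, lr=0.1):
    """One plain gradient descent step (hypothetical helper for illustration)."""
    W_new = W - lr * dW  # uses only dW
    b_new = b - lr * db  # uses only db
    return W_new, b_new

# Made-up parameter and gradient values.
W = np.array([0.5, -1.0])
b = 2.0
dW = np.array([0.2, 0.4])

# Same dW, wildly different db: the W update does not change.
W1, b1 = gd_step(W, b, dW, db=0.01)
W2, b2 = gd_step(W, b, dW, db=1000.0)
print(np.array_equal(W1, W2))  # True
```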

Iheb_Khalfallah · October 7, 2022, 7:26pm

this lecture please

paulinpaloalto · October 7, 2022, 11:11pm

Sorry, it’s been several years since I watched those lectures. I will need to watch them again in order to contribute to the discussion. I see from the diagram that this case is different than the ones I remember where the axes were different elements of w.

The rest of my day today is pretty busy, so it will likely be more than 12 hours before I can get to this. In the meantime, you might profit from just watching the lecture again from the beginning. I’ve got to believe that Prof Ng would have explained the point you are asking about.

paulinpaloalto · October 7, 2022, 11:17pm

Actually I think you can see what he means from the diagram: note that the ellipses are elongated on the W axis and squashed on the b axis. Those are “contour lines” of equal cost on the cost surface. If you think about the geometric meaning of the shapes of those ellipses, it means that the surface is much steeper in the b direction than it is in the W direction. Think of taking a vertical slice parallel to the b axis and what it means that the contours are closer together in that direction. Think of a topographical map as a good real world analog: when the contour lines are close together, that means the gradient is steep in that area in the direction perpendicular to the contour lines.
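Here is a small sketch of that geometric picture, under loud assumptions: the toy cost J(w, b) = 0.5 * (w^2 + 100 * b^2) is invented for illustration (it is not the cost from the lecture), and `rmsprop_step` is a hypothetical helper that follows the usual form of the RMSprop update. The contours of this cost are ellipses elongated along w and squashed along b, so db comes out much larger than dw at the same point, and RMSprop's division by the running root mean square evens out the step sizes.

```python
import numpy as np

# Toy cost (an assumption, not from the lecture):
#   J(w, b) = 0.5 * (w**2 + 100 * b**2)
# Its contours are stretched along w and squashed along b.
def gradients(w, b):
    dw = 1.0 * w    # shallow direction
    db = 100.0 * b  # steep direction
    return dw, db

def rmsprop_step(param, grad, s, lr=0.01, beta=0.9, eps=1e-8):
    """One RMSprop update for a single parameter (hypothetical helper)."""
    s = beta * s + (1.0 - beta) * grad ** 2         # running average of grad^2
    param = param - lr * grad / (np.sqrt(s) + eps)  # step scaled by RMS of grad
    return param, s

w, b = 1.0, 1.0
dw, db = gradients(w, b)
print(dw, db)  # dw = 1.0, db = 100.0: the surface is 100x steeper along b

# First RMSprop step, starting both running averages at zero.
w_new, s_dw = rmsprop_step(w, dw, 0.0)
b_new, s_db = rmsprop_step(b, db, 0.0)
step_w = w - w_new
step_b = b - b_new
print(step_w, step_b)  # nearly identical despite the 100x gradient gap
```

With plain gradient descent the b step would be 100 times the w step here, which is what produces the up-and-down oscillation in the steep direction. Dividing by the square root of the running average damps the large db and relatively boosts the small dw.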
