How do you derive the update function for w(i+1) := w(i) - α*dw(i)?
How do we get from J(w_i)/J'(w_i) to (a_i - y)*x_i?
I'm having trouble using Newton's method.
We don’t use Newton’s method to find model weights in deep learning. Are you asking a Machine Learning Specialization question?
No, this is for the Deep Learning course.
Please move your topic to the correct subcategory.
Here’s the community FAQ to get started.
How is the weight update function derived if not using Newton’s method?
Deep learning uses gradient descent for weight updates.
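For what it’s worth, the (a_i - y)*x_i term in the original question is just the gradient of the cost, not a Newton step. Here is a sketch of that derivation for a single training example, assuming the usual logistic regression setup with a sigmoid activation a = σ(w·x + b) and the cross-entropy loss (my assumption about the setup being asked about):

```latex
% Sketch: gradient of the cross-entropy loss for one example,
% assuming a = \sigma(z) with z = w^T x + b.
\begin{aligned}
L(a, y) &= -\big(y \log a + (1 - y)\log(1 - a)\big) \\
\frac{\partial L}{\partial a} &= -\frac{y}{a} + \frac{1 - y}{1 - a} \\
\frac{\partial a}{\partial z} &= a(1 - a) \qquad \text{(derivative of the sigmoid)} \\
\frac{\partial L}{\partial z} &= \frac{\partial L}{\partial a}\,\frac{\partial a}{\partial z} = a - y \\
\frac{\partial L}{\partial w} &= \frac{\partial L}{\partial z}\,\frac{\partial z}{\partial w} = (a - y)\,x
\end{aligned}
```

That last quantity is the dw that gets plugged into w := w - α*dw.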
got it, thank you
Yes, the update formula is just based on the meaning of the gradient. We calculate the gradient, which is the multidimensional derivative of the surface that points in the direction of the fastest increase in the cost. Of course what we want to do is decrease the cost, so if we move in the opposite direction of the gradient it gives the fastest decrease of the cost (that’s why we multiply the gradient by -1). But because the gradient is just tangent to the surface, we don’t want to “jump” too far in that direction because it’s literally pointing off the surface. So we use the learning rate α to modulate how far we move in that direction.
We repeat the above recipe until we get convergence near a point of minimal cost. Of course that will likely be a local minimum of the surface, but that’s a more subtle issue to be discussed later.
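To make that recipe concrete, here is a minimal Python sketch of the loop for single-feature logistic regression. The function names, the tolerance, and the toy data are my own choices for illustration, not anything taken from the course:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(x, y, alpha=0.1, tol=1e-6, max_iters=10000):
    """Minimal gradient descent for logistic regression with one weight and a bias.

    x, y are 1-D arrays of m training examples; alpha is the learning rate.
    """
    w, b = 0.0, 0.0
    for _ in range(max_iters):
        a = sigmoid(w * x + b)        # forward pass: predicted activations
        dw = np.mean((a - y) * x)     # gradient of the cost w.r.t. w
        db = np.mean(a - y)           # gradient of the cost w.r.t. b
        w -= alpha * dw               # step opposite the gradient direction
        b -= alpha * db
        if abs(dw) < tol and abs(db) < tol:   # crude convergence check
            break
    return w, b

# Example usage with toy data
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = gradient_descent(x, y)
```

The key line is `w -= alpha * dw`: move a small step in the direction opposite the gradient, then recompute the gradient at the new point and repeat.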
Thank you, paulinpauloalto, for the detailed explanation.