Week 3, "Gradient Descent for Neural Networks"

@Juheon_Chu, I can see that you were trying to find a formula to use, but your last attempt has two problems that you can avoid in future attempts:

  1. You substituted matrices in places where the formula requires vectors.
  2. You substituted two different matrices for z, but z should refer to one and the same vector throughout. The same problem applies to w, x, and y.
    image
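To make point 1 concrete, here is a small NumPy sketch. The names `w`, `x`, `X` and the shapes are my own illustration, not your exact case: a formula written for a single vector quietly changes meaning when you feed it a matrix.

```python
import numpy as np

# Vector case: the formula z = w.T @ x + b expects column vectors.
# (These names and shapes are illustrative assumptions, not your exact problem.)
w = np.random.randn(3, 1)   # weight vector, shape (3, 1)
x = np.random.randn(3, 1)   # one input sample, shape (3, 1)
b = 0.5

z = w.T @ x + b             # shape (1, 1): one number, as the formula intends
assert z.shape == (1, 1)

# Matrix case: feeding a batch matrix X where the formula expects a vector x
# changes the meaning. Each column of Z is now a *different* sample's z, so
# any step in your derivation that assumed "z is one vector" no longer holds.
X = np.random.randn(3, 4)   # 4 samples stacked as columns
Z = w.T @ X + b             # shape (1, 4): four different z values, not one
assert Z.shape == (1, 4)

print(z.shape, Z.shape)
```

That is why each symbol in the formula must stand for one and the same object everywhere it appears.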

@Juheon_Chu, I do not want you to spend too much time looking for that formula. There are a few reasons:

  1. I could not find it in that PDF.
  2. You do not need to know how to derive the formula in the image to move on with the rest of this specialization, and you won’t need the derivation to build models with TensorFlow.
  3. If you are willing to accept a rule that you find in that PDF or in some other tutorial, why not accept the formula given by the lecture? After all, if such a rule existed, how different could it be from the formula?
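One way to trust the lecture’s formula without deriving it is a quick numerical gradient check. Here is a minimal sketch, assuming (my assumption, since I do not know exactly which formula is in your image) the common case from this week of a sigmoid output with binary cross-entropy loss, where the lecture gives dz = a − y:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(z, y):
    # Binary cross-entropy on a single sigmoid output (assumed setup).
    a = sigmoid(z)
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

z, y = 0.7, 1.0
eps = 1e-6

# Numerical derivative of the loss with respect to z (central difference).
numeric = (loss(z + eps, y) - loss(z - eps, y)) / (2 * eps)

# The lecture's analytic formula: dL/dz = a - y.
analytic = sigmoid(z) - y

print(numeric, analytic)
assert abs(numeric - analytic) < 1e-6
```

If the two numbers agree, the formula is doing its job, whether or not you have derived it yourself.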

@Juheon_Chu, if you insist, the only suggestion I have for you is to follow my example and build your own. It will require you to know matrix multiplication and calculus. It will require you to carefully write down the problem you want to solve, and then carefully work out the steps. It took me fewer than 10 steps in my example, as you can see, and I expect your case would be similar. It might be just 20 to 30 minutes of work.
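For reference, this is the kind of short derivation I mean. I am assuming the sigmoid-plus-cross-entropy case from the lecture, so substitute your own formula if yours differs:

```latex
% Assumed setup (my assumption): a = \sigma(z), binary cross-entropy loss.
\begin{align}
\mathcal{L} &= -\bigl( y \log a + (1-y)\log(1-a) \bigr), \qquad a = \sigma(z) \\
\frac{\partial \mathcal{L}}{\partial a} &= -\frac{y}{a} + \frac{1-y}{1-a} \\
\frac{\partial a}{\partial z} &= \sigma(z)\bigl(1-\sigma(z)\bigr) = a(1-a) \\
\frac{\partial \mathcal{L}}{\partial z}
  &= \frac{\partial \mathcal{L}}{\partial a}\cdot\frac{\partial a}{\partial z}
   = \left(-\frac{y}{a} + \frac{1-y}{1-a}\right) a(1-a) \\
  &= -y(1-a) + (1-y)a = a - y
\end{align}
```

Five lines of chain rule, and the lecture’s formula falls out at the end.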

@Juheon_Chu, if you start your own derivation, we can treat it as an exercise and look at it together :wink: if you share a photo of your handwritten work (I don’t suggest typing the math symbols out here).

However, the choice is yours. If I were you, I would either follow my example or, if I were not comfortable doing the maths, accept the lecture’s formula. The former might take 30 minutes; the latter takes no time. Either way, you can move on.

Cheers,
Raymond