Feedforward Neural Networks in Depth

Thank you for your detailed explanation :grinning:

1 Like

I have a PhD in physics and I was eager to see a more generalized mathematical derivation of forward & backward propagation than the course videos provide. After gaining some basic understanding and intuition of what forward and backward propagation are for from the course videos, these notes are a great bonus for those who are comfortable with multivariate calculus. I appreciate the amount of effort the course instructor put in to create these notes.


Yep, I thought so too, but if you look at it carefully it's the same thing (as far as I can tell).

I didn't understand this point, because if j doesn't equal p, that means the derivative equals 0.

Hi @Allaeddine_Boudour

I only read a few lines of the first part and this reply.

I think your understanding is correct for all hidden layers, but there is an exception in the output layer when softmax is used as the activation: softmax uses the z's from all nodes to compute each node's a, so in other words, each a depends on all of the z's.
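A quick numerical sketch of that point (the function and values below are just for illustration): perturbing a single z changes every softmax output, because the normalizing denominator sums over all nodes.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability (does not change the result)
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([1.0, 2.0, 3.0])
a = softmax(z)

# Perturb z[0] only: every entry of a changes, not just a[0],
# because the shared denominator changes.
z_perturbed = z.copy()
z_perturbed[0] += 0.5
a_perturbed = softmax(z_perturbed)

print(np.allclose(a[1:], a_perturbed[1:]))  # False: a[1] and a[2] changed too
```

Contrast this with an elementwise activation like sigmoid, where a[1] and a[2] would be untouched by a change to z[0].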


1 Like

Thank you for your explanation, I appreciate it.

You are welcome, @Allaeddine_Boudour!


This is amazing, beyond my expectation! Thank you so much, this is so helpful! I love it! :smiling_face_with_three_hearts: Thank you!

I am sorry, but it appears corrupted, as in the picture.
Does anyone have the same problem?

I am not an expert at this but it might be related to the browser you’re using. My Chrome works just fine. Would you try it on Chrome?

Yes, I am using the latest version of Chrome, and I tried Edge, which gives the same view.
It also looks the same on my mobile.

I am sorry to hear that. I did a quick search with the keywords “chrome doesn’t render latex equation”, and found this:


You see, antivirus might not be the source of your problem (that answer is also 2.5 years old), but perhaps it can be a lead for you to investigate further. You might also find other helpful hints if you google for more. It sounds like it will take some work for you to figure out how to get it working.

Also, you have tried two devices and different browsers and none work. One thing in common between all of your trials (but you might know more) could be that they are on the same network, which perhaps shares some common firewall settings; that sounds quite related to the antivirus idea.

Good luck with your investigation!


I tried on mobile data and turned the firewall off :joy: but it's still the same. Anyway, thank you very much for your response, and I will try to solve this issue.

Hello, I have a question about both the derivative of W and b,
So by the definition of the chain rule

I find it understandable to get the result provided by the first article

however in many cases including the formulas provided in the DLS course, the equations provided are

and I can't figure out why we divide by m in both dW and db. Am I missing something, or is it just convenient to divide by m, given that we generated the result by summing z over the examples for each node?
Thanks in advance.

Because there is an m in the denominator of the cost function. And dW and db are the derivatives of the cost function w.r.t. the parameters, right?
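To spell that out (using the course's usual notation of a per-example loss \mathcal{L} averaged into the cost J): differentiation is linear, so the 1/m in the cost carries straight through to the gradients.

```latex
% The cost is the average of the per-example losses:
J(W, b) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\big(a^{(i)}, y^{(i)}\big)

% Differentiation is linear, so the 1/m factor carries through:
\frac{\partial J}{\partial W} = \frac{1}{m} \sum_{i=1}^{m}
    \frac{\partial \mathcal{L}^{(i)}}{\partial W},
\qquad
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m}
    \frac{\partial \mathcal{L}^{(i)}}{\partial b}
```

So the 1/m in dW and db is not a convenience; it is exactly the 1/m from the definition of J.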

1 Like

Yes, you're right. The thing is, when calculating the derivative of A at the final layer we used different formulas,
which explains why we divide by m at each layer for W and b. So I think the more efficient way is using the first formula, am I right?

Wow, great job. Thank you Jonas

In the second post, I got lost trying to find dJ/dA in equation 4: dJ/dZ is expressed as a function of dJ/dA, but dJ/dA is not defined.
Please explain.

\frac{dJ}{dZ} is equal to \frac{dJ}{dA} \times \frac{dA}{dZ}. So \frac{dJ}{dZ} already covers \frac{dJ}{dA}.
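A small numerical sketch of why dJ/dA never needs to appear on its own (assuming the sigmoid-output, binary cross-entropy case from the course; the array values are arbitrary examples):

```python
import numpy as np

# For sigmoid output with binary cross-entropy:
#   dJ/dA = -(Y/A) + (1-Y)/(1-A)   and   dA/dZ = A*(1-A).
# Multiplying them collapses to the familiar dZ = A - Y, i.e. dJ/dA
# is already folded into dJ/dZ and is never stored separately.

A = np.array([0.8, 0.3, 0.6])   # example activations
Y = np.array([1.0, 0.0, 1.0])   # example labels

dA = -(Y / A) + (1 - Y) / (1 - A)   # dJ/dA (per example, before averaging)
dZ_chain = dA * A * (1 - A)         # chain rule: dJ/dA * dA/dZ
dZ_direct = A - Y                   # the simplified formula

print(np.allclose(dZ_chain, dZ_direct))  # True
```

That algebraic cancellation is why the notes can jump straight to dJ/dZ without ever writing dJ/dA explicitly.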

Dear mentors, may I know why we should add all the terms together in this case?

Thank you.