Course 1: Week 3 (backpropagation intuition)

Prof. Ng has specifically designed these courses so that they do not require students to know any calculus (even univariate calculus, let alone matrix calculus), so he does not cover the derivations of most of the formulas that involve calculus. If you would like to dig deeper and have the math background, there are lots of resources available. Here’s a local thread with a bibliography, which includes textbooks that cover the actual math behind all this. One book that is more math oriented is Goodfellow et al., which is listed there.
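As a small taste of what those derivations involve, the hidden-layer gradient that generates the most questions follows directly from the chain rule. This is just a sketch in the course’s notation, where $*$ denotes the elementwise product:

```latex
% Two-layer network: z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}, with a^{[1]} = g^{[1]}(z^{[1]}).
% Propagating dz^{[2]} = \partial L / \partial z^{[2]} back one layer:
\frac{\partial L}{\partial a^{[1]}} = {W^{[2]}}^{T} \, dz^{[2]}, \qquad
dz^{[1]} = \frac{\partial L}{\partial a^{[1]}} * {g^{[1]}}'(z^{[1]})
         = {W^{[2]}}^{T} dz^{[2]} * {g^{[1]}}'(z^{[1]})
```

The transpose appears because $z^{[2]}$ depends linearly on $a^{[1]}$ through $W^{[2]}$, while the elementwise product appears because $g^{[1]}$ is applied independently to each unit.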

Here are some good websites that will also cover the derivations of back propagation:

Here’s a website from Cornell that covers the derivation.

Here’s a good introduction to the matrix calculus you need in order to follow the above.

The Matrix Cookbook (a copy of which is hosted by the Univ of Waterloo) is also a valuable resource for general linear algebra topics as well as matrix calculus.

Here are some notes from Stanford CS231n that give a good overview and insights on back propagation.

Here’s a bit deeper dive on the math also from Stanford CS231n.

Here are notes from EECS 442 at Univ of Michigan.

Mentor Jonas Slalin also covers all this and more on his website. That’s just the first page in his series.
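The equations those references derive can also be sketched in code. Here is a minimal NumPy illustration (my own sketch, not the course’s assignment code) of forward and backward propagation for a one-hidden-layer network with a tanh hidden layer and sigmoid output, using the course’s shape conventions (X is (n_x, m), Y is (1, m)):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    Z1 = W1 @ X + b1      # (n_h, m)
    A1 = np.tanh(Z1)      # hidden activation g1 = tanh
    Z2 = W2 @ A1 + b2     # (1, m)
    A2 = sigmoid(Z2)      # output activation
    return Z1, A1, Z2, A2

def backward(X, Y, W2, Z1, A1, A2):
    """Gradients of the mean cross-entropy loss, in course notation."""
    m = X.shape[1]
    dZ2 = A2 - Y                                   # sigmoid + cross-entropy simplification
    dW2 = (dZ2 @ A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)           # tanh'(Z1) = 1 - A1^2, elementwise *
    dW1 = (dZ1 @ X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2
```

Note the distinction that several of the linked threads ask about: `W2.T @ dZ2` is a matrix product (it routes gradients back through the linear layer), while `* (1 - A1 ** 2)` is elementwise (each hidden unit’s activation function acts on its own input).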

【Week3】 how do I calculate the "dz[1]" ?
Week 3 update_parameters, how to compute partial derivative J
Explanation for derived gradients for LSTM back-prop?
Gradient Descent Backpropagation Calculation
dZ[1] derivation
Derivation of dz=da* g'(z) ? or dz= a- y? how is derivation of dz[1] and dz[2] different?
Formal explanation of change of order in chain rule
A doubt on Week 3 Lecture
How we get dZ1 formula in backward propagation in a one hidden layer net
Calculating Backpropagation Equation for NN
Clarification on Gradient descent for neural networks
W2_A2_Calculation of Partial derivatives
Why a transpose needed?
How we got derivative of dz[1]=w[2]T.dz[2]*g[1]`(z[1])
Am I the only one completely lost at this derivative lesson?
How to choose between matrix multiplication and element wise multiplication during BackPropagation in Chain Rule?
Element-wise multiplication or dot product in backpropagation
Dividing by "m" in back propagation using vectorized implementation
Trouble understanding b vector back propagation
The intuition of db^[l]=dz^[l] and da^[l-1]=w^[l-1].dz^[l]
Back propagation why do we start from dZ2 and why transpose
Please help with some hints, as there is a difficulty achieving the correct graded assignment output
Transpose convolution backprop question
Neural networks Week4 Backprop da[l-1] proof
Can someone point me to basic for calculating dW[2] and db[2]
Derivative of Z1
W2_A2_Optimal "nudge" dx given for each node in computational graphs
C1_W4: Confused about da[l-1]
Week 3 - Please explain how we got to this backward propagation result?
Week 3,4: Why isn't 1/m part of dz^[L]?
W3 A1 | Ex-6 | Where were dZ[1] & dW[1] derivative equations introduced?
W3_A1_Ex-6_What's the link between dz[1] and w[2] equation?
Week 3 Backpropagation Derivation
Confusion in week 3 lesson for Backpropagation Derivations
Please explain $dz^{[1]} = {W^{[2]}}^{T} dz^{[2]} \times {g^{[1]}}^{'}(z^{[1]})$ in backpropagation
Derivation of backpropagation of RNN
Queries regarding backpropagation in RNNs
Backward propagation derivation
Should it have a rot180 on filter to calculate dA_prev?
Back Prop question
* element-wise operation in dZ[l]
Foundational math resources
Matrix Calculus
Partial Derivatives
Confused about Deep Network
Week 4 exercise 6.1
How did we calculate dz[2] in Backpropagation Intuition (8:34)?
FAQ: Frequently Asked Questions for all DLS Courses
Week 3: computing derivatives for shallow network
How to calculate dw(dL/dw)?
Course 1 Week 3 Backpropagation Intuition (Optional)
Deep learning from a mathematical view
Relu/LRelu does not work for forward propagation in Planar_data_classification_with_one_hidden_layer
Week 4 backward propagation da[l-1] derivation
Week4- assignment 2- Difference in gradient calculation for the last layer activation in neural networks
Backpropagation algorithm derivation