Clarification on Gradient Descent for Neural Networks


Week 3 - Gradient Descent for Neural Networks

Why is dz^{[1]} equal to the loss?

Where’s the derivative of the sigmoid function?
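These first two questions have the same answer. Assuming the quantity meant is the output-layer error (written dz^{[2]} in the two-layer Week 3 slides, or plain dz in the Week 2 logistic-regression notation), dz is not the loss itself but the derivative of the loss with respect to z, and the sigmoid derivative is used in the derivation — it just cancels algebraically. A minimal worked derivation, assuming the course's binary cross-entropy loss with a sigmoid output a = σ(z):

```latex
\mathcal{L}(a, y) = -\big(y \log a + (1 - y)\log(1 - a)\big), \qquad a = \sigma(z)

\frac{\partial \mathcal{L}}{\partial a} = -\frac{y}{a} + \frac{1 - y}{1 - a},
\qquad
\frac{\partial a}{\partial z} = \sigma'(z) = a(1 - a)

dz = \frac{\partial \mathcal{L}}{\partial a}\,\frac{\partial a}{\partial z}
   = \left(-\frac{y}{a} + \frac{1 - y}{1 - a}\right) a(1 - a)
   = -y(1 - a) + (1 - y)a
   = a - y
```

So σ'(z) = a(1 − a) cancels against the denominators of ∂L/∂a, leaving the compact dz = a − y, which happens to look like an error term rather than a loss. The hidden layer is different: dz^{[1]} = W^{[2]T} dz^{[2]} * g^{[1]}'(z^{[1]}) keeps its activation derivative explicitly.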

Why is db simply equal to the sum of dz columns?
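On the db question: in the vectorized forward pass, Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]} broadcasts the same b^{[l]} into every one of the m columns of Z^{[l]} (one column per training example), so by the chain rule its gradient accumulates across examples — the sum of the columns of dZ, divided by m for the average. A small numpy sketch (the shapes, 4 units and 3 examples, are made up for illustration):

```python
import numpy as np

# Hypothetical shapes: a layer with 4 units, m = 3 training examples.
n_l, m = 4, 3
rng = np.random.default_rng(0)
dZ = rng.standard_normal((n_l, m))  # dL/dZ, one column per example

# In the forward pass, Z = W @ A_prev + b broadcasts b (shape (n_l, 1))
# across all m columns. Every column of Z receives the same b, so dL/db
# is the sum of the per-example gradients: the sum over dZ's columns,
# averaged over the m examples.
db = np.sum(dZ, axis=1, keepdims=True) / m  # shape (n_l, 1), matches b

# Equivalent loop form, one example at a time:
db_loop = sum(dZ[:, i:i + 1] for i in range(m)) / m
assert np.allclose(db, db_loop)
```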

There are a lot of threads discussing these questions. Please check these: one, two, three, and four.