Backpropagation derivatives

But notice that in all cases here in DLS Course 1, we are doing binary classification. That means the activation at the output layer is always sigmoid, and it is sigmoid's derivative that interacts with the loss function during backpropagation. Here's a thread showing how all the derivatives play out at the output layer in a binary classification.
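
To make that concrete, here is a sketch of the output-layer derivation, assuming the binary cross-entropy loss and sigmoid activation used throughout Course 1 (writing $a$ for the output activation $\hat{y}$ on a single example):

$$
\begin{aligned}
\mathcal{L}(a, y) &= -\big(y \log a + (1-y)\log(1-a)\big), \qquad a = \sigma(z) = \frac{1}{1+e^{-z}}\\
\frac{\partial \mathcal{L}}{\partial a} &= -\frac{y}{a} + \frac{1-y}{1-a}\\
\frac{da}{dz} &= \sigma(z)\big(1-\sigma(z)\big) = a(1-a)\\
\frac{\partial \mathcal{L}}{\partial z} &= \frac{\partial \mathcal{L}}{\partial a}\cdot\frac{da}{dz} = \left(-\frac{y}{a} + \frac{1-y}{1-a}\right) a(1-a) = a - y
\end{aligned}
$$

The $a(1-a)$ factor from the sigmoid cancels the denominators from the loss, which is why the simple `dZ = A - Y` appears in the course's vectorized code. As a sanity check, here is a minimal numerical verification, assuming numpy; the variable names are illustrative, not taken from the course notebooks:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(z, y):
    # Binary cross-entropy on a single example, with a = sigmoid(z)
    a = sigmoid(z)
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

z, y, eps = 0.7, 1.0, 1e-6
analytic = sigmoid(z) - y  # the closed form: dL/dz = a - y
numeric = (loss(z + eps, y) - loss(z - eps, y)) / (2 * eps)  # central finite difference
print(analytic, numeric)   # the two values should agree to ~1e-10
```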