Hi Sir,
For dJ/dA[L], the derivative of the cost function with respect to the final-layer activation A[L], we have the formula below. We got this formula by plugging the sigmoid function into the cost function. Will the formula be the same for every activation function we plug into the cost function, or could the derived equation differ from one activation function to another? If it differs, where can I find the derived equations for all the activation functions covered in the lecture video? Any help would be appreciated.
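For context, here is a minimal sketch of the quantities involved. It assumes the cost is the binary cross-entropy loss for a single example, which may not match the lecture's exact formula. Under that assumption, dJ/dA[L] comes from differentiating the loss with respect to A[L], and the activation function enters only in the next chain-rule step, dZ = dA * g'(Z). The function names below are hypothetical and chosen for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy_loss(a, y):
    # Assumed loss for one example: -(y*log(a) + (1-y)*log(1-a))
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))

def dloss_da(a, y):
    # Derivative of the assumed loss with respect to the activation a.
    # Nothing here depends on which activation produced a.
    return -(y / a - (1 - y) / (1 - a))

def dz_sigmoid(z, y):
    # Chain rule: dZ = dA * g'(Z), with g'(Z) = a * (1 - a) for sigmoid.
    a = sigmoid(z)
    return dloss_da(a, y) * a * (1 - a)

def numeric_derivative(f, x, eps=1e-6):
    # Central finite difference, used only as a sanity check.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

z, y = 0.3, 1.0
a = sigmoid(z)
analytic_da = dloss_da(a, y)
numeric_da = numeric_derivative(lambda t: cross_entropy_loss(t, y), a)
analytic_dz = dz_sigmoid(z, y)

print(analytic_da, numeric_da)  # the two values should agree closely
print(analytic_dz, a - y)       # for sigmoid, dZ simplifies to a - y
```

In this sketch, a different output activation would change only the g'(Z) factor inside `dz_sigmoid`; a different cost function would change `dloss_da`.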