They are not the same. This is the general form of the formula, but the prediction and the ground truth are different in each case.
As far as I remember, Prof. Andrew explains them in detail. If not, search Google for the gradient in each case. They are definitely not the same!
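To make this concrete, here is a rough sketch. I am assuming the two cases are linear regression (squared-error cost) and logistic regression (cross-entropy cost), which this thread does not actually state. The function names and the toy data are made up for illustration. The gradient code is shared, but the prediction function and the labels plugged into it differ, so the resulting numbers differ too.

```python
import math

# Hypothetical helpers, for illustration only.
def linear_prediction(w, b, x):
    # Linear regression: the prediction is the raw linear output.
    return w * x + b

def logistic_prediction(w, b, x):
    # Logistic regression: the linear output is passed through a sigmoid.
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def gradient(predict, w, b, xs, ys):
    # Shared form: average of (prediction - ground truth) times the input.
    # Only `predict` and the labels `ys` change between the two cases.
    m = len(xs)
    errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
    dw = sum(e * x for e, x in zip(errors, xs)) / m
    db = sum(errors) / m
    return dw, db

# Toy data (assumed): real-valued targets vs. 0/1 class labels.
xs = [0.0, 1.0, 2.0]
ys_regression = [0.5, 1.5, 2.5]
ys_classification = [0, 0, 1]

grad_linear = gradient(linear_prediction, 0.0, 0.0, xs, ys_regression)
grad_logistic = gradient(logistic_prediction, 0.0, 0.0, xs, ys_classification)

print("linear:  ", grad_linear)
print("logistic:", grad_logistic)
```

Both calls go through the same `gradient` function, yet they return different values because the prediction and the ground truth are different in each case.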