Why are the gradients the same?

They are not the same. This is only the general form of the formula; in each case the prediction and the ground truth that go into it are different!
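To make that concrete, here is a minimal sketch. It assumes the two cases being compared are linear regression (squared-error cost) and logistic regression (cross-entropy cost), which the thread does not state explicitly. The function names and the toy data are made up for illustration. Both models go through the same gradient expression, and only the prediction function changes. The same labels are used for both models here just to isolate the effect of the prediction.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical single-feature models: the only difference is how f(x) is computed.
def predict_linear(w, b, x):
    return w * x + b

def predict_logistic(w, b, x):
    return sigmoid(w * x + b)

def gradient(predict, w, b, xs, ys):
    """Shared form: dJ/dw = (1/m) * sum((f(x_i) - y_i) * x_i),
                    dJ/db = (1/m) * sum(f(x_i) - y_i)."""
    m = len(xs)
    errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
    dj_dw = sum(e * x for e, x in zip(errors, xs)) / m
    dj_db = sum(errors) / m
    return dj_dw, dj_db

# Toy data (assumed for illustration only).
xs = [1.0, 2.0, 3.0]
ys = [0.0, 0.0, 1.0]
w, b = 0.5, 0.0

grad_linear = gradient(predict_linear, w, b, xs, ys)
grad_logistic = gradient(predict_logistic, w, b, xs, ys)
print("linear:  ", grad_linear)
print("logistic:", grad_logistic)
```

Running this should print two different pairs of values even though `gradient` is literally the same function in both calls, because `predict(w, b, x)` returns different numbers for each model.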

As far as I remember, Prof. Andrew explains them in detail. If not, a quick Google search will turn up the gradients for each case. Either way, they are definitely not the same!

Here, take a look at this post from one of our mentors:

@rmwkwok, great job here, thank you!
