I didn’t fully understand the last point Prof. Ng mentioned. From what I understood, gradient descent gives accurate results when the parameters (w and b) are close to 0 and inaccurate results when the parameters get bigger.
If the implementation of gradient descent is correct, then how can it give inaccurate results for some values of the parameters and accurate results for others?
Or is it that when gradient descent is not implemented correctly, the gradients merely appear accurate for some parameter values (close to 0) when they are actually incorrect, and only once the parameters get bigger does it become clear that the gradients are inaccurate?
Can someone please explain what Prof. Ng meant there?
Hello @Ammar_Jawed! Tom has answered your question, but my reply is about the title of the post. Note that your title is a full question, which is not very informative. Please use a title like "C2 W1 A3: Gradient Checking" or something similar.
I think you misunderstood my question. This is what Prof. Ng said in the lecture video:
" But that as you run gradient descent and w and b become bigger, maybe your implementation of backprop is correct only when w and b is close to 0, but it gets more inaccurate when w and b become large. "
What does he mean by saying that backprop works correctly when the values of the parameters are close to 0 and becomes inaccurate when the parameters become large?
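In case a concrete example helps: a bug in backprop can have an effect that is negligible near 0 but dominates for large parameters, which is why gradient checking may pass early in training and fail later. Here is a minimal toy sketch (the cost function and names are my own for illustration, not from the assignment) where the "backprop" drops a higher-order term:

```python
def J(w):
    # Toy cost: J(w) = w^2 + w^4, so the true gradient is 2w + 4w^3.
    return w**2 + w**4

def backprop_buggy(w):
    # Buggy analytic gradient: the 4*w^3 term is missing.
    # Near w = 0 this term is tiny, so the bug is almost invisible.
    return 2 * w

def grad_check(w, eps=1e-7):
    # Two-sided numerical approximation of dJ/dw,
    # compared to the analytic gradient via a relative difference.
    numerical = (J(w + eps) - J(w - eps)) / (2 * eps)
    analytic = backprop_buggy(w)
    return abs(numerical - analytic) / (abs(numerical) + abs(analytic))

for w in [0.01, 0.1, 1.0, 10.0]:
    print(f"w = {w:5}, relative difference = {grad_check(w):.6f}")
```

For small w the relative difference is tiny (the check "passes"), but as w grows the missing 4w^3 term dominates and the check clearly fails. That is the behavior Prof. Ng is describing: the implementation isn't correct for some parameter values and incorrect for others; it is incorrect everywhere, but the error only becomes visible once the parameters are large.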