Run Grad Check at Random Initialization

Hi, I am struggling to understand the part of the lecture on Gradient Checking Implementation Notes where Prof. Ng discusses the last point - “Run at random initialization; perhaps again after some training”.

Can you elaborate on this point, or offer a different explanation?

Thank you.

Hi, @shamus.

From what I understood, it means that an incorrect implementation of backpropagation can still produce gradients that look correct when the weights and biases are small (as they are at random initialization), with the bug only becoming apparent once the parameters have grown during training. So it is worth running gradient check both at initialization and again after training the network for a while, just in case.
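To make that concrete, here is a minimal sketch (not code from the course; the synthetic data, function names, and hyperparameters are all invented for illustration) that runs the same relative-difference check on a tiny logistic-regression model, once at random initialization and once after a few hundred gradient-descent steps:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny synthetic dataset, assumed purely for illustration
X = rng.normal(size=(4, 50))                     # 4 features, 50 examples
y = (rng.uniform(size=(1, 50)) > 0.5).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_cost(theta, X, y):
    """Cross-entropy cost, with weights and bias flattened into theta."""
    w, b = theta[:-1].reshape(1, -1), theta[-1]
    a = sigmoid(w @ X + b)
    return -np.mean(y * np.log(a + 1e-12) + (1 - y) * np.log(1 - a + 1e-12))

def backprop_grad(theta, X, y):
    """Analytic gradient -- the code whose correctness we want to verify."""
    w, b = theta[:-1].reshape(1, -1), theta[-1]
    m = X.shape[1]
    a = sigmoid(w @ X + b)
    dz = a - y
    dw = (dz @ X.T) / m
    db = np.sum(dz) / m
    return np.concatenate([dw.ravel(), [db]])

def grad_check(theta, X, y, eps=1e-7):
    """Relative difference between the numerical and analytic gradients."""
    grad = backprop_grad(theta, X, y)
    num_grad = np.zeros_like(theta)
    for i in range(theta.size):
        plus, minus = theta.copy(), theta.copy()
        plus[i] += eps
        minus[i] -= eps
        num_grad[i] = (forward_cost(plus, X, y) - forward_cost(minus, X, y)) / (2 * eps)
    return np.linalg.norm(num_grad - grad) / (np.linalg.norm(num_grad) + np.linalg.norm(grad))

theta = rng.normal(scale=0.01, size=5)           # small random initialization

print("relative diff at initialization:", grad_check(theta, X, y))

# Train for a while, then check again: a bug might only show up once the
# parameters have moved away from their small initial values.
for _ in range(500):
    theta -= 0.1 * backprop_grad(theta, X, y)

print("relative diff after some training:", grad_check(theta, X, y))
```

With a correct `backprop_grad`, both printed differences should stay far below 1e-5; if the second one blows up while the first looks fine, that is exactly the situation this implementation note is warning about.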
