In the second slide from gradient checking (p.38 of the entire C2_W1), the instructor normalizes the
L2 norm to keep the numerator from being too large. He divides it by the sum of the two vectors' L2 norms.
How could this operation normalize the result? Is there any mathematical theory to support or explain this?
Yes, but it's nothing deep or subtle. The point is to "scale" the error, i.e., turn it into a relative error. Suppose the length of the difference vector (the numerator of that expression) is 0.5. How do you know whether that is a big error or not? If the actual correct vector you are trying to approximate has a length of 10^6, then 0.5 is a pretty small error. On the other hand, if the norm of the actual vector is 2, then 0.5 is a pretty big error.
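Here is a minimal numpy sketch of that idea. The function and variable names (relative_difference, grad, grad_approx, etc.) are mine for illustration, not the course's notation, but the formula is the one on the slide: the L2 norm of the difference divided by the sum of the two vectors' L2 norms. The two print statements reproduce the example above: the same absolute error of 0.5 looks tiny next to a vector of norm 10^6 and sizeable next to a vector of norm 2.

```python
import numpy as np

def relative_difference(grad_approx, grad):
    # ||grad_approx - grad||_2 scaled by ||grad_approx||_2 + ||grad||_2
    numerator = np.linalg.norm(grad_approx - grad)
    denominator = np.linalg.norm(grad_approx) + np.linalg.norm(grad)
    return numerator / denominator

# Same absolute error (numerator ~ 0.5), very different verdicts:
big = np.full(1000, 1e6 / np.sqrt(1000))    # vector with L2 norm ~ 1e6
small = np.full(4, 1.0)                     # vector with L2 norm = 2
noise = np.full(1000, 0.5 / np.sqrt(1000))  # perturbation with L2 norm = 0.5

print(relative_difference(big + noise, big))     # ~2.5e-7 -> tiny relative error
print(relative_difference(small + 0.25, small))  # ~0.11   -> sizeable relative error
```

As for why the denominator is the sum of both norms rather than just one of them: my understanding is that this keeps the expression well-behaved even when one of the two vectors happens to be close to zero, and it keeps the result symmetric in the two arguments.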