DLS Course 2 - Week 1 - Checking your derivative computation

paulinpaloalto · December 23, 2022, 6:20pm

I would state it a bit differently: we’re trying to find the best algorithm for doing “finite difference” approximations to gradients, so that we can use it to confirm that our gradient logic is correct. If you have a choice between a method whose error is O(\epsilon) and one whose error is O(\epsilon^2), then it’s clear which one is superior, right? Remember that the whole point is that \epsilon is very small. What happens if you square 10^{-7}? Wait for it … it gets a lot smaller: 10^{-14}.
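If you want to see that difference numerically, here is a minimal sketch. It is not the course's gradient checking code: the function names `one_sided`, `two_sided` and `errors` are made up for illustration, and the test function sin(x) at x = 1 is an arbitrary choice whose true derivative, cos(1), is known exactly.

```python
import math

def one_sided(f, x, eps):
    """Forward difference: (f(x + eps) - f(x)) / eps."""
    return (f(x + eps) - f(x)) / eps

def two_sided(f, x, eps):
    """Centered difference: (f(x + eps) - f(x - eps)) / (2 * eps)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def errors(eps, f=math.sin, df=math.cos, x=1.0):
    """Return (one-sided error, two-sided error) against the exact derivative.

    f, df and x are illustrative assumptions: any smooth function with a
    known derivative would work here.
    """
    exact = df(x)
    return (abs(one_sided(f, x, eps) - exact),
            abs(two_sided(f, x, eps) - exact))

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        err1, err2 = errors(eps)
        print(f"eps={eps:.0e}  one-sided error={err1:.3e}  two-sided error={err2:.3e}")
```

Each time \epsilon shrinks by a factor of 10, the one-sided error should shrink by roughly 10 and the two-sided error by roughly 100. If you push \epsilon much smaller than this, floating point rounding starts to dominate and the pattern breaks down, which is a separate effect from the truncation error discussed here.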

Now I realize that doesn’t really answer the fundamental question originally asked here, which is why the two different ways to compute finite difference derivative estimates have those error behaviors. That requires some math, but if you already know about Taylor Expansion you should be set for that. Here’s a document from Brown University which uses Taylor Series to show the behavior of one-sided and two-sided finite differences.
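For reference, here is a sketch of the standard Taylor series argument. It assumes f has enough continuous derivatives near x, and it is the usual textbook derivation, not necessarily the exact presentation in the linked document.

```latex
\begin{align*}
f(x+\epsilon) &= f(x) + \epsilon f'(x) + \frac{\epsilon^2}{2} f''(x) + \frac{\epsilon^3}{6} f'''(x) + O(\epsilon^4) \\
f(x-\epsilon) &= f(x) - \epsilon f'(x) + \frac{\epsilon^2}{2} f''(x) - \frac{\epsilon^3}{6} f'''(x) + O(\epsilon^4) \\[6pt]
\text{One-sided:}\quad \frac{f(x+\epsilon) - f(x)}{\epsilon} &= f'(x) + \frac{\epsilon}{2} f''(x) + O(\epsilon^2) \\
\text{Two-sided:}\quad \frac{f(x+\epsilon) - f(x-\epsilon)}{2\epsilon} &= f'(x) + \frac{\epsilon^2}{6} f'''(x) + O(\epsilon^4)
\end{align*}
```

In the two-sided formula the f(x) terms and the f''(x) terms cancel when you subtract the two expansions, so the leading error term is proportional to \epsilon^2 instead of \epsilon.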
