Week 1 Gradient checking assignment

Hi there:
I am new to this forum. I have a question about the gradient_check_n function at the end of this assignment.

My question is: why do we compute grad_approx[i] first, and then not use it in the code that follows, which asks us to compute the difference between grad and grad_approx? In other words, why can’t we use grad - grad_approx[i] instead? And if we should use grad_approx to calculate the difference, why do we need to compute grad_approx[i] at all?

Thank you in advance to anyone who can help me.

Hi @PChiang,

If I understand your question correctly, note that both grad and grad_approx are vectors of shape (num_parameters, 1):

 grad = gradients_to_vector(gradients)
 # [...]
 grad_approx = np.zeros((num_parameters, 1))

Now, grad_approx needs to be calculated element-wise (it starts as a vector of zeros). The loop fills in one element at a time (i.e. grad_approx[i]), and only after every element has been filled in is the vector complete. The difference formula (Ex4-formula 3) then operates on the whole vectors, not on the individual elements in them, so in the end you are actually using every grad_approx[i] under the hood :grin:.
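To make this concrete, here is a minimal sketch of the same pattern on a toy problem. It is not the assignment's solution: toy_cost, toy_gradient and toy_gradient_check are hypothetical names made up for illustration, and the difference is computed with the usual relative-difference formula, which I am assuming is the one the notebook refers to.

```python
import numpy as np

def toy_cost(theta):
    # Hypothetical cost J(theta) = sum(theta^2); NOT the assignment's cost function.
    return float(np.sum(theta ** 2))

def toy_gradient(theta):
    # Analytic gradient of the toy cost: dJ/dtheta = 2 * theta.
    return 2 * theta

def toy_gradient_check(theta, epsilon=1e-7):
    num_parameters = theta.shape[0]
    grad = toy_gradient(theta)                    # full vector, shape (num_parameters, 1)
    grad_approx = np.zeros((num_parameters, 1))   # starts as all zeros

    # Fill grad_approx one element at a time.
    for i in range(num_parameters):
        theta_plus = np.copy(theta)
        theta_plus[i] = theta_plus[i] + epsilon   # nudge only the i-th parameter up
        theta_minus = np.copy(theta)
        theta_minus[i] = theta_minus[i] - epsilon # nudge only the i-th parameter down
        grad_approx[i] = (toy_cost(theta_plus) - toy_cost(theta_minus)) / (2 * epsilon)

    # The difference uses the WHOLE vectors, after the loop has finished.
    numerator = np.linalg.norm(grad - grad_approx)
    denominator = np.linalg.norm(grad) + np.linalg.norm(grad_approx)
    return grad_approx, numerator / denominator

theta = np.array([[1.0], [-2.0], [3.0]])
grad_approx, difference = toy_gradient_check(theta)
print(grad_approx.shape, difference)
```

Note that grad - grad_approx[i] would subtract a single number from every entry of grad, which is not the comparison we want. The norm needs all entries of grad_approx at once, which is why the difference is computed outside the loop.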

Hope that helps,

Edit: typo and format

Thank you for your reply. I think I get your point. One more question from a Python beginner: what does grad_approx[i] mean? I noticed that many terms in this exercise use [i], such as theta_plus[i], J_plus[i], etc. Does that mean we are calculating the value of each [i] term in the corresponding vector?

One last question: in this “gradient checking” assignment, the notebook reported “all tests passed”, but when I submitted the assignment the grade was 80/100. What could have happened? Is there anything I might have missed?

Thanks again in advance.


So, the [i] term in any of your examples refers to the ith element of a ‘collection’. In numpy that is generally a vector (an array), while in plain Python it would be the ith element of a list. You’ll see this notation throughout the specialization. In some cases, like this exercise, you calculate something element by element. However, you’ll notice that for-loops are fairly slow compared to operations on whole matrices/batches/vectors (depending on what you are doing).
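Here is a small illustration of the [i] notation, with made-up variable names (values, vector, etc. are not from the assignment). Keep in mind that Python counts from 0, so [0] is the first element.

```python
import numpy as np

# Plain Python list: [i] picks out one element.
values = [10, 20, 30]
second = values[1]          # index 1 is the SECOND element

# numpy column vector of shape (3, 1): [i] picks out the i-th row.
vector = np.array([[1.0], [2.0], [3.0]])
element = vector[2]         # the third row, an array of shape (1,)
vector[0] = 5.0             # assign a new value to the first element

# Element-by-element loop versus one vectorized operation.
doubled_loop = np.zeros_like(vector)
for i in range(vector.shape[0]):
    doubled_loop[i] = 2 * vector[i]

doubled_vectorized = 2 * vector   # same result, no explicit loop

print(second, element, np.array_equal(doubled_loop, doubled_vectorized))
```

Both approaches give the same result; the vectorized version is just shorter and, on large arrays, typically much faster. In gradient checking the loop is hard to avoid, because each grad_approx[i] needs its own pair of cost evaluations.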

I recommend you look at your submission and check the grader output messages to see if anything stands out. Sometimes a minor detail makes a calculation not match what the grader expected. Review the code you wrote and check that you didn’t accidentally modify anything you were not supposed to change.