Optimisation model

Hi, I am having a problem with the update function for the Adam algorithm. Can anyone explain it?

Hi @flaves, and welcome to Discourse. Are you referring to the update that the Adam algorithm applies to the model parameters? Can you be more specific about which part of the update is unclear?

Yes, I am referring to it:

def update_parameters_with_adam(parameters, grads, v, s, t, learning_rate=0.01,
                                beta1=0.9, beta2=0.999, epsilon=1e-8):

This function. My code keeps giving an error. Here is the error message:
KeyError                                  Traceback (most recent call last)
<ipython-input> in <module>
      7 epsilon = 1e-2
      8
----> 9 parameters, v, s, vc, sc = update_parameters_with_adam(parametersi, grads, vi, si, t, learning_rate, beta1, beta2, epsilon)
     10 print(f"W1 = \n{parameters['W1']}")
     11 print(f"W2 = \n{parameters['W2']}")

<ipython-input> in update_parameters_with_adam(parameters, grads, v, s, t, learning_rate, beta1, beta2, epsilon)
     34     # Moving average of the gradients. Inputs: "v, grads, beta1". Output: "v".
     35     ### START CODE HERE ### (approx. 2 lines)
---> 36     v["dW" + str(l)] = beta1 * v["dW" + str(l)] + (1 - beta1) * grads["dW" + str(l)]
     37     v["db" + str(l)] = beta1 * v["db" + str(l)] + (1 - beta1) * grads["db" + str(l)]
     38     ### END CODE HERE ###

KeyError: 'dW0'

If I'm not wrong, the code at line 36 is inside a for loop. That loop runs the index from 0 to L - 1, so str(l) is "0" on the first iteration. Are you sure the dictionary was initialized with keys for l and not l + 1?
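
For instance, with a hypothetical two-layer setup:

v = {"dW1": 0, "db1": 0, "dW2": 0, "db2": 0}  # keys are 1-indexed
L = 2
for l in range(L):        # l takes the values 0 and 1
    v["dW" + str(l)]      # first iteration looks up "dW0" -> KeyError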

I have changed it to str(l + 1), but then a new error occurred.

NameError                                 Traceback (most recent call last)
<ipython-input> in <module>
----> 1 parametersi, grads, vi, si = update_parameters_with_adam_test_case()
      2
      3 t = 2
      4 learning_rate = 0.02
      5 beta1 = 0.8

NameError: name 'update_parameters_with_adam_test_case' is not defined

Hi, @yanivh.

In my notebook, weights and biases (and their derivatives) are 1-indexed, so the loop runs from 1 to L. Maybe there was a bug in a previous version of the notebook.
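
Schematically, the moving-average step then looks like this (a sketch under that 1-indexing convention, not the notebook's exact code):

L = len(parameters) // 2     # number of layers; each has a W and a b
for l in range(1, L + 1):    # 1-indexed: l = 1, ..., L
    v["dW" + str(l)] = beta1 * v["dW" + str(l)] + (1 - beta1) * grads["dW" + str(l)]
    v["db" + str(l)] = beta1 * v["db" + str(l)] + (1 - beta1) * grads["db" + str(l)]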

@flaves, did you make any changes to the for statement?

Yes, I did. I made changes to the for statement after asking for help offline…

Even after loading the initial version, there was still an error.

What could be the problem?

Is there a way I can set this lab back to default, so I can code from scratch?

Restarting the kernel doesn't do that; it only goes back to the previous checkpoint.

Hi @flaves, you can do it by clicking on Help | Get latest version. I would recommend downloading your current Jupyter notebook in case you want to reuse part of the code you have already written.

Kind regards

The error you are having, KeyError: 'dW0', suggests that there is something wrong with the for loop: it should go over range(1, L + 1), and therefore should never look up dW0. You should also check the previous function, initialize_adam, as that is where the dictionaries are initialized.
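
For reference, a minimal sketch of what that initialization is meant to produce, with zero arrays shaped like each parameter (named initialize_adam_sketch so as not to suggest it is the notebook's exact code):

import numpy as np

def initialize_adam_sketch(parameters):
    # parameters holds "W1", "b1", ..., "WL", "bL"
    L = len(parameters) // 2
    v, s = {}, {}
    for l in range(1, L + 1):  # 1-indexed layers, so no "dW0" key is ever created
        v["dW" + str(l)] = np.zeros_like(parameters["W" + str(l)])
        v["db" + str(l)] = np.zeros_like(parameters["b" + str(l)])
        s["dW" + str(l)] = np.zeros_like(parameters["W" + str(l)])
        s["db" + str(l)] = np.zeros_like(parameters["b" + str(l)])
    return v, s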

I reloaded it. Still the same error.
My for loop iterated from 0 to L - 1 (range(L)), and then the key string in my code used l + 1 to be able to find the key.
This is the error:
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      7 epsilon = 1e-2
      8
----> 9 parameters, v, s, vc, sc = update_parameters_with_adam(parametersi, grads, vi, si, t, learning_rate, beta1, beta2, epsilon)
     10 print(f"W1 = \n{parameters['W1']}")
     11 print(f"W2 = \n{parameters['W2']}")

ValueError: not enough values to unpack (expected 5, got 3)

I adjusted the initialize_adam function; here is the error:
UnboundLocalError                         Traceback (most recent call last)
<ipython-input> in <module>
      1 parameters = initialize_adam_test_case()
      2
----> 3 v, s = initialize_adam(parameters)
      4 print("v[\"dW1\"] = \n" + str(v["dW1"]))
      5 print("v[\"db1\"] = \n" + str(v["db1"]))

<ipython-input> in initialize_adam(parameters)
     27
     28     # Initialize v, s. Input: "parameters". Outputs: "v, s".
---> 29     for l in range(1, l + 1):
     30     ### START CODE HERE ### (approx. 4 lines)
     31     v["dW" + str(l)] = np.zeros_like(parameters["W" + str(l)])

UnboundLocalError: local variable 'l' referenced before assignment

NameError                                 Traceback (most recent call last)
<ipython-input> in <module>
      1 parameters = initialize_adam_test_case()
      2
----> 3 v, s = initialize_adam(parameters)
      4 print("v[\"dW1\"] = \n" + str(v["dW1"]))
      5 print("v[\"db1\"] = \n" + str(v["db1"]))

<ipython-input> in initialize_adam(parameters)
     27
     28     # Initialize v, s. Input: "parameters". Outputs: "v, s".
---> 29     for L in range(1, l + 1):
     30     ### START CODE HERE ### (approx. 4 lines)
     31     v["dW" + str(l)] = np.zeros_like(parameters["W" + str(l)])

NameError: name 'l' is not defined

The variable L holds the number of layers in the neural network; in our case it's 2. So, for each layer, you want to initialize the gradients/derivatives. You can refer to the first code block in Python Looping Through a Range to solve it.
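
For instance:

L = 2                        # number of layers in this test case
for l in range(1, L + 1):    # range(1, L + 1) yields 1, then 2
    print("dW" + str(l))     # prints dW1, then dW2 -- never dW0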

I have solved the range problem. There seems to be another error, in the update_parameters_with_adam function. This is the error:
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      7 epsilon = 1e-2
      8
----> 9 parameters, v, s, vc, sc = update_parameters_with_adam(parametersi, grads, vi, si, t, learning_rate, beta1, beta2, epsilon)
     10 print(f"W1 = \n{parameters['W1']}")
     11 print(f"W2 = \n{parameters['W2']}")

ValueError: not enough values to unpack (expected 5, got 3)

A new error:

ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      7 epsilon = 1e-2
      8
----> 9 parameters, v, s, vc, sc = update_parameters_with_adam(parametersi, grads, vi, si, t, learning_rate, beta1, beta2, epsilon)
     10 print(f"W1 = \n{parameters['W1']}")
     11 print(f"W2 = \n{parameters['W2']}")

<ipython-input> in update_parameters_with_adam(parameters, grads, v, s, t, learning_rate, beta1, beta2, epsilon)
     59     ### START CODE HERE ### (approx. 2 lines)
     60     parameters["W" + str(l)] = parameters["W" + str(l)] - learning_rate * v_corrected["dW" + str(l)] / np.sqrt(s_corrected["dW" + str(l)] + epsilon)
---> 61     parameters["b" + str(l)] = parameters["b" + str(l)] - learning_rate * v_corrected["db" + str(l)] / np.sqrt(s_corrected["db" + str(l)] + epsilon)
     62     ### END CODE HERE ###
     63

ValueError: operands could not be broadcast together with shapes (2,1) (3,1)

I am really sorry to bother you.

Hi, I guess you have already solved this one, but just in case: pay attention to the way the loop was defined, for l in range(1, L + 1); your change "broke" the loop.

In fact, you were not expected to change that line of code; in this function you are iterating over the number of layers in the neural network (the variable L defined at the beginning of the function).
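
For reference, here is a generic sketch of the full per-layer update that loop is meant to perform, returning the five values the test cell unpacks (variable names follow the thread; this is not the notebook's exact code):

import numpy as np

def adam_update_sketch(parameters, grads, v, s, t, learning_rate, beta1, beta2, epsilon):
    L = len(parameters) // 2                  # each layer contributes a W and a b
    v_corrected, s_corrected = {}, {}
    for l in range(1, L + 1):                 # l runs over the layers 1, ..., L
        for key in ("dW" + str(l), "db" + str(l)):
            # moving averages of the gradients and of their squares
            v[key] = beta1 * v[key] + (1 - beta1) * grads[key]
            s[key] = beta2 * s[key] + (1 - beta2) * grads[key] ** 2
            # bias-corrected estimates
            v_corrected[key] = v[key] / (1 - beta1 ** t)
            s_corrected[key] = s[key] / (1 - beta2 ** t)
        # parameter update (epsilon placed inside the square root, as in the thread's code)
        parameters["W" + str(l)] -= learning_rate * v_corrected["dW" + str(l)] / np.sqrt(s_corrected["dW" + str(l)] + epsilon)
        parameters["b" + str(l)] -= learning_rate * v_corrected["db" + str(l)] / np.sqrt(s_corrected["db" + str(l)] + epsilon)
    return parameters, v, s, v_corrected, s_corrected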

Hi,

To debug this error, I would suggest checking the contents of parameters, v_corrected, and s_corrected before performing the assignment, i.e. printing those variables to pinpoint any potential error in the way you are combining them.
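
For example, a hypothetical snippet you could run just above the failing line:

for l in range(1, L + 1):
    # a shape mismatch such as (2, 1) vs (3, 1) will show up here
    print("W" + str(l), parameters["W" + str(l)].shape,
          "v_c", v_corrected["dW" + str(l)].shape,
          "s_c", s_corrected["dW" + str(l)].shape)
    print("b" + str(l), parameters["b" + str(l)].shape,
          "v_c", v_corrected["db" + str(l)].shape,
          "s_c", s_corrected["db" + str(l)].shape)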

SOLVED. Thanks everyone