Optimisation model

Hi, I am having a problem with the update function for the Adam algorithm. Can anyone explain it?

Hi @flaves, and welcome to Discourse. Are you referring to the update that the Adam algorithm applies to the model parameters? Can you be more specific about which part of the update is unclear?

Yes, I am referring to it:

def update_parameters_with_adam(parameters, grads, v, s, t, learning_rate=0.01,
                                beta1=0.9, beta2=0.999, epsilon=1e-8):

This function. My code keeps giving an error. Here is the error message:
KeyError                                  Traceback (most recent call last)
<ipython-input> in <module>
      7 epsilon = 1e-2
      8
----> 9 parameters, v, s, vc, sc = update_parameters_with_adam(parametersi, grads, vi, si, t, learning_rate, beta1, beta2, epsilon)
     10 print(f"W1 = \n{parameters['W1']}")
     11 print(f"W2 = \n{parameters['W2']}")

<ipython-input> in update_parameters_with_adam(parameters, grads, v, s, t, learning_rate, beta1, beta2, epsilon)
     34     # Moving average of the gradients. Inputs: "v, grads, beta1". Output: "v".
     35     ### START CODE HERE ### (approx. 2 lines)
---> 36     v["dW" + str(l)] = beta1 * v["dW" + str(l)] + (1 - beta1) * grads["dW" + str(l)]
     37     v["db" + str(l)] = beta1 * v["db" + str(l)] + (1 - beta1) * grads["db" + str(l)]
     38     ### END CODE HERE ###

KeyError: 'dW0'

If I'm not wrong, the code at line 36 is inside a for loop. That loop runs the index from 0 to L - 1, so str(l) is "0" on the first iteration. Are you sure the dictionary was initialized with keys for l and not l + 1?
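
For instance, with a hypothetical two-layer setup:

v = {"dW1": 0, "db1": 0, "dW2": 0, "db2": 0}  # keys are 1-indexed
L = 2
for l in range(L):        # l takes the values 0 and 1
    v["dW" + str(l)]      # first iteration looks up "dW0" -> KeyError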

I have changed it to str(l + 1), but then a new error occurred.

NameError                                 Traceback (most recent call last)
<ipython-input> in <module>
----> 1 parametersi, grads, vi, si = update_parameters_with_adam_test_case()
      2
      3 t = 2
      4 learning_rate = 0.02
      5 beta1 = 0.8

NameError: name 'update_parameters_with_adam_test_case' is not defined

Hi, @yanivh.

In my notebook, weights and biases (and their derivatives) are 1-indexed, so the loop runs from 1 to L. Maybe there was a bug in a previous version of the notebook.
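
Schematically, the moving-average step then looks like this (a sketch under that 1-indexing convention, not the notebook's exact code):

L = len(parameters) // 2     # number of layers; each has a W and a b
for l in range(1, L + 1):    # 1-indexed: l = 1, ..., L
    v["dW" + str(l)] = beta1 * v["dW" + str(l)] + (1 - beta1) * grads["dW" + str(l)]
    v["db" + str(l)] = beta1 * v["db" + str(l)] + (1 - beta1) * grads["db" + str(l)]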

@flaves, did you make any changes to the for statement?

Yes, I did. I made changes to the for statement after asking for help offline…

Even after loading the initial version, there was still an error.

What could be the problem?

Is there a way I can set this lab back to default, so I can code from scratch?

Restarting the kernel doesn't do that; it only goes back to the previous checkpoint.

Hi @flaves, you can do it by clicking on Help | Get latest version. I would recommend downloading your current Jupyter notebook in case you want to reuse part of the code you have already written.

Kind regards

The error you are having, KeyError: 'dW0', suggests that there is something wrong with the for loop: it should go over range(1, L + 1), and therefore should never look up dW0. You should also check the previous function, initialize_adam, as that is where the dictionaries are initialized.
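
For reference, a minimal sketch of what that initialization is meant to produce, with zero arrays shaped like each parameter (named initialize_adam_sketch so as not to suggest it is the notebook's exact code):

import numpy as np

def initialize_adam_sketch(parameters):
    # parameters holds "W1", "b1", ..., "WL", "bL"
    L = len(parameters) // 2
    v, s = {}, {}
    for l in range(1, L + 1):  # 1-indexed layers, so no "dW0" key is ever created
        v["dW" + str(l)] = np.zeros_like(parameters["W" + str(l)])
        v["db" + str(l)] = np.zeros_like(parameters["b" + str(l)])
        s["dW" + str(l)] = np.zeros_like(parameters["W" + str(l)])
        s["db" + str(l)] = np.zeros_like(parameters["b" + str(l)])
    return v, s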

I reloaded it. Still the same error.
My for loop iterated from 0 to L - 1 (range(L)), and then the key string in my code used l + 1 to be able to find the key.
This is the error:
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      7 epsilon = 1e-2
      8
----> 9 parameters, v, s, vc, sc = update_parameters_with_adam(parametersi, grads, vi, si, t, learning_rate, beta1, beta2, epsilon)
     10 print(f"W1 = \n{parameters['W1']}")
     11 print(f"W2 = \n{parameters['W2']}")

ValueError: not enough values to unpack (expected 5, got 3)

I adjusted the initialize_adam function; here is the error:
UnboundLocalError                         Traceback (most recent call last)
<ipython-input> in <module>
      1 parameters = initialize_adam_test_case()
      2
----> 3 v, s = initialize_adam(parameters)
      4 print("v[\"dW1\"] = \n" + str(v["dW1"]))
      5 print("v[\"db1\"] = \n" + str(v["db1"]))

<ipython-input> in initialize_adam(parameters)
     27
     28     # Initialize v, s. Input: "parameters". Outputs: "v, s".
---> 29     for l in range(1, l + 1):
     30     ### START CODE HERE ### (approx. 4 lines)
     31     v["dW" + str(l)] = np.zeros_like(parameters["W" + str(l)])

UnboundLocalError: local variable 'l' referenced before assignment

NameError                                 Traceback (most recent call last)
<ipython-input> in <module>
      1 parameters = initialize_adam_test_case()
      2
----> 3 v, s = initialize_adam(parameters)
      4 print("v[\"dW1\"] = \n" + str(v["dW1"]))
      5 print("v[\"db1\"] = \n" + str(v["db1"]))

<ipython-input> in initialize_adam(parameters)
     27
     28     # Initialize v, s. Input: "parameters". Outputs: "v, s".
---> 29     for L in range(1, l + 1):
     30     ### START CODE HERE ### (approx. 4 lines)
     31     v["dW" + str(l)] = np.zeros_like(parameters["W" + str(l)])

NameError: name 'l' is not defined

The variable L holds the number of layers in the neural network; in our case it's 2. So, for each layer, you want to initialize the gradients/derivatives. You can refer to the first code block in Python Looping Through a Range to solve it.
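
For instance:

L = 2                        # number of layers in this test case
for l in range(1, L + 1):    # range(1, L + 1) yields 1, then 2
    print("dW" + str(l))     # prints dW1, then dW2 -- never dW0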

I have solved the range problem. There seems to be another error, in the update_parameters_with_adam function. This is the error:
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      7 epsilon = 1e-2
      8
----> 9 parameters, v, s, vc, sc = update_parameters_with_adam(parametersi, grads, vi, si, t, learning_rate, beta1, beta2, epsilon)
     10 print(f"W1 = \n{parameters['W1']}")
     11 print(f"W2 = \n{parameters['W2']}")

ValueError: not enough values to unpack (expected 5, got 3)

A new error:

ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      7 epsilon = 1e-2
      8
----> 9 parameters, v, s, vc, sc = update_parameters_with_adam(parametersi, grads, vi, si, t, learning_rate, beta1, beta2, epsilon)
     10 print(f"W1 = \n{parameters['W1']}")
     11 print(f"W2 = \n{parameters['W2']}")

<ipython-input> in update_parameters_with_adam(parameters, grads, v, s, t, learning_rate, beta1, beta2, epsilon)
     59     ### START CODE HERE ### (approx. 2 lines)
     60     parameters["W" + str(l)] = parameters["W" + str(l)] - learning_rate * v_corrected["dW" + str(l)] / np.sqrt(s_corrected["dW" + str(l)] + epsilon)
---> 61     parameters["b" + str(l)] = parameters["b" + str(l)] - learning_rate * v_corrected["db" + str(l)] / np.sqrt(s_corrected["db" + str(l)] + epsilon)
     62     ### END CODE HERE ###
     63

ValueError: operands could not be broadcast together with shapes (2,1) (3,1)

I am really sorry to bother you.

Hi, I guess you have already solved this one, but just in case: pay attention to the way the loop was defined, for l in range(1, L + 1); your change "broke" the loop.

In fact, you were not expected to change that line of code; in this function you are iterating over the number of layers in the neural network (the variable L defined at the beginning of the function).
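
For reference, here is a generic sketch of the full per-layer update that loop is meant to perform, returning the five values the test cell unpacks (variable names follow the thread; this is not the notebook's exact code):

import numpy as np

def adam_update_sketch(parameters, grads, v, s, t, learning_rate, beta1, beta2, epsilon):
    L = len(parameters) // 2                  # each layer contributes a W and a b
    v_corrected, s_corrected = {}, {}
    for l in range(1, L + 1):                 # l runs over the layers 1, ..., L
        for key in ("dW" + str(l), "db" + str(l)):
            # moving averages of the gradients and of their squares
            v[key] = beta1 * v[key] + (1 - beta1) * grads[key]
            s[key] = beta2 * s[key] + (1 - beta2) * grads[key] ** 2
            # bias-corrected estimates
            v_corrected[key] = v[key] / (1 - beta1 ** t)
            s_corrected[key] = s[key] / (1 - beta2 ** t)
        # parameter update (epsilon placed inside the square root, as in the thread's code)
        parameters["W" + str(l)] -= learning_rate * v_corrected["dW" + str(l)] / np.sqrt(s_corrected["dW" + str(l)] + epsilon)
        parameters["b" + str(l)] -= learning_rate * v_corrected["db" + str(l)] / np.sqrt(s_corrected["db" + str(l)] + epsilon)
    return parameters, v, s, v_corrected, s_corrected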

Hi,

To debug this error, I would suggest checking the contents of parameters, v_corrected, and s_corrected before performing the assignment, i.e. printing those variables to pinpoint any potential error in the way you are combining them.
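
For example, a hypothetical snippet you could run just above the failing line:

for l in range(1, L + 1):
    # a shape mismatch such as (2, 1) vs (3, 1) will show up here
    print("W" + str(l), parameters["W" + str(l)].shape,
          "v_c", v_corrected["dW" + str(l)].shape,
          "s_c", s_corrected["dW" + str(l)].shape)
    print("b" + str(l), parameters["b" + str(l)].shape,
          "v_c", v_corrected["db" + str(l)].shape,
          "s_c", s_corrected["db" + str(l)].shape)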

SOLVED. Thanks everyone