Clarification about Upcoming Adam Optimization Video
Please note that in the next video, the picture shown at 2:44 is correct. Later in the video, however, the db² term loses its ².
I searched and found this thread about the elusive t, which is close to my question so I won’t start another thread.
Here’s my question: the input parameter t is never referenced within the body of the update_parameters_with_adam() method, so why is it there at all? It is also declared and assigned a value in the model() method, which calls update_parameters_with_adam(), but again I don’t see what purpose the t serves.
Am I missing something in the code? Thanks.
If you are not using t in your update_parameters_with_adam function, then that is a bug. Please take another careful look at the formulas as they are given both in the lectures and in the notebook. The image in the notebook is rendered quite small, so it’s hard to read, but it may help to use your browser to “zoom in” on it: you’ll see that t is used in several places as an exponent.
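To make the role of t concrete, here is a minimal sketch of what the update function typically looks like. This is my own illustration, not the official assignment solution, and the parameter-dictionary layout ("W1", "dW1", etc.) is an assumption based on how the course structures these exercises. The t appears only in the bias-correction denominators, as the exponent on beta1 and beta2:

```python
import numpy as np

def update_parameters_with_adam(parameters, grads, v, s, t,
                                learning_rate=0.01, beta1=0.9,
                                beta2=0.999, epsilon=1e-8):
    """One Adam update step (illustrative sketch, not the official
    solution). `t` is the timestep and shows up as the exponent in
    the bias-correction terms below."""
    L = len(parameters) // 2  # number of layers
    for l in range(1, L + 1):
        for p in ("W", "b"):
            g = grads["d" + p + str(l)]
            # Exponential moving averages of the gradient and its square
            v["d" + p + str(l)] = beta1 * v["d" + p + str(l)] + (1 - beta1) * g
            s["d" + p + str(l)] = beta2 * s["d" + p + str(l)] + (1 - beta2) * g ** 2
            # Bias correction -- note the exponent is t, NOT 2
            v_corrected = v["d" + p + str(l)] / (1 - beta1 ** t)
            s_corrected = s["d" + p + str(l)] / (1 - beta2 ** t)
            # Parameter update
            parameters[p + str(l)] -= (learning_rate * v_corrected
                                       / (np.sqrt(s_corrected) + epsilon))
    return parameters, v, s
```

If t never appears in your version, the bias correction is wrong for every timestep except whatever exponent you hard-coded.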
Thank you so much! What fooled me is that my code passed all the tests, but the testing parameters must be set up in such a way that my bug was not detected. Next time I’ll put on better glasses so I can see things like the little t. Again, thanks for the quick response.
Interesting. It’s always difficult to write test cases that cover all possible errors, but this is a pretty critical one. I will take a look and try to understand why the test cases miss that mistake. Maybe we can suggest improvements to the test cases.
So some good may come out of your experience in addition to the lesson about having your reading glasses handy.
Great. To help you - I originally misread the t as a 2, so I was squaring the biases. As I mentioned, it passed all tests and when I submitted for grading I received 100%. I have since corrected the code, and it still passes the tests. Thanks again for your help.
Thanks for the additional information. I checked, and it’s exactly that: the one test case in the notebook uses t = 2, and apparently the grader uses the same value.
Actual training would come out differently with the buggy implementation, but there is no test covering that case.
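The masking effect is easy to demonstrate. If the bug hard-codes the exponent 2 in the bias correction while the correct formula uses the timestep t, the two expressions agree exactly when t = 2, which is why a test pinned to that single value cannot catch it:

```python
beta1 = 0.9  # typical Adam default for the first-moment estimate

# Buggy correction hard-codes the exponent 2; the correct one uses t.
# They coincide exactly when t == 2 -- the value the test happens to use.
for t in (1, 2, 3):
    buggy = 1 - beta1 ** 2
    correct = 1 - beta1 ** t
    print(t, buggy == correct)  # True only for t == 2
```

A simple improvement to the test suite would be to run the update at two different timesteps (say t = 1 and t = 3) and check both results.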