Having a problem with the Programming Assignment: Transformers Architecture with TensorFlow. This assignment is in week four of DeepLearning.AI's Sequence Models course.
Although every other exercise in the Transformers Architecture with TensorFlow notebook reports successful completion, I get a zero grade for the entire unit. This is preventing me from finishing the Sequence Models course, as I have successfully completed every other lab and quiz in the four-week program. I tried downloading a fresh copy of the week 4 transformers lab, but I consistently get the same error for the one section on transformer networks (see below):
AssertionError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)
Thank you for your assistance and for the follow-up. The info you provided helped to get rid of the errors for this section of the lab, but I still get a grade of zero for the entire lab, although I passed all tests for each and every section of the assignment. Any thoughts on this?
The message states: Cell #4. Can't compile the student's code. Error: AssertionError('You must return a numpy ndarray'). But when I try to return numpy arrays instead, using return output.numpy(), attention_weights.numpy() in place of the original return output, attention_weights, I get the contrary error message: ---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)
~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
56
57 attention, weights = target(q, k, v, None)
---> 58 assert tf.is_tensor(weights), "Weights must be a tensor"
59 assert tuple(tf.shape(weights).numpy()) == (q.shape[0], k.shape[1]), f"Wrong shape. We expected ({q.shape[0]}, {k.shape[1]})"
60 assert np.allclose(weights, [[0.2589478, 0.42693272, 0.15705977, 0.15705977],
Yes, but then why am I getting a zero grade for the entire lab? It seems like a low-probability event given that absolutely everything checks out and each section reports as successfully completed. I expected to score a lot higher than zero.
Again, thanks for your replies. Not sure how my code can fail to be interpreted, especially since output plots get generated, tables get compiled with the correct content/values, and so on along the way. I started over with a new lab file, but get the same final results. Lastly, what do you mean by DMs?
The point is that it only passes when you run it locally. There is some error in your code that causes it to fail by throwing various exceptions when run in the context of the grader. If the grader hits an exception, you get 0 points because the execution cannot complete.
For the code that multiplies q and k transposed: you are using the wrong Python function to multiply q and k.
In the additional hints section just before the grade cell, it mentions
you may find tf.matmul useful for matrix multiplication (check how you can use the parameter transpose_b)
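To illustrate that hint (with toy tensors of my own choosing, not the assignment's data), tf.matmul with transpose_b=True multiplies q by the transpose of k in a single call:

```python
import tensorflow as tf

# Toy query and key matrices, just to show the parameter
q = tf.constant([[1., 0.],
                 [0., 1.]])          # shape (2, 2)
k = tf.constant([[1., 2.],
                 [3., 4.]])          # shape (2, 2)

# transpose_b=True transposes the last two axes of k before multiplying,
# so this computes q @ k^T without a separate tf.transpose call
matmul_qk = tf.matmul(q, k, transpose_b=True)
```

Since q here is the identity, matmul_qk comes out as k transposed.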
Next, to calculate dk, kindly use tf.shape rather than k.shape. Also, as you know, dk is the dimension of the keys, which is used to scale everything down so the softmax doesn't explode. So the dimension to take is [-1], not -2.
In the next code line, to calculate the scaled attention logits, the denominator should use tf.math.sqrt(dk), not dk**0.5, since the formula calls for the square root of dk.
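A small sketch of those two steps (toy shape of my own choosing; the tf.cast is my addition, needed because tf.shape returns int32 while tf.math.sqrt expects a float):

```python
import tensorflow as tf

k = tf.zeros((3, 4, 8))   # toy keys: (batch, seq_len, depth)

# tf.shape(k)[-1] is the key depth -- the last axis, hence [-1] and not -2;
# cast to float so tf.math.sqrt accepts it
dk = tf.cast(tf.shape(k)[-1], tf.float32)

# The attention logits are divided by sqrt(dk)
scale = tf.math.sqrt(dk)
```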
While adding the mask to the scaled tensor, your code is close, but we have seen that even a missing decimal point changes the scaled weights: the instructions say to multiply (1. - mask) by -1e9, but you multiplied (1 - mask). Make sure you multiply exactly the way the instructions before the graded cell describe.
Finally, the softmax is normalized on the last axis, so pass axis=-1.
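Putting those hints together, here is a minimal sketch of what such a function could look like (variable names are my own assumptions; the notebook's docstring and the instructions before the graded cell take precedence):

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask):
    # Multiply q and k transposed, using the transpose_b parameter
    matmul_qk = tf.matmul(q, k, transpose_b=True)

    # dk is the key dimension: tf.shape(k)[-1], cast to float for the sqrt
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    # Multiply (1. - mask) by -1e9 -- note the decimal point
    if mask is not None:
        scaled_attention_logits += (1. - mask) * -1e9

    # Softmax is normalized on the last axis (axis=-1)
    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)
    output = tf.matmul(attention_weights, v)

    # Return the tensors themselves, not .numpy() copies
    return output, attention_weights
```

Note the return statement hands back tensors, which is what the "Weights must be a tensor" assertion in public_tests.py checks for.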
If I am interpreting this message correctly (Cell #14. Can't compile the student's code. Error: AssertionError('Weights must be a tensor')), the code generating cell 14 is code that is already included in the notebook, meaning uneditable code: please see the attached figure.
When you get grader feedback that names a specific cell number, it does not refer to the cell numbering in your notebook.
The grader builds a new notebook for grading by inserting the cells from your notebook that are marked for grading into a different notebook template that is used only for grading.
Also note that when your code cells make the grader crash, you get a zero score and exactly the same error message for all functions, so those messages are not really useful in isolating the problem.
A screen capture image of the detailed grader feedback (not a text copy-and-paste) would be most helpful.
Notice that there are no tests or assertions in the create_look_ahead_mask function or in the cell that calls it. It literally just calls it and then prints the result. So that error message you show from the grader literally cannot be from that cell, right? As Tom pointed out, there is no reliable way to map the cell numbers from the grader to your code.
You can search the file public_tests.py to see where that assert error appears. It is in the test for scaled_dot_product_attention, so the place to start debugging is there. Deepti has given you a number of points to consider, which address code in that function.