Trouble with the Sequence Models DeepLearning.AI course programming assignment

Having a problem with the Programming Assignment: Transformers Architecture with TensorFlow. This project is in week four of the DeepLearning.AI Sequence Models course.
Although I get positive indication that every other exercise in the Transformers Architecture with TensorFlow notebook completes successfully, I get a zero grade for the entire unit. This is preventing me from finishing the Sequence Models course, as I have successfully completed every other lab and quiz in the four-week program. I did try downloading a fresh copy of the week 4 transformers lab, but I consistently get the same error for the one section on transformer networks (see below).


AssertionError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)

~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
73 assert np.allclose(weights, [[0.30719590187072754, 0.5064803957939148, 0.0, 0.18632373213768005],
74 [0.3836517333984375, 0.3836517333984375, 0.0, 0.2326965481042862],
---> 75 [0.3836517333984375, 0.3836517333984375, 0.0, 0.2326965481042862]]), "Wrong masked weights"
76 assert np.allclose(attention, [[0.6928040981292725, 0.18632373213768005],
77 [0.6163482666015625, 0.2326965481042862],

AssertionError: Wrong masked weights

hi @fred_feisullin

Does this help: incorrect masked weight codes?


Thank you for your assistance and for the follow-up. The info you provided helped to get rid of the errors for this section of the lab, but I still get a grade of zero for the entire lab, although I pass all the tests in every section of the assignment. Any thoughts on this?

What message do you get from the grader (click “Show grader output”)?

There are two general classes of issues that can cause this syndrome:

  1. Your code is not general and fails the different tests from the grader. E.g. you hard-coded dimensions or referenced global variables.
  2. You accidentally modified some part of the notebook that the grader depends on, but which is not really related to the solution code.

We can usually tell the difference between those two scenarios from the grader messages.
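For example (a purely hypothetical illustration, not the assignment's actual template), the difference between hard-coded and general code might look like this:

import tensorflow as tf

# A notebook-level variable defined by an earlier test cell
k = tf.random.uniform((3, 4, 4))

# Not general: hard-codes the key dimension and reads the global k defined
# above instead of its own argument, so it breaks on the grader's tests,
# which use different shapes and variable names.
def attention_logits_bad(q):
    dk = 4.0
    return tf.matmul(q, k, transpose_b=True) / dk ** 0.5

# General: everything is derived from the function's own arguments.
def attention_logits_ok(q, k):
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    return tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(dk)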


Passing the tests in the notebook does not prove your code is perfect. The grader uses different tests.


Message states: Cell #4. Can't compile the student's code. Error: AssertionError('You must return a numpy ndarray'). But when I try to return a numpy array instead, by using return output.numpy(), attention_weights.numpy() in place of the original return output, attention_weights, I get error messages to the contrary:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)

~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
56
57 attention, weights = target(q, k, v, None)
---> 58 assert tf.is_tensor(weights), "Weights must be a tensor"
59 assert tuple(tf.shape(weights).numpy()) == (q.shape[0], k.shape[1]), f"Wrong shape. We expected ({q.shape[0]}, {k.shape[1]})"
60 assert np.allclose(weights, [[0.2589478, 0.42693272, 0.15705977, 0.15705977],

AssertionError: Weights must be a tensor

Yes, but then why am I getting a zero grade for the entire lab? It seems like a low-probability event given that absolutely everything checks out and each section reports as successfully completed. I expected to score a lot higher than zero.


When you get a compile error, that means the grader can’t run any of your code, so you get 0 for everything.

Now the question is figuring out what is meant by those errors, which seem a bit contradictory, as you say.

Please check your DMs for a message from me.

Again, thanks for your replies. I am not sure how my code cannot be interpreted, especially since there are output plots that get generated, tables that get compiled with the correct content/values, and so on along the way. I started over with a new lab file, but get the same final results. Lastly, what do you mean by DMs?

DMs are Direct Messages or private conversations. You can recognize them in your “feed” by the little envelope icons.

The point is that that is only the case when you run it locally. There is some error in your code that causes it to fail by throwing exceptions when run in the context of the grader. If the grader hits an exception, you get 0 points because the execution cannot complete.

@fred_feisullin

Debug your code according to the instructions below.

For the code that multiplies q and k transposed: you are using the wrong Python function for the multiplication.
The additional hints section just before the graded cell mentions that
you may find tf.matmul useful for matrix multiplication (check how you can use the parameter transpose_b).

Next, to calculate dk, kindly use tf.shape rather than k.shape. Also, as you know, dk is the dimension of the keys, which is used to scale everything down so the softmax doesn't explode, so the dimension index is [-1], not -2.
On the next code line, to calculate the scaled attention logits, the denominator should be tf.math.sqrt(dk) and not dk**0.5, since dk enters the calculation under a square root.

While adding the mask to the scaled tensor, your code is nearly right, but we have seen that even a missing decimal point changes the scaled weights: the instructions before the graded cell say to multiply (1. - mask) by -1e9, but you multiplied (1 - mask). Make sure you multiply exactly the way the instructions describe.

Finally, the softmax is normalized on the last axis, so pass axis=-1.
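Putting those points together, here is a minimal sketch of what scaled dot-product attention generally looks like (my own illustration of the hints above, not the official solution; your notebook's template may differ in its details):

import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask):
    # Multiply q by k transposed with tf.matmul and transpose_b
    matmul_qk = tf.matmul(q, k, transpose_b=True)

    # dk is the dimension of the keys: the last axis of k, via tf.shape, cast to float
    dk = tf.cast(tf.shape(k)[-1], tf.float32)

    # Scale down by sqrt(dk) so the softmax does not explode
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    # Add the mask, noting the decimal point in (1. - mask)
    if mask is not None:
        scaled_attention_logits += (1. - mask) * -1e9

    # Softmax is normalized on the last axis
    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)

    # Weighted sum of the values
    output = tf.matmul(attention_weights, v)
    return output, attention_weights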


If I am interpreting this message correctly (Cell #14. Can't compile the student's code. Error: AssertionError('Weights must be a tensor')), the code generating cell 14 is code that is already included in the notebook, meaning uneditable code: please see the attached figure.


You’re not interpreting it correctly.

When you get grader feedback that names a specific cell number, that number does not refer to the cells in your notebook.

The grader builds a new notebook for grading, by inserting the cells from your notebook that are marked for grading into a different notebook template that is used only for grading.

You do not have access to this grader template.

Also note that when your code cells make the grader crash, you get a zero score and exactly the same error message for all functions, so those messages are not really useful for isolating the problem.

A screen capture image of the detailed grader feedback (not a text copy-and-paste) would be most helpful.

Thanks for the background info. There is not a whole lot of info in the grader error.

Check your personal messages for instructions.

In the UI profile, they’re actually called “Personal messages”.

Notice that there are no tests or assertions in the create_look_ahead_mask function or in the cell that calls it. It literally just calls it and then prints the result. So that error message you show from the grader literally cannot be from that cell, right? As Tom pointed out, there is no reliable way to map the cell numbers from the grader to your code.

You can search the file public_tests.py to see where that assert error appears. It is in the test for scaled_dot_product_attention, so the place to start debugging is there. Deepti has given you a number of points to consider, which address code in that function.
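For example, you could run something like this from a notebook cell to find where that assertion lives (assuming public_tests.py is in the current working directory, as in the ~/work/W4A1/ paths shown in the tracebacks above):

# Print the line numbers in public_tests.py where the failing assertion appears
with open("public_tests.py") as f:
    for num, line in enumerate(f, start=1):
        if "Weights must be a tensor" in line:
            print(num, line.rstrip())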
