Having a problem with the Programming Assignment: Transformers Architecture with TensorFlow. This assignment is in week four of DeepLearning.AI's Sequence Models course.
Although every other exercise in the Transformers Architecture with TensorFlow notebook reports successful completion, I get a zero grade for the entire unit. This is preventing me from finishing the Sequence Models course, as I have successfully completed every other lab and quiz in the four-week program. I tried downloading a fresh copy of the week 4 transformers lab, but I consistently get the same error for the one section on transformer networks (see below):
AssertionError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)
Thank you for your assistance and for the follow-up. The info you provided helped to get rid of the errors for this section of the lab, but I still get a grade of zero for the entire lab, although I passed all tests for each and every section of the assignment. Any thoughts on this?
The message states: Cell #4. Can't compile the student's code. Error: AssertionError('You must return a numpy ndarray'). But when I try to return numpy arrays instead, using return output.numpy(), attention_weights.numpy() in place of the original return output, attention_weights, I get the contrary error message: ---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 scaled_dot_product_attention_test(scaled_dot_product_attention)
~/work/W4A1/public_tests.py in scaled_dot_product_attention_test(target)
56
57 attention, weights = target(q, k, v, None)
---> 58 assert tf.is_tensor(weights), "Weights must be a tensor"
59 assert tuple(tf.shape(weights).numpy()) == (q.shape[0], k.shape[1]), f"Wrong shape. We expected ({q.shape[0]}, {k.shape[1]})"
60 assert np.allclose(weights, [[0.2589478, 0.42693272, 0.15705977, 0.15705977],
Yes, but then why am I getting a zero grade for the entire lab? It seems like a low-probability event given that absolutely everything checks out and each section reports as successfully completed. I expected to score a lot higher than zero.
Again, thanks for your replies. Not sure how my code can fail to be interpreted, especially since output plots get generated, tables get compiled with the correct content/values, and so on along the way. I started over with a new lab file, but get the same final results. Lastly, what do you mean by DMs?
The point is that it only passes when you run it locally. There is some error in your code that causes it to fail by throwing various exceptions when run in the context of the grader. If the grader hits an exception, you get 0 points because the execution cannot complete.
For the code that multiplies q and k transposed: you are using the wrong Python function to multiply q and k.
In the additional hints section just before the grade cell, it mentions
you may find tf.matmul useful for matrix multiplication (check how you can use the parameter transpose_b)
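To illustrate that hint (with toy tensors of my own choosing, not the assignment's data), tf.matmul with transpose_b=True multiplies q by the transpose of k in a single call:

```python
import tensorflow as tf

# Toy query and key matrices, just to show the parameter
q = tf.constant([[1., 0.],
                 [0., 1.]])          # shape (2, 2)
k = tf.constant([[1., 2.],
                 [3., 4.]])          # shape (2, 2)

# transpose_b=True transposes the last two axes of k before multiplying,
# so this computes q @ k^T without a separate tf.transpose call
matmul_qk = tf.matmul(q, k, transpose_b=True)
```

Since q here is the identity, matmul_qk comes out as k transposed.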
Next, to calculate dk, kindly use tf.shape rather than k.shape. Also, as you know, dk is the dimension of the keys, which is used to scale everything down so the softmax doesn't explode. So the dimension to take is [-1], not -2.
In the next code line, to calculate the scaled attention logits, the denominator should use tf.math.sqrt(dk), not dk**0.5, since the formula calls for the square root of dk.
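A small sketch of those two steps (toy shape of my own choosing; the tf.cast is my addition, needed because tf.shape returns int32 while tf.math.sqrt expects a float):

```python
import tensorflow as tf

k = tf.zeros((3, 4, 8))   # toy keys: (batch, seq_len, depth)

# tf.shape(k)[-1] is the key depth -- the last axis, hence [-1] and not -2;
# cast to float so tf.math.sqrt accepts it
dk = tf.cast(tf.shape(k)[-1], tf.float32)

# The attention logits are divided by sqrt(dk)
scale = tf.math.sqrt(dk)
```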
While adding the mask to the scaled tensor, your code is close, but we have seen that even a missing decimal point changes the scaled weights: the instructions say to multiply (1. - mask) by -1e9, but you multiplied (1 - mask). Make sure you multiply exactly the way the instructions before the graded cell describe.
Finally, the softmax is normalized on the last axis, so pass axis=-1.
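Putting those hints together, here is a minimal sketch of what such a function could look like (variable names are my own assumptions; the notebook's docstring and the instructions before the graded cell take precedence):

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask):
    # Multiply q and k transposed, using the transpose_b parameter
    matmul_qk = tf.matmul(q, k, transpose_b=True)

    # dk is the key dimension: tf.shape(k)[-1], cast to float for the sqrt
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    # Multiply (1. - mask) by -1e9 -- note the decimal point
    if mask is not None:
        scaled_attention_logits += (1. - mask) * -1e9

    # Softmax is normalized on the last axis (axis=-1)
    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)
    output = tf.matmul(attention_weights, v)

    # Return the tensors themselves, not .numpy() copies
    return output, attention_weights
```

Note the return statement hands back tensors, which is what the "Weights must be a tensor" assertion in public_tests.py checks for.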
If I am interpreting this message correctly (Cell #14. Can't compile the student's code. Error: AssertionError('Weights must be a tensor')), the code generating cell 14 is code that is already included in the notebook, meaning uneditable code: please see the attached figure.
When you get grader feedback that names a specific cell number, it does not refer to the cell numbering in your notebook.
The grader builds a new notebook for grading by inserting the cells from your notebook that are marked for grading into a different notebook template that is used only for grading.
Also note that when your code cells make the grader crash, you get a zero score and exactly the same error message for all functions, so those messages are not really useful in isolating the problem.
A screen capture image of the detailed grader feedback (not a text copy-and-paste) would be most helpful.
Notice that there are no tests or assertions in the create_look_ahead_mask function or in the cell that calls it. It literally just calls it and then prints the result. So that error message you show from the grader literally cannot be from that cell, right? As Tom pointed out, there is no reliable way to map the cell numbers from the grader to your code.
You can search the file public_tests.py to see where that assert error appears. It is in the test for scaled_dot_product_attention, so the place to start debugging is there. Deepti has given you a number of points to consider, which address code in that function.