C5W4 A1 exercise 6 DecoderLayer

I have the following errors when running this block. Can anyone help me?


AssertionError                            Traceback (most recent call last)
<ipython-input> in <module>
      1 # UNIT TEST
----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

~/work/W4A1/public_tests.py in DecoderLayer_test(target, create_look_ahead_mask)
    180     assert np.allclose(attn_w_b1[0, 0, 1], [0.5271505, 0.47284946, 0.], atol=1e-2), "Wrong values in attn_w_b1. Check the call to self.mha1"
    181     assert np.allclose(attn_w_b2[0, 0, 1], [0.32048798, 0.390301, 0.28921106]), "Wrong values in attn_w_b2. Check the call to self.mha2"
--> 182     assert np.allclose(out[0, 0], [-0.22109576, -1.5455486, 0.852692, 0.9139523]), "Wrong values in out"
    183
    184

AssertionError: Wrong values in out

Please click my name and message your notebook as an attachment.

Please fix your code based on this code comment:

# apply layer normalization (layernorm3) to the sum of the ffn output and the output of the second block

Since you’re given the training parameter in def call, it’s a good idea to pass it to layers where applicable.
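Without giving away the assignment's solution, the step that comment describes can be sketched in plain NumPy. All names here (`ffn_output`, `mult_attn_out2`) are hypothetical stand-ins for the decoder's intermediate tensors; in the notebook you would call `self.layernorm3` (a `tf.keras.layers.LayerNormalization` instance) and pass `training=training` to layers such as dropout that accept it.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize over the last axis, as tf.keras.layers.LayerNormalization
    # does by default (scale/offset parameters omitted for brevity).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

# Hypothetical tensors standing in for the decoder's intermediate outputs:
# (batch, seq_len, d_model) = (2, 3, 4)
mult_attn_out2 = np.random.randn(2, 3, 4)  # output of the second attention block
ffn_output = np.random.randn(2, 3, 4)      # output of the feed-forward network
# (in training mode, dropout would be applied to ffn_output before the sum)

# The step the comment describes: layernorm3 applied to the residual sum
out = layer_norm(ffn_output + mult_attn_out2)
print(out.shape)  # (2, 3, 4)
```

The common bug behind "Wrong values in out" is exactly the one discussed in this thread: reusing `layernorm1` or `layernorm2` here, or normalizing `ffn_output` alone instead of the sum.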

Thank you for your hint. I realized that I used the wrong function where I should have used layernorm3.

Once that is fixed, the test passes.

The parameters in the subclass functions are very confusing. Do I need to go to the TensorFlow documentation to find out what the call arguments are for these layers, such as Dropout and MultiHeadAttention?

TensorFlow documentation is the go-to place for components provided by the library.
Please be specific about which parts of the notebook documentation are confusing so that I can forward your request to the staff for an update.

As Balaji says, you need to consult the TensorFlow documentation. The one other thing to point out is that TF leans pretty heavily on the concepts of OOP. Because the class hierarchies are usually several levels deep, they don't document every inherited attribute at the "leaf" level of a given API. So, for example, there are lots of methods of various APIs that inherit from the Layer class that are only documented at the top level and not at every inheritance level. You need to keep that in mind as you look at the documentation: if you don't find something explained, it probably means you need to "climb" the inheritance tree.
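As an illustration of that "climb" (pure Python, with toy classes standing in for the real TF ones): a method defined and documented only on the base class is still available on the subclass, and `inspect.getmro` shows the chain of classes whose documentation you may need to consult.

```python
import inspect

# Toy stand-ins: tf.keras.layers.Dropout really inherits from
# tf.keras.layers.Layer (through a deeper chain) in the same way.
class Layer:
    """Base class; where shared methods would be documented."""
    def get_config(self):
        return {"name": type(self).__name__}

class Dropout(Layer):
    """Leaf class; adds its own arguments but inherits the rest."""
    def __init__(self, rate):
        self.rate = rate

d = Dropout(0.1)
# The inherited method works on the leaf even though only Layer defines it:
print(d.get_config())  # {'name': 'Dropout'}
# The chain you may have to climb in the docs:
print([c.__name__ for c in inspect.getmro(Dropout)])  # ['Dropout', 'Layer', 'object']
```

The same reasoning applies to the call arguments asked about above: `training` is accepted by `Dropout.__call__` even though the machinery lives in the base `Layer` class.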

The other useful thing to note about the TF/Keras documentation is that it includes a lot of high-level tutorials and explanatory articles that lead you through a particular area, giving you a complete picture of how to accomplish a given high-level task using the various APIs.