C5W4: Transformer Architectures with TensorFlow

falonso · June 11, 2021, 11:18pm

The last assignment of Course 5, Week 4 was really a disaster. Nothing is understandable, and it is not well explained. It is much more difficult than the previous assignments. I ended up copying from the Transformer tutorial on the TensorFlow site (without being able to understand it). PLEASE IMPROVE IT!

arosacastillo · June 12, 2021, 10:58am

Hi Falonso,

Sorry to read about your frustration. The Transformers topic is indeed complex, but it is also very interesting and a hot field right now in the NN community.

Congrats on finishing the assignment despite the difficulties you mention. I believe the Coursera team is trying to improve every version of the notebook based on the feedback they receive, so I encourage you to send them a specific improvement note, listing for instance the numbers of the exercises that you consider poorly explained and the useful tips they should include. Whatever worked for you can work for other students as well, so your opinion is valuable. Try sending the feedback by email from the help center.

Happy learning,

Rosa

MuskaanManocha · June 12, 2021, 11:26am

I feel better after reading this, knowing that it is indeed a bit complex. But it is rightly placed at the end of the full specialization.

GordonRobinson · June 13, 2021, 7:58pm

To add a little to Arosa’s comment.

You are not the first to criticize this assignment! A few weeks ago there was a Zoom meeting between the course staff and mentors that was largely prompted by that criticism. Some of us provided detailed feedback to the course staff about our personal views of the assignment (and the week 4 lectures). The staff are listening and are discussing the issues with the instructors.

In my personal comments I also used phrases similar to your “nothing understood”, so I know how you feel. If there are specifics about what you don’t understand, please share them so that the staff and instructors can get even more views about how it could be improved.

steven.j36 · August 20, 2021, 2:20am

I pretty much agree with this sentiment. Why did it switch to object-oriented programming? I was still learning to use TensorFlow layers.

Subho · August 28, 2021, 10:39pm

Yes, the last assignment is really a disaster.

Mrtranducdung · September 21, 2021, 4:43am

Dear @arosacastillo ,
I have passed all the tests, but when I submit, I cannot pass the assignment. The message is: Cell #18. Can’t compile the student’s code. Error: AssertionError(‘Wrong values case 1’)
Could you please help me solve the problem?

arosacastillo · September 21, 2021, 8:11am

Hi Mrtranducdung,

Please have a look at the solutions proposed here:

It is always good practice to search the forum to see whether people have previously posted similar errors. I found many solutions on my own as a student this way.

Happy learning

Rosa

dheeraj5 · September 24, 2021, 11:38am

Hi, Dheeraj here. I am feeling really frustrated. I completed the whole course without trouble, but in the Transformer part I am unable to submit my answers: the grader shows 0/100. Please help me clear the assessment.

canonv · October 5, 2021, 5:33am

Same frustration here. I finished all other homework by myself, except this one.

mcsmrx · October 13, 2021, 4:43pm

I am stuck in scaled_dot_product_attention_test with an exception: AssertionError: Wrong masked weights

No idea how to proceed.

Russel_Crowe · October 14, 2021, 11:33am

I got the same error message.
I found out that I forgot to subtract mask from 1 before multiplying it.
When I had it like this:
scaled_attention_logits += (1-mask) * -1.0e9
… things worked out. I hope this was helpful!
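
For anyone hitting the same “Wrong masked weights” error, here is a minimal NumPy sketch of why the (1 - mask) matters. It is not the assignment’s code: the function names are made up for illustration, and it assumes the mask uses 1 for positions to keep and 0 for positions to hide, which is what the fix above implies.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum first so exp() cannot overflow.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def masked_attention_weights(q, k, mask=None):
    # Hypothetical helper: scaled dot-product logits followed by softmax.
    d_k = k.shape[-1]
    logits = q @ k.T / np.sqrt(d_k)
    if mask is not None:
        # Assumed convention: mask == 1 keeps a position, mask == 0 hides it.
        # (1 - mask) is 1 only at hidden positions, so only those get -1e9.
        logits = logits + (1.0 - mask) * -1.0e9
    return softmax(logits, axis=-1)

q = np.array([[1.0, 0.0], [0.0, 1.0]])              # 2 queries
k = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 keys
mask = np.array([[1.0, 1.0, 0.0]])                  # last key is padding

weights = masked_attention_weights(q, k, mask)

# The buggy variant: multiplying by mask hides the real tokens instead.
wrong = softmax(q @ k.T / np.sqrt(2) + mask * -1.0e9)

print(weights)  # last column is 0, each row sums to 1
print(wrong)    # all the weight lands on the padding column
```

With the wrong sign, the softmax puts all of its weight on exactly the positions that were supposed to be ignored, which is why the test complains about the masked weights.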

mcsmrx · October 15, 2021, 3:41am

Yes, it works, thanks!

mcsmrx · October 16, 2021, 6:06pm

Thanks, that was helpful and I finished the whole assignment after that.

fabio.borges · February 6, 2022, 11:58pm

Hi,
I understand conceptually what the Q, K and V matrices are; however, at this point in the encoder code:

# calculate self-attention using mha(~1 line). Dropout will be applied during training

The call requires the matrices q, k and v as arguments, but it’s not clear where those matrices come from. Since the comment says “~1 line”, I would expect them to be readily available, but where are they? Honestly, this is not a course on TensorFlow/Keras; there are other courses for those topics. Although I have consulted TF’s docs several times in this specialization, I don’t think I should be required to dig deep into TF to finish the assignments. I should just use it as an aid.
So I would appreciate it if someone could please tell me where the heck those three matrices come from, so I can pass them to the mha() call. Again, this is not a conceptual question. It’s more about the “technicalities” of the implementation, which I should not be required to know.

Thanks a lot!

Mhmemeth · March 30, 2022, 6:10pm

Likewise. Thanks for the help!

Thomas_Vermaelen · April 6, 2022, 12:47pm

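An mha object has already been instantiated within the EncoderLayer class, in “def __init__(…)”. It does not hold the q, k and v matrices itself; you supply them when you call it.

Within the call() method, you must call mha using self.mha(), and pass in the input “x” for all three (this is self-attention), along with the mask if call() receives one. I believe it looks like this: self.mha(x, x, x, mask)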
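
To make the “where do q, k and v come from” point concrete, here is a small NumPy sketch. TinySelfAttention is a hypothetical single-head stand-in written for this post, not the Keras layer and not the assignment’s code. The argument order (query, value, key) mirrors how I understand tf.keras.layers.MultiHeadAttention to be called, but treat that as an assumption and check the docs.

```python
import numpy as np

class TinySelfAttention:
    """Hypothetical single-head attention layer, for illustration only."""

    def __init__(self, d_model, seed=0):
        # The layer owns the projection weights; it does not store q, k or v.
        rng = np.random.default_rng(seed)
        self.w_q = rng.normal(size=(d_model, d_model))
        self.w_k = rng.normal(size=(d_model, d_model))
        self.w_v = rng.normal(size=(d_model, d_model))

    def __call__(self, query, value, key):
        # q, k and v are computed here, from whatever inputs the caller passes.
        q = query @ self.w_q
        k = key @ self.w_k
        v = value @ self.w_v
        logits = q @ k.T / np.sqrt(k.shape[-1])
        logits = logits - logits.max(axis=-1, keepdims=True)
        weights = np.exp(logits)
        weights = weights / weights.sum(axis=-1, keepdims=True)
        return weights @ v

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 4))   # 3 tokens, d_model = 4

mha = TinySelfAttention(d_model=4)
out = mha(x, x, x)            # self-attention: the same x is query, value and key
print(out.shape)
```

So in an encoder layer there are no separate q, k and v tensors lying around: the one input x plays all three roles, and the layer’s own weights turn it into the three matrices internally.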

anshbhatnagar007 · April 19, 2022, 2:28am

Yes, this assignment is really way out of scope: we have only just started to learn TensorFlow, and then the concept of OOP is added on top of that. The videos also did not help much with the assignment.

Yuchen_Zheng · June 30, 2022, 11:42am

Hi everyone, has anyone run into the problem below?
Cell #16. Can’t compile the student’s code. Error: AssertionError(‘Wrong type. Output must be a tensor’)

This happens at the last cell in Exercise 3 - scaled_dot_product_attention.
But I also get this error:
NameError: name ‘scaled_dot_product_attention_test’ is not defined

Could anyone help to solve this problem?
Thanks a lot!

balaji.ambresh · June 30, 2022, 2:38pm

scaled_dot_product_attention_test is defined in the file public_tests.py. It’s possible that you didn’t run a cell / accidentally deleted the import.

Here’s the cell to run:

from public_tests import *

get_angles_test(get_angles)

# Example
position = 4
d_model = 8
pos_m = np.arange(position)[:, np.newaxis]
dims = np.arange(d_model)[np.newaxis, :]
get_angles(pos_m, dims, d_model)

If the import was missing from the start, please refresh your workspace and try again.
See Refresh your Lab Workspace section here
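
Note that the example cell above also needs get_angles itself to be defined by an earlier cell. As a rough, self-contained sketch (assuming the standard sinusoidal positional-encoding formula from the public TensorFlow Transformer tutorial, which may differ in detail from what the grader expects), the example portion runs like this:

```python
import numpy as np

def get_angles(pos, i, d_model):
    # Assumed formula: pos / 10000^(2 * (i // 2) / d_model).
    # Each pair of dimensions (2j, 2j + 1) shares the same angle rate.
    return pos / np.power(10000, (2 * (i // 2)) / np.float32(d_model))

# Same example values as in the cell above
position = 4
d_model = 8
pos_m = np.arange(position)[:, np.newaxis]   # shape (4, 1)
dims = np.arange(d_model)[np.newaxis, :]     # shape (1, 8)

angles = get_angles(pos_m, dims, d_model)    # broadcasts to shape (4, 8)
print(angles.shape)
```

If this runs but the test call still raises a NameError, the missing piece is the import from public_tests, not your function.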
