C4W2_Assignment NLP Transformer Summariser Issues

jondoff · January 8, 2024, 7:16pm

Hi there,

I’m going through the C4W2_Assignment notebook. I took this course before and re-enrolled because of the new TensorFlow code, which I’m interested in.

I’m running the notebook and my code runs. There’s an issue, though: my weights differ slightly from the ones the unit tests expect. They vary ever so slightly, but they are different.

Then I’m training the model but it’s not really learning. It predicts an SOS followed by an EOS token regardless of how long I train it.

My guesses are:

I might have an issue with my decoder layer implementation (or full decoder implementation).
There might be an issue with the inference function - around masking maybe.

I’ve really gone through it over and over and I can’t find the issue though. What’s the best way of getting a little help here?

Thanks so much!

Jonathan

jyadav202 · January 9, 2024, 4:15am

Hi!
Let’s start from the point where your code starts failing unit tests and you see a difference between your output and the expected output.
Also, what do you mean by your model “not really learning”? It is possible to get an SOS token (although unlikely) after EOS. That is why a breaking condition in summarize(model, input_document) is to check for EOS.
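
For readers following along, here is a minimal sketch of the kind of EOS stopping condition mentioned above. It is not the assignment's summarize implementation: the token ids, `greedy_decode`, and the stand-in `next_token_fn` callables are all hypothetical, chosen only to show the loop structure.

```python
# Hypothetical token ids; the real tokenizer in the assignment assigns its own.
SOS_ID, EOS_ID = 1, 2


def greedy_decode(next_token_fn, max_len=50):
    """Generate tokens one at a time, stopping at EOS or after max_len steps.

    next_token_fn is a stand-in for the model: it takes the tokens
    generated so far and returns the id of the next token.
    """
    output = [SOS_ID]
    for _ in range(max_len):
        token = next_token_fn(output)
        output.append(token)
        if token == EOS_ID:  # the breaking condition
            break
    return output


def always_eos(tokens):
    """Stand-in 'model' that predicts EOS immediately."""
    return EOS_ID


def scripted(tokens):
    """Stand-in 'model' that emits three made-up tokens, then EOS."""
    script = [7, 8, 9, EOS_ID]
    return script[len(tokens) - 1]


print(greedy_decode(always_eos))  # [1, 2]
print(greedy_decode(scripted))    # [1, 7, 8, 9, 2]
```

With a loop like this, a model that always ranks EOS highest right after SOS produces exactly the two-token output `[SOS] [EOS]`, however long decoding is allowed to run.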

jondoff · January 9, 2024, 3:05pm

The cell that checks the weights outputs slightly different weights from the ones the unit test expects (~ 0.001 difference in each weight), but I’m thinking that is significant in this case. Also, the cell that checks the ‘next_word’ function outputs a token and a word which are different from Predicted token: [[14859]] Predicted word: masses

What I mean by the model not learning is that during training the model predicts SOS followed immediately by EOS, and this doesn’t change throughout all the epochs I train. It means there’s something wrong with my implementation, of course.

Any way I can share my notebook with you?

Thanks so much for taking the time to reply.

Jonathan

jyadav202 · January 10, 2024, 5:54am

A slight variance in weights is acceptable. So an output like Predicted token: [[8410]] Predicted word: valentin will be fine too, since the model is not trained at this point yet.
Please share your output where it gives SOS after EOS during training. We will try to catch the error on this thread as much as possible first, so that others can learn too.
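
As an aside on the weight comparison discussed in this thread, one way to see how large a gap is relative to a chosen tolerance is NumPy's `allclose`. This is only a sketch: the arrays below are made-up values, not the assignment's expected weights, and the tolerances are arbitrary choices for illustration.

```python
import numpy as np

# Made-up numbers for illustration; not taken from the assignment.
expected = np.array([0.2629, 0.3821, 0.3550])
actual = expected + 0.001  # simulate a ~0.001 offset in every weight

# Largest absolute gap between the two arrays.
max_diff = np.max(np.abs(actual - expected))
print(f"max abs difference: {max_diff:.4f}")

# A strict tolerance flags the offset; a loose one accepts it.
print(np.allclose(actual, expected, rtol=0, atol=1e-5))  # False
print(np.allclose(actual, expected, rtol=0, atol=5e-3))  # True
```

Whether a given gap matters depends on the tolerance the unit test itself uses, which this sketch does not know.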

jondoff · January 10, 2024, 1:55pm

Epoch 1, Loss 7.897331
Time taken for one epoch: 247.514554977417 sec
Example summarization on the test set:
True summarization:
[SOS] hannah needs betty’s number but amanda doesn’t have it. she needs to contact larry. [EOS]
Predicted summarization:
[SOS] [EOS]

Epoch 2, Loss 6.726731
Time taken for one epoch: 237.40499997138977 sec
Example summarization on the test set:
True summarization:
[SOS] hannah needs betty’s number but amanda doesn’t have it. she needs to contact larry. [EOS]
Predicted summarization:
[SOS] [EOS]

Epoch 3, Loss 6.615131
Time taken for one epoch: 236.82289576530457 sec
Example summarization on the test set:
True summarization:
[SOS] hannah needs betty’s number but amanda doesn’t have it. she needs to contact larry. [EOS]
Predicted summarization:
[SOS] [EOS]

Epoch 4, Batch 123/231

jyadav202 · January 11, 2024, 9:08am

What was the output like for the 20th epoch?
Also, something tells me that you might be on an older version of this assignment. Can you please refresh it and try the latest version? You can do that by going to the Help option.

jondoff · January 11, 2024, 10:20am

Ok, I fixed it now. I made a mistake in that cell where the weights output was slightly different, but now it’s fixed and works. Thanks for the time
