C4W2_Assignment: Passed with 100, but Model Returns Only '[SOS]' from Exercise 5 Onward

Hello everyone,

I’ve completed the final assignment for this week (finally, whoop! :blush:). All the tests passed, and I finished the assessment with a score of 100.

However, starting from Exercise 5 – next_word, the model only returns the token [SOS], and this issue persists for the rest of the exercises. The outputs for all the exercises before Exercise 5 were as expected, so the problem seems to begin at this point.

Here’s what the output should look like in Exercise 5:

Predicted token: [[14859]]  
Predicted word: masses  

But instead, I get:

Predicted token: []  
Predicted word:  
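For reference, my understanding is that next_word should perform a greedy step roughly like the sketch below. This is a simplified stand-in, not the notebook's actual Transformer call (the real function also builds the masks); fake_transformer and the dummy ids are made up here just to show the expected shapes:

import tensorflow as tf

def fake_transformer(encoder_input, decoder_input, training=False):
    # Stand-in for the notebook's Transformer: random logits of shape
    # (batch, seq_len, vocab_size), with an assumed vocab size of 350.
    batch, seq_len = decoder_input.shape[0], decoder_input.shape[1]
    return tf.random.uniform((batch, seq_len, 350))

def next_word_sketch(model, encoder_input, output):
    logits = model(encoder_input, output, training=False)  # (batch, seq_len, vocab_size)
    last_logits = logits[:, -1:, :]                        # keep ONLY the last position
    predicted_id = tf.argmax(last_logits, axis=-1)         # greedy choice, shape (batch, 1)
    return tf.cast(predicted_id, tf.int32)

# tiny smoke test with dummy ids
enc = tf.constant([[5, 6, 7]])
dec = tf.constant([[1]])                                   # just the [SOS] id
print(next_word_sketch(fake_transformer, enc, dec))        # a 1x1 tensor, e.g. [[123]], never []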

Additionally, when I run this test:

training_set_example = 0

# Check a summary of a document from the training set
print('Training set example:')
print(document[training_set_example])
print('\nHuman written summary:')
print(summary[training_set_example])
print('\nModel written summary:')
summarize(transformer, document[training_set_example])

Here’s what I get:

Training set example:
[SOS] amanda: i baked  cookies. do you want some?  jerry: sure!  amanda: i'll bring you tomorrow :-) [EOS]

Human written summary:
[SOS] amanda baked cookies and will bring jerry some tomorrow. [EOS]

Model written summary:

'[SOS]'
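For context, my understanding is that summarize is a greedy decode loop along the lines of the sketch below (a simplified stand-in with made-up helpers such as fake_next_word, not the notebook code), which is why an empty or stuck prediction from next_word leaves the result at just '[SOS]':

import tensorflow as tf

VOCAB_SIZE = 350
SOS_ID, EOS_ID = 1, 2

def fake_next_word(encoder_input, output):
    # Stand-in for the notebook's next_word; returns a random token id.
    return tf.random.uniform((1, 1), minval=3, maxval=VOCAB_SIZE, dtype=tf.int32)

def summarize_sketch(document_ids, max_len=10):
    encoder_input = tf.expand_dims(document_ids, 0)
    output = tf.constant([[SOS_ID]])                 # decoding starts from [SOS]
    for _ in range(max_len):
        predicted_id = fake_next_word(encoder_input, output)
        if tf.size(predicted_id) == 0:               # empty prediction: nothing to append,
            break                                    # so the result stays at '[SOS]'
        if int(predicted_id[0, 0]) == EOS_ID:        # stop once [EOS] is produced
            break
        output = tf.concat([output, predicted_id], axis=-1)
    return output                                    # token ids; the notebook would detokenize

print(summarize_sketch(tf.constant([5, 6, 7])))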

Additionally, when I train the model using this loop:

# Training loop
for epoch in range(epochs):
    ...
    print(f'Predicted summarization: {summarize(transformer, true_document)}')

The true summarization remains correct, but the predicted summarization is always:

[SOS]

This persists across all epochs, as shown below:

  • Epoch 1 to Epoch 20: The predicted summarization remains [SOS], despite the loss steadily decreasing.

While I can proceed to the next week, I've put in so much effort that I'd really like to see a fully functioning transformer model.

I’d appreciate any guidance or suggestions on how to resolve this issue.

Thank you!

hi @eliya

I understand that whoop :joy:

By any chance, in your next_word function, did you mix up local variables with global variables?
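For example (made-up names, not the notebook's code), a next_word that reads a notebook-level global instead of its own parameter keeps predicting from the same stale context:

import tensorflow as tf

output = tf.constant([[1]])   # a global left over from an earlier notebook cell ([SOS] only)

def next_word_buggy(model, encoder_input, dec_output):
    # BUG: uses the global 'output' above and ignores the growing 'dec_output'
    # passed in by summarize, so every step predicts from the same [SOS]-only context.
    logits = model(encoder_input, output, training=False)
    return tf.argmax(logits[:, -1:, :], axis=-1)

def next_word_fixed(model, encoder_input, dec_output):
    # Correct: only use the parameters that were passed in.
    logits = model(encoder_input, dec_output, training=False)
    return tf.argmax(logits[:, -1:, :], axis=-1)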

See if the below comment thread helps you

Regards
DP