I verified with print statements after the generate_next_token call in Exercise 5 that next_token is indeed never eos_id, so done is never (or only very rarely) set to True.
Note that this issue/problem (not exactly an error) was not caught by either the unit tests or the grader, which gave me 100%. Any idea what’s going on?
My best guess is that some problem with either the training data or my model implementation is preventing the model from seeing the EOS token during training. In other words, I suspect the EOS token is missing somewhere it should be, or present somewhere it shouldn't be.
Ensure that the EOS token is correctly included in your training sequences. Each sequence should have an EOS token at the end.
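To sketch what I mean (the names `token_to_id` and `eos_id` are placeholders, not the actual variables from the assignment):

```python
def encode_with_eos(sentence, token_to_id, eos_id):
    """Tokenize a sentence and append the EOS id, so the model sees
    end-of-sequence during training and can learn to predict it."""
    ids = [token_to_id[w] for w in sentence.split()]
    return ids + [eos_id]
```

If `eos_id` never appears in your training targets, the model has no way to learn when to stop.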
Also, this issue can happen when your model only ever chooses the most probable token at each step (greedy decoding). To rule that out, double-check the logic in your generate_next_token function and how you append tokens to the generated sequence.
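For reference, the stopping logic usually looks something like the following minimal sketch; `next_token_fn` stands in for the model call plus argmax, and `eos_id` is an assumed variable, not the course's actual code:

```python
def greedy_decode(next_token_fn, start_ids, eos_id, max_length=50):
    """Repeatedly pick the most probable next token, stopping when
    EOS is produced or max_length steps have been taken."""
    generated = list(start_ids)
    done = False
    for _ in range(max_length):
        next_token = next_token_fn(generated)  # stand-in for model + argmax
        generated.append(next_token)
        if next_token == eos_id:
            done = True  # the flag that reportedly never fires
            break
    return generated, done
```

If `done` stays False here, the model simply never assigned EOS the highest probability, which points back at training rather than at the loop itself.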
Hope it helps! Feel free to ask if you need further assistance.
I just used the training data as provided, because it’s not suggested anywhere that we modify it at all. Would that be the issue?
“Also, this issue happens when your model only chooses the most probable ones”: in general, sure, but that is clearly not the issue here. There is no way that, in a properly trained model, the most likely token after “eu adoro idiomas” (translating “I love languages”) is anything other than the EOS token, even with greedy selection and zero temperature. It is certainly not “eu”, as in the first screenshot (“I love languages I”).
Also, the generate_next_token function is provided and not editable.
I don’t have access to the course notebooks, but adding <s> (Start of Sentence) and </s> (End of Sentence) tokens could help standardize the training data. Then, the model properly learns where sentences begin and end, reducing the likelihood of inappropriate token predictions.
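Something along these lines; the marker strings are illustrative, and the course tokenizer may reserve different symbols or ids:

```python
def add_sentence_markers(sentence, sos="<s>", eos="</s>"):
    """Wrap a raw sentence with start/end markers before tokenization,
    so sentence boundaries become visible to the model."""
    return f"{sos} {sentence.strip()} {eos}"
```
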
For more specific assistance, kindly share your code with me in Private Messages!
When you convert the original string into a tensor, you are supposed to convert `texts`, not `text`, since the instructions mention the original string.
The same goes for the next line of code, where you vectorise: you need to use `texts` there as well.
Remember, in both of those lines you are currently using `text`, which the docstring describes as text (string): The sentence to translate, but the instructions refer to the original string, hence `texts`.
Hi, I have encountered the same problem. Struggling to correct…
By “convert the texts and not text to a tensor”, do you mean at the beginning of the translate function definition? I don’t see how that is any different from the case where every “texts” is written as “text”.
I also don’t understand why “for i in range(max_length)” would not work, since the name “token” is not used inside the loop and technically I should be able to choose any label for the iteration variable, such as “_”.
I changed them anyway, and the result is still missing eos_ids. TUT
@VictorAildom, I did not understand @Deepti_Prasad’s comments either. I eventually gave up and moved on since neither the unit tests nor the grader were noticing the problem.
I have also come across a similar issue: the translator from Exercise 5 seems to fail to recognise the EOS token. I have tried debugging in multiple ways, having gone through the suggestions from different posts in this forum, but I still can’t get my head round why the generated output keeps repeating itself.
(My work passed all the unit tests and the grader returns with 100% as well.)