I am currently working on Exercise 6 of the assignment for Week 1. For implementing the next_symbol
function, I see that the output tokens are padded so that the length of the list is a power of 2. Could you please explain the purpose of this padding? What would happen if we did not pad the list? It was my understanding that the attention layer works with arbitrary sequence lengths.
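For reference, here is a minimal sketch of what I understand the padding step to be doing; `pad_to_power_of_2` and `pad_id` are my own names for illustration, not from the assignment code:

```python
import numpy as np

def pad_to_power_of_2(tokens, pad_id=0):
    """Pad a list of token ids so its length becomes the next power of 2."""
    # Smallest power of 2 that is >= len(tokens) (and at least 1).
    padded_len = 2 ** int(np.ceil(np.log2(max(len(tokens), 1))))
    return tokens + [pad_id] * (padded_len - len(tokens))

# Example: a 5-token output is padded to length 8.
print(pad_to_power_of_2([12, 7, 33, 4, 9]))  # [12, 7, 33, 4, 9, 0, 0, 0]
```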
After more thought, I realized that I was wrong: in practice, the sequence length fed to the attention layer has to be fixed. However, my question now is: how is the model trained on batches with different sequence lengths (which result from the bucketing)?
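To make the question concrete, this is a toy sketch of how I picture bucketing producing batches with different lengths (again, all names and the boundary values are my own, not the assignment's pipeline):

```python
import numpy as np

bucket_boundaries = [8, 16, 32]

def bucket_of(seq):
    # Smallest boundary that fits the sequence.
    for b in bucket_boundaries:
        if len(seq) <= b:
            return b
    return bucket_boundaries[-1]

def pad_batch(batch, pad_id=0):
    # Sequences in the same batch share a bucket, so they pad to one length.
    length = bucket_of(max(batch, key=len))
    return np.array([seq + [pad_id] * (length - len(seq)) for seq in batch])

short_batch = pad_batch([[5, 2, 9], [1, 4]])   # shape (2, 8)
long_batch = pad_batch([[7] * 12, [3] * 10])   # shape (2, 16)
print(short_batch.shape, long_batch.shape)
```

So within each batch the length is fixed, but different batches end up with different lengths. How does training with a single model handle that variation across batches?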