Run the following cell to create your model and check its summary.

- Because all sentences in the dataset are less than 10 words, `max_len = 10` was chosen.
- You should see that your architecture uses 20,223,927 parameters, of which 20,000,050 (the word embeddings) are non-trainable, with the remaining 223,877 being trainable.
- Because your vocabulary has 400,001 words (with valid indices from 0 to 400,000), there are 400,001 * 50 = 20,000,050 non-trainable parameters.
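The arithmetic behind these counts can be checked with a few lines of plain Python. This is only a sanity-check sketch using the numbers quoted above; the variable names are made up for illustration and are not taken from the assignment code.

```python
# Numbers quoted in the model summary above
vocab_size = 400_001        # valid indices 0..400,000
emb_dim = 50                # each word vector has 50 dimensions
total_params = 20_223_927   # total reported by the summary

# The embedding layer stores one emb_dim-sized vector per vocabulary index
embedding_params = vocab_size * emb_dim

# Everything that is not part of the frozen embedding matrix is trainable
trainable_params = total_params - embedding_params

print(f"{embedding_params:,}")  # 20,000,050
print(f"{trainable_params:,}")  # 223,877
```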

Hi @krithika_govindaraj

The model uses word embeddings with a vocabulary of 400,001 words, each represented by a 50-dimensional vector, which gives 20,000,050 non-trainable parameters for the embedding layer (400,001 * 50). The total number of parameters in the model is 20,223,927, so the remaining 223,877 are trainable; these are the weights and biases (W, b) of the other layers in the model. `max_len = 10` was chosen because the dataset sentences have a maximum length of 10 words. In this setup, the embeddings are fixed and not updated during training, so training only updates the remaining layers.
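To see where the 223,877 trainable parameters could come from, here is a rough sketch. Note the assumption: the thread does not list the layers, so I am assuming two stacked LSTM layers with 128 units each followed by a 5-class dense layer, and using the standard parameter-count formulas for an LSTM (with bias, no peepholes) and a dense layer. The helper names `lstm_params` and `dense_params` are hypothetical, just for this illustration.

```python
def lstm_params(input_dim: int, units: int) -> int:
    # 4 gates, each with input weights, recurrent weights, and a bias vector
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, units: int) -> int:
    # One weight per input/output pair plus one bias per output unit
    return input_dim * units + units


emb_dim = 50       # from the thread: 50-dimensional word vectors
lstm_units = 128   # ASSUMPTION: not stated in the thread
num_classes = 5    # ASSUMPTION: not stated in the thread

lstm_1 = lstm_params(emb_dim, lstm_units)      # first LSTM reads the embeddings
lstm_2 = lstm_params(lstm_units, lstm_units)   # second LSTM reads the first one
dense = dense_params(lstm_units, num_classes)  # final classification layer
trainable = lstm_1 + lstm_2 + dense

print(lstm_1, lstm_2, dense, trainable)  # 91648 131584 645 223877
```

Under those assumed layer sizes the sum matches the 223,877 in the summary, but please compare against the per-layer rows of your own `model.summary()` output rather than relying on this.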

Hope this helps. Feel free to ask if you need further assistance!
