W2 Assignment clarification - "vocab_size"

My understanding is that we are building a character prediction model, not a word prediction model. Yet some of the text in the W2 assignment seems to apply to word prediction models. E.g., the excerpt below references vocabulary size. For this assignment, would it be correct to replace that with the number of possible characters, which I believe is 256?

  • tl.Embedding: Initializes the embedding. In this case it is the size of the vocabulary by the dimension of the model.
  • tl.Embedding(vocab_size, d_feature).
  • vocab_size is the number of unique words in the given vocabulary.
  • d_feature is the number of elements in the word embedding (some choices for a word embedding size range from 150 to 300, for example).

def GRULM(vocab_size=256, d_model=512, n_layers=2, mode='train'):
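One way to see why vocab_size=256 is enough for a character model: if each character of the input is mapped to its byte value, every index is guaranteed to fall in [0, 256), so an embedding table with 256 rows covers the whole "vocabulary". A minimal plain-Python sketch (line_to_tensor here is an illustrative stand-in, not necessarily the assignment's exact implementation):

```python
def line_to_tensor(line):
    # Encode the line as UTF-8 bytes; each byte is an int in [0, 256),
    # so it can index directly into an Embedding of vocab_size=256.
    return list(line.encode("utf-8"))

indices = line_to_tensor("Thank you")
print(indices)              # one integer per byte, e.g. 84 for 'T'
print(max(indices) < 256)   # always True for byte values
```

So for this assignment, vocab_size plays the role of "number of possible byte values" rather than "number of unique words".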

@John8 we are using characters to predict what comes next in the sequence. After “Thank” the next word is most likely going to be “you”.

Thanks for the answer. I need to go back and review the lab with that context in mind.

Hi @John8

Yes, you are correct, it’s a mistake:

…is the number of unique characters (not words) in the given vocabulary.

(probably copied from the documentation).

In Course 3 Week 2 we are predicting characters, so if you have “Thank”, the next most probable character would be " " (space). Then you feed "Thank " back in, which should produce “Thank y”, and so on.
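That feed-back loop can be sketched in a few lines of plain Python. The next_char argument stands in for the trained model; toy_next_char below is a hypothetical placeholder that just completes "Thank you" deterministically, not the assignment's API:

```python
def generate(prefix, next_char, steps):
    # Repeatedly ask the "model" for the next character and append it,
    # feeding the extended string back in: "Thank" -> "Thank " -> "Thank y" ...
    text = prefix
    for _ in range(steps):
        text += next_char(text)
    return text

# Toy stand-in model: always continues toward "Thank you".
TARGET = "Thank you"
def toy_next_char(text):
    return TARGET[len(text)] if len(text) < len(TARGET) else "."

print(generate("Thank", toy_next_char, 4))  # "Thank you"
```

With the real GRULM you would instead sample (or take the argmax of) the model's output distribution over the 256 character indices at each step.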


Okay, thank you. That clarifies things for me.