Dinosaurus_Island_Character_level_language_model - sample()

You’re right that \hat{y}^{<t>} has as many entries as your vocabulary, which consists of characters in this case. The second dimension is 1, which makes it a column vector. We keep that explicit second dimension because we frequently want to store multiple such vectors in one matrix, by “stacking” them as the columns of that matrix.
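To make the “stacking” idea concrete, here is a minimal sketch. The three uniform vectors and the names y_columns and Y are made up for illustration and are not from the assignment; the only value taken from this thread is the vocabulary size of 27.

```python
import numpy as np

vocab_size = 27  # vocabulary size printed in this thread

# Three hypothetical outputs, each a (vocab_size, 1) column vector.
# Uniform probabilities are used only to keep the example simple.
y_columns = [np.full((vocab_size, 1), 1.0 / vocab_size) for _ in range(3)]

# Stacking side by side gives a matrix with one column per vector.
Y = np.hstack(y_columns)
print(Y.shape)  # (27, 3)
```

If the vectors were 1-D arrays of shape (27,) instead, np.hstack would concatenate them end to end into a single array of shape (81,), which is one reason to keep the explicit second dimension.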

I added some print statements to my sample() function to print the various shapes and here’s what I see:

vocab_size = 27
Wya (27, 100) x (100, 1) + by (27, 1)
y.shape (27, 1)
len(y) 27
len(y.ravel()) 27
type(y.ravel()) <class 'numpy.ndarray'>
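
If you want to reproduce those shapes outside the notebook, something like the sketch below should do it. It is not the assignment code: the weights are random, the names n_a and a (for the hidden state) are my own assumptions, and the softmax is written inline. The last line shows why ravel() is handy, since np.random.choice expects its p argument to be 1-D.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, n_a = 27, 100  # sizes taken from the shapes printed above

# Random stand-ins for the trained parameters and hidden state (assumed names).
Wya = rng.standard_normal((vocab_size, n_a))
by = np.zeros((vocab_size, 1))
a = rng.standard_normal((n_a, 1))

# (27, 100) @ (100, 1) + (27, 1) -> (27, 1), then a numerically stable softmax.
z = Wya @ a + by
e = np.exp(z - z.max())
y = e / e.sum()

print("y.shape", y.shape)                  # (27, 1)
print("y.ravel().shape", y.ravel().shape)  # (27,)

# Sample a character index; p must be a 1-D array of probabilities.
idx = np.random.choice(vocab_size, p=y.ravel())
```

Note that len(y) and len(y.ravel()) both give 27 because len() only reports the first dimension, so the shapes are the more informative thing to print.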