Dino Language - Random Choice using char_to_ix

Hello All,

While I was able to work through the assignment, I’m not sure what this note in it actually means:

Also notice, while in your implementation, the first argument to np.random.choice is just an ordered list [0,1,…, vocab_len-1], it is not appropriate to use char_to_ix.values(). The order of values returned by a Python dictionary .values() call will be the same order as they are added to the dictionary. The grader may have a different order when it runs your routine than when you run it in your notebook.

What do they mean by order here? Also why wouldn’t it work?

Cheers

An example of an ordered list is if you list the integers between 1 and 10 in sequential order:
1, 2, 3, … 9, 10.

An un-ordered list would be if you listed the same numbers but in a random order.

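The np.random.choice() function pairs its first argument with the probability vector p by position: p[i] is the probability of drawing the i-th element of the first argument. So the values have to be listed in the same order as the probabilities, which here means the indices 0 through vocab_len-1 in sequence. If they are listed in any other order, the probabilities get attached to the wrong characters.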
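Here is a small sketch of what goes wrong when the first argument is out of order. The three-character vocabulary and the probability vector y are made up for illustration (the assignment's vocabulary is larger), and the dictionary is deliberately filled out of index order.

```python
import numpy as np

# Hypothetical toy vocabulary, inserted out of index order on purpose.
char_to_ix = {"b": 1, "a": 0, "c": 2}
vocab_len = len(char_to_ix)

# Stand-in for a softmax output: y[i] is the probability of index i.
y = np.array([0.7, 0.2, 0.1])

np.random.seed(0)

# Safe: element i of the first argument is index i, so it pairs with y[i].
safe = np.random.choice(range(vocab_len), size=10_000, p=y)

# Risky: .values() follows insertion order, here [1, 0, 2],
# so y[0] = 0.7 ends up attached to index 1 instead of index 0.
risky = np.random.choice(list(char_to_ix.values()), size=10_000, p=y)

# Empirical frequency of each index in the two samples.
safe_freq = np.bincount(safe, minlength=vocab_len) / safe.size
risky_freq = np.bincount(risky, minlength=vocab_len) / risky.size

print("safe :", safe_freq)   # close to y: index 0 is drawn most often
print("risky:", risky_freq)  # index 1 is drawn most often instead
```

Both calls run without any error, which is why the mismatch is easy to miss.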


Noted. So the motivation is that this could be an issue if the vocabulary sequence doesn’t match the one-hot sequence in X/Y?

Yes. If the vocabulary is built up by observing the data set, then the values could appear in any order.
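As a rough illustration, assuming the vocabulary is collected from the data with a set: the two dictionaries below hold the same character-to-index mapping, but the second one is filled in set iteration order, which for strings can differ from one Python process to the next. The data string and variable names are hypothetical stand-ins, not the assignment's actual code.

```python
# Hypothetical stand-in for the training text.
data = "trex\nraptor\n"

# Sorting the characters first gives a deterministic vocabulary.
chars = sorted(set(data))
vocab_len = len(chars)
char_to_ix = {ch: i for i, ch in enumerate(chars)}

# Same mapping, but keys are inserted in set iteration order,
# which is arbitrary and may change between interpreter runs.
same_mapping = {ch: char_to_ix[ch] for ch in set(data)}

# Dictionary equality ignores insertion order, so these compare equal ...
print(same_mapping == char_to_ix)

# ... but .values() follows insertion order, so only the first list is
# guaranteed to match range(vocab_len).
print(list(char_to_ix.values()))
print(list(same_mapping.values()))
print(list(range(vocab_len)))
```

Passing range(vocab_len) (or np.arange(vocab_len)) as the first argument does not depend on how the dictionary was built.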


All good points! The one other thing worth mentioning is that any time you’re doing random sampling, you would not (in general) expect reproducible results. We only need to worry about reproducibility here for the convenience of grading: correctness is checked by comparing your output against “expected values”, and that comparison only works if the sampling produces the same draws every time. That’s why we always set the random seeds to specific values in the assignments. You wouldn’t do that in a “real world” case …
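A minimal sketch of the seeding point, using a made-up probability vector and an arbitrary seed value: resetting the seed before each call makes the draws repeat exactly, which is what allows a comparison against fixed expected values.

```python
import numpy as np

# Hypothetical probabilities over three indices.
p = np.array([0.1, 0.6, 0.3])

# Same seed before each call, so both calls produce the same draws.
np.random.seed(3)
first = np.random.choice(range(len(p)), size=5, p=p)

np.random.seed(3)
second = np.random.choice(range(len(p)), size=5, p=p)

print(first)
print(second)
print(np.array_equal(first, second))
```

Without the second np.random.seed call, the second sample would continue from the generator's current state and would generally differ from the first.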
