Why y for the probability distribution

danielwu3 · May 16, 2021, 5:35am

In week 1 dinosaur assignment, the sample() function step 3 comment says:

Step 3: Sample the index of a character within the vocabulary from the probability distribution y

Why is the distribution y the right thing to sample from? I understand that we don’t want the next letter to be the same every time, so we need some distribution to draw from. But why y? I don’t understand.

Kic · May 16, 2021, 3:05pm

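y is the probability distribution over the next character: y[i] is the probability that the next character is the one with vocabulary index i. So that the model does not pick the same (most likely) character every time, the function np.random.choice is used to draw the next index i at random according to y.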
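
As a rough illustration of that sampling step, here is a minimal sketch. The four-character vocabulary, the values in y, and the seed are all made up for the example; in the assignment y would come from the softmax output of the RNN. It compares sampling with np.random.choice against always taking the most likely index.

```python
import numpy as np

# Hypothetical vocabulary and distribution, for illustration only.
vocab = ["a", "b", "c", "\n"]
y = np.array([0.10, 0.60, 0.25, 0.05])  # non-negative, sums to 1

np.random.seed(0)  # seeded only so the sketch is repeatable

# Sampling: index i is drawn with probability y[i].
sampled = np.random.choice(len(vocab), size=10_000, p=y)

# Over many draws, the observed frequencies approach y.
freqs = np.bincount(sampled, minlength=len(vocab)) / sampled.size
print("empirical frequencies:", freqs)

# Greedy alternative: argmax returns the same index on every call,
# so a generator built on it would repeat itself.
greedy = int(np.argmax(y))
print("greedy pick:", repr(vocab[greedy]))
```

With sampling, likely characters are still chosen most often, but less likely ones appear occasionally, which is what makes the generated names vary.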

danielwu3 · May 17, 2021, 2:55am

@Kic Thanks, I got what the instructions say it is. However, y is the softmax of z. I understand that y<t+1> is the prediction of the letter. How can it be a probability too?

Kic · May 17, 2021, 10:56am

Hi @danielwu3,

In the 6th video of Week 1, Language Model and Sequence Generation, Prof. Ng explained that the job of an RNN language model is to predict the probability of a word given the previous word or words. So, for y-hat at a given time step, the output of the softmax function is the probability of each word in the corpus/dictionary given the preceding words, P(y-hat | y1, y2), where y1 and y2 are the preceding words in the sentence.
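
To see why a softmax output can be read as a probability distribution, here is a small sketch. The scores in z are made up, and the softmax helper is written out for the example rather than taken from the assignment. It checks that every entry of y is non-negative and that the entries sum to 1.

```python
import numpy as np

def softmax(z):
    """Map a vector of real-valued scores to a probability distribution."""
    shifted = z - np.max(z)   # subtracting the max avoids overflow in exp
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum()

z = np.array([2.0, -1.0, 0.5, 0.0])  # hypothetical scores, one per vocabulary entry
y = softmax(z)

print("y =", y)
print("all entries non-negative:", bool(np.all(y >= 0)))
print("sum of entries:", y.sum())
```

Because the entries are non-negative and sum to 1, y can serve both as the prediction and as the distribution to sample from.
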
If you go back to an earlier video, where Prof. Ng talked about the ‘Apple and pear salad’ example, you will remember that he gave two sentences that sound the same:

    Apple and pair salad
    Apple and pear salad

How did the model choose which one to pick? By looking at the probability of the softmax output, where the second sentence has a higher probability than the first, so the second sentence was chosen.
You will find revisiting those videos helpful to reinforce these concepts.
