Week 1 Assignment 2 Exercise 2 - sample

Oleksandr_Semenov · December 6, 2021, 8:32pm

Hello there,

I have an issue with this exercise. My code gets only the first 4 indexes right:
[23, 16, 26, 26, 19, 25, 4, 1, 15, 6, 4, 25, 23, 23, 10, 14, 16, 12, 11, 3, 20, 16, 22, 23, 14, 15, 6, 9, 26, 2, 6, 1, 26, 11, 2, 21, 0]
instead of
[23, 16, 26, 26, 24, 3, 21, 1, 7, 24, 15, 3, 25, 20, 6, 13, 10, 8, 20, 12, 2, 0]
It also appears to output more indexes than expected.

Trying to debug this code, I want to verify if my assumptions are correct:

x is a vector of the same size as the dictionary: 27
y is of the shape (27, 100)
each of the 100 entries adds up to 1, which I understand to be a probability distribution for each of 100 characters in the word
the input to np.random.choice is a vector of length 27. If I pass all 100 vectors at once, I get an error that the probabilities do not add up to 1. How do I choose which of the 100 vectors goes to the np.random.choice function? I tried using idx and counter.
In step 4, why do we have x = None and x[idx] = None? Shouldn’t x be set to y? Looking at Figure 3, x<t+1> is y. So why do we want to update a particular element in the vector, and what should it be set to? I tried setting it to idx, but then I only get one index right.

Thank you,
Alex

paulinpaloalto · December 6, 2021, 11:14pm

The shape of y is incorrect: it should be (27, 1). So it would be a good idea to check your calculations there. I added prints to my code to show the shapes of everything and here’s what I get:

vocab_size = 27
Wax (100, 27) x (27, 1) Waa (100, 100) a_prev (100, 1)
Wya (27, 100) x (100, 1) + by (27, 1)
y.shape (27, 1)
len(y) 27
len(y.ravel()) 27
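
For readers who want to reproduce this kind of shape check, here is a minimal sketch of one forward step. It assumes the parameter names from the printout above (Wax, Waa, Wya, by) plus a hidden bias b, random placeholder weights, and a hand-written softmax helper. None of it is the notebook's own code.

```python
import numpy as np

def softmax(z):
    # Column-wise softmax, shifted by the max for numerical stability.
    e = np.exp(z - np.max(z))
    return e / e.sum(axis=0)

vocab_size, n_a = 27, 100
rng = np.random.default_rng(0)

# Placeholder parameters (assumed shapes, random values).
Wax = rng.standard_normal((n_a, vocab_size))
Waa = rng.standard_normal((n_a, n_a))
Wya = rng.standard_normal((vocab_size, n_a))
b = np.zeros((n_a, 1))
by = np.zeros((vocab_size, 1))

# Inputs are column vectors, not 1D arrays.
x = np.zeros((vocab_size, 1))
a_prev = np.zeros((n_a, 1))

a = np.tanh(Wax @ x + Waa @ a_prev + b)  # (100, 1)
z = Wya @ a + by                         # (27, 1)
y = softmax(z)                           # (27, 1), sums to 1

print("a", a.shape, "z", z.shape, "y", y.shape)
```

If y comes out as (27, 100) here, one of the inputs is most likely a 1D array that broadcast against a column vector.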

Oleksandr_Semenov · December 7, 2021, 2:45pm

Thank you @paulinpaloalto! I fixed that issue and now y is (27,1). However, it didn’t fix the output: I still get only the first 4 indexes right. I wonder if that’s because I update x incorrectly. My understanding is that x should be set to y (x = y). I don’t understand why we need to complete x[idx].

paulinpaloalto · December 7, 2021, 4:23pm

It is the array of indices that is the actual answer. At each iteration, we take the input x (all zeros for the first iteration) and create the x for the next time step. That next x is not directly equal to y; it is built from a random choice based on y, which makes the output more interesting (less predictable). The other point about setting x[idx] = 1 is just that x is formatted as a “one hot” vector, right?

Oleksandr_Semenov · December 7, 2021, 4:46pm

I set x to a zero vector of shape (27,1) and x[idx] to 1, and it worked for me. Thank you! My understanding is that x, in one hot encoding form, carries the index of the symbol predicted in the previous step over to the next step. Thus setting x[idx] to 1 tells the next cell that this is the index that was selected in the previous step, and everything else wasn’t, so it is 0.

paulinpaloalto · December 7, 2021, 5:08pm

Yes, that is a good description of how one hot encoding works.
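
As a rough illustration of the update discussed above, the sketch below samples an index from a probability column y and builds the next input as a one hot vector. The y used here is a made-up distribution standing in for the softmax output, so treat the values as placeholders.

```python
import numpy as np

vocab_size = 27
rng = np.random.default_rng(0)

# Stand-in for the softmax output: a (27, 1) column that sums to 1.
y = rng.random((vocab_size, 1))
y = y / y.sum()

# Sample one index according to y; p must be 1D, hence ravel().
idx = np.random.choice(range(vocab_size), p=y.ravel())

# The next input is NOT y itself: it is a one hot column vector
# with a single 1 at the sampled index and 0 everywhere else.
x = np.zeros((vocab_size, 1))
x[idx] = 1

print(idx, x.shape, int(x.sum()))
```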

Roger_Krimstock · December 7, 2021, 7:29pm

I have a similar problem, sampletest() finds that sample() produces different values than expected:
list of sampled indices:
[17, 13, 26, 23, 24, 19, 7, 17, 7, 17, 15, 3, 26, 8, 18, 18, 24, 1, 17, 14, 11, 10, 21, 22, 0]
list of sampled characters:
[‘q’, ‘m’, ‘z’, ‘w’, ‘x’, ‘s’, ‘g’, ‘q’, ‘g’, ‘q’, ‘o’, ‘c’, ‘z’, ‘h’, ‘r’, ‘r’, ‘x’, ‘a’, ‘q’, ‘n’, ‘k’, ‘j’, ‘u’, ‘v’, ‘\n’]

I noticed that the documentation above the code, just after step 3, mentioned the function ravel(), which I did not use. None of the objects in step 3 (or beyond) seem to need ravel(). I wonder what the documentation has in mind.
Thanks!

paulinpaloalto · December 7, 2021, 7:41pm

You need to make sure that the “probability distribution” argument to np.random.choice is a 1 dimensional object. np.ravel is one way to do that, but there are plenty of others. I tried using np.squeeze instead and the tests pass just fine with that. Note that y is a (27, 1) 2D array. If you just use that directly as p, it throws an error.

So my guess is that the absence of ravel in your solution is not the issue. There must be something else wrong with your logic. Please check everything against the instructions again, including the comments that give directions.
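
A small sketch of the point about the p argument, using a uniform placeholder distribution rather than real model output: passing the (27, 1) array directly fails, while either ravel or np.squeeze gives the 1D array that np.random.choice expects.

```python
import numpy as np

vocab_size = 27
# Placeholder distribution shaped like the softmax output: (27, 1).
y = np.full((vocab_size, 1), 1.0 / vocab_size)

# Passing the 2D column directly as p is rejected.
raised = False
try:
    np.random.choice(range(vocab_size), p=y)
except ValueError:
    raised = True

# Both of these flatten y to shape (27,), which is accepted.
idx_ravel = np.random.choice(range(vocab_size), p=y.ravel())
idx_squeeze = np.random.choice(range(vocab_size), p=np.squeeze(y))

print(raised, y.ravel().shape, np.squeeze(y).shape)
```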

Roger_Krimstock · December 7, 2021, 9:17pm

Thanks! I used squeeze, not ravel, and the function just simply wasn’t working. Finding out that ravel or squeeze works the same, I looked much more closely at the code, and found that I had a spurious application of “tanh” in step 2, the statement setting “z”. It’s working now.

ORIOL_RAVENTOSMORERA · June 23, 2022, 1:57pm

A thing to keep in mind (I keep forgetting) is to reshape the zero vectors, e.g. np.zeros(3).reshape((3,1)).

paulinpaloalto · June 23, 2022, 4:21pm

Note that np.zeros takes a “tuple” as an argument, so you could have achieved the same result more simply this way:

np.zeros((3,1))

Dennis_Sinitsky · January 27, 2024, 7:40pm

Thanks Paul! Keeping track of dimensions is always the hardest part for me, and this was a key hint. And an extra hint to the reader: when initializing arrays, do np.zeros((M,1)), not np.zeros(M). This will save some debugging time!

paulinpaloalto · January 27, 2024, 11:59pm

That’s a great point! The difference between a 2D array and a 1D array is key here.
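
To make the 1D versus 2D difference concrete, here is a short sketch with M = 3 (an arbitrary size chosen for illustration). It shows the two shapes and how mixing them silently broadcasts to a matrix instead of raising an error.

```python
import numpy as np

flat = np.zeros(3)       # 1D array, shape (3,)
col = np.zeros((3, 1))   # 2D column vector, shape (3, 1)

# Same result as np.zeros((3, 1)), just written the long way.
reshaped = np.zeros(3).reshape((3, 1))

# Adding a (3,) array to a (3, 1) array broadcasts to (3, 3),
# which is how a wrong shape can appear without any error message.
mixed = flat + col

print(flat.shape, col.shape, reshaped.shape, mixed.shape)
```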
