Week 1 Dinosaurus Island random.choice

Hello, I’ve tried a million things but can’t seem to figure out the syntax for this. I keep getting “only integer scalar arrays can be converted to a scalar index” or other errors. If I just randomly choose the index, my code runs (but then I get an assertion error on the values and thus can’t even get a nonzero, less-than-perfect grade).

np.random.choice(arg1, arg2)

If arg1 is an array, the function will sample randomly from arg1’s elements.
Since the sampling is taken from the vocabulary, arg1 would have to be a 1D array of length vocab_size.

To create such an array, we could use:

arrange(vocab_size)

Thanks, but when I plug in arrange(vocab_size) for arg1, the interpreter tells me:
name ‘arrange’ is not defined.

Sorry, it should be

range(vocab_size)

OK, thanks. FYI there’s a typo in the instructions here:
Example of how to use np.random.choice() :

np.random.seed(0)
probs = np.array([0.1, 0.0, 0.7, 0.2])
idx = np.random.choice(range(len(probs), p = probs)

There’s a missing ‘)’ in the last line.
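For reference, a corrected version of that snippet with the parenthesis closed might look like the sketch below. The final assert is added here purely for illustration and is not part of the notes.

```python
import numpy as np

np.random.seed(0)
probs = np.array([0.1, 0.0, 0.7, 0.2])

# range(len(probs)) supplies the candidate indices 0..3;
# p gives the probability of drawing each one.
idx = np.random.choice(range(len(probs)), p=probs)

# Index 1 has probability 0.0, so it can never be drawn.
assert idx in (0, 2, 3)
```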

So now, even though my code runs, the checker still reports wrong values. The second argument I am passing to random.choice is y[:,counter]. Is this incorrect?

Thanks again,
Gabe

Hi @Gborden1

The notes for step 3 said:

  • Note that the value that’s set to p should be set to a 1D vector.
  • Also notice that 𝑦̂ ⟨𝑡+1⟩, which is y in the code, is a 2D array.

The probability distribution y (in the code) is a 2D array, so we need to turn it into a 1D vector. To do that, we use:
y.ravel()

So, for arg2, it will be
p=y.ravel()
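As a minimal sketch of what ravel() does here, the snippet below uses a uniform column vector as a stand-in for the real softmax output. The value 27 is the vocabulary size that appears elsewhere in this thread; the uniform y is an assumption made only for illustration.

```python
import numpy as np

vocab_size = 27  # vocabulary size used in this thread

# Stand-in for the softmax output: a column vector of shape (vocab_size, 1).
y = np.full((vocab_size, 1), 1.0 / vocab_size)

p = y.ravel()            # flatten to shape (vocab_size,)
print(y.shape, p.shape)  # (27, 1) (27,)

np.random.seed(0)
idx = np.random.choice(range(vocab_size), p=p)
```

Note that ravel() only gives a vector of length vocab_size when y itself has shape (vocab_size, 1).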

Thanks for alerting us to the typo in the notes.

Thanks, but with that change my code no longer runs. It gives the error: ‘a’ and ‘p’ must have same size

I initially tried to use ravel as the notes said, but my y has shape [27,100]. As I understand it, I need the second argument to have shape [27].

Hi @gborden1,

A y of shape [27,100] is incorrect, so you need to check whether you have followed the formula and are passing the correct parameters when calculating z.
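One way to see how a [27,100] shape can arise is to trace the shapes through a single step. The sketch below is not the assignment's code: the hidden size of 100 is inferred from the reported shape, the names Wya, b and by are assumptions, and a wrongly shaped a_prev is only one possible cause.

```python
import numpy as np

def softmax(z):
    # Column-wise softmax, shifted by the max for numerical stability.
    e = np.exp(z - np.max(z))
    return e / e.sum(axis=0)

vocab_size, n_a = 27, 100  # n_a = 100 is inferred from the [27, 100] shape

rng = np.random.default_rng(0)
Wax = rng.standard_normal((n_a, vocab_size)) * 0.01
Waa = rng.standard_normal((n_a, n_a)) * 0.01
Wya = rng.standard_normal((vocab_size, n_a)) * 0.01  # assumed name
b = np.zeros((n_a, 1))                               # assumed name
by = np.zeros((vocab_size, 1))                       # assumed name

def step(x, a_prev):
    a = np.tanh(Wax @ x + Waa @ a_prev + b)
    z = Wya @ a + by
    return softmax(z)

x = np.zeros((vocab_size, 1))
y_good = step(x, np.zeros((n_a, 1)))   # a_prev as a column vector
y_bad = step(x, np.zeros((n_a, n_a)))  # a_prev with the wrong shape

print(y_good.shape)  # (27, 1)
print(y_bad.shape)   # (27, 100)
```

With the wrong a_prev shape, broadcasting silently widens a to (100, 100), which is how y ends up with 100 columns instead of one.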

Hi @gborden1,
I have coded np.random.choice with two arguments as specified in your email,
but I get the error: ValueError: probabilities contain NaN
I have spent a couple of days on it but can’t figure out why I get the error.

Hi @MinhPham ,

What are the first and second parameters you passed to np.random.choice()?

Hi @Kic,
The first parameter is range(vocab_size), the second parameter is p = y.ravel().

Hi, I received the error because the shape of y is [27,100].

Here’s how I calculated it:
{mentor edit: code removed}

What’s wrong with it?

Hi @TTsang ,

The code looks fine. To trace the problem, use print statements for debugging. Here, you can print the shapes of x and a_prev, and the values of Wax and Waa.

Hi @MinhPham ,

The parameters passed to np.random.choice() are correct. The probability distribution being referred to is y, so you need to check the calculation of y. Use print statements for debugging: print the shapes of x and a_prev, and the values of Wax, Waa, a and z.
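Alongside the print statements, a small check on y just before sampling can narrow things down. The helper below is hypothetical (check_probs is not part of the assignment or of NumPy) and is only a sketch of the idea; the uniform test vectors are made up for illustration.

```python
import numpy as np

def check_probs(y, vocab_size):
    """Hypothetical helper: list problems with y before it is sampled from."""
    p = np.asarray(y, dtype=float).ravel()
    problems = []
    if p.size != vocab_size:
        problems.append(f"size {p.size} != vocab_size {vocab_size}")
    if np.isnan(p).any():
        problems.append("contains NaN")
    elif not np.isclose(p.sum(), 1.0):
        problems.append(f"sums to {p.sum():.6f}, not 1")
    return problems

good = np.full((27, 1), 1.0 / 27)
bad = good.copy()
bad[3, 0] = np.nan

print(check_probs(good, 27))  # []
print(check_probs(bad, 27))   # ['contains NaN']
```

If y does contain NaN, running the same np.isnan(...).any() test on z, a and a_prev in turn shows where the NaN first appears.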

Hi @Kic,
Thanks for your suggestion. I still don’t know why the error occurs.
I extracted the code and tested it in python3, command by command, up to the random.choice call. I also extracted the data from sample_test to confirm that the shapes of x and a_prev are correct, that the values of Wax and Waa are correct (in the test), and that a and z are calculated correctly. Then I applied the softmax function to z to calculate y. All the values of y look correct: their sum is equal to 1.0 and there is no NaN probability at all!

Here is the stack trace from the assignment:
ValueError Traceback (most recent call last)
in
19 print("\033[92mAll tests passed!")
20
---> 21 sample_test(sample)

in sample_test(target)
7
8
----> 9 indices = target(parameters, char_to_ix, 0)
10 print("Sampling:")
11 print("list of sampled indices:\n", indices)

in sample(parameters, char_to_ix, seed)
56 # Step 3: Sample the index of a character within the vocabulary from the probability distribution y
57 # (see additional hints above)
---> 58 idx = np.random.choice(range(vocab_size), p = y.ravel())
59
60 # Append the index to "indices"

mtrand.pyx in numpy.random.mtrand.RandomState.choice()

ValueError: probabilities contain NaN

Since the trace ends with only the line “mtrand.pyx in numpy.random.mtrand.RandomState.choice()”, could it be a memory issue? I am not sure.

Hi @MinhPham ,

It could be that the kernel became inactive, leaving the execution environment without the correct references. Try restarting the kernel, clearing all output, and rerunning the code from the start.

Hi @Kic,
I used the option “restart kernel and clean output” a couple of times, but it did not resolve the issue.

Hi @Kic,
I compared the two functions sample_test(target) and optimize_test(target).
optimize_test(target) initializes vocab_size = 27,

while sample_test(target) does not initialize vocab_size. When I ran the code in python3, I had to initialize vocab_size myself; otherwise the code did not work. Is the issue the missing initialization of vocab_size in sample_test?

Maybe the problem is your initialization of a_prev?

Hi @MinhPham ,

In def model(), the default value for vocab_size is set to 27. Default argument values are a Python feature, so the problem is not vocab_size. I think it is something minor. If you send me that section of the code in a DM, I’ll have a look for you.
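As a small illustration of the default-value feature mentioned above, the sketch below uses a hypothetical stand-in function. model_stub and its data parameter are invented for this example and do not reflect the assignment's actual model() signature.

```python
def model_stub(data, vocab_size=27):
    """Hypothetical stand-in; only the default argument matters here."""
    return vocab_size

print(model_stub([]))                 # 27
print(model_stub([], vocab_size=30))  # 30
```

The default applies only inside the function that declares it, so code run outside that function still has to define vocab_size itself.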