Error in Dinosaur programming assignment exercise 4

I get an error in the optimization call. I am guessing that either my X or Y is not formed correctly, but I am not sure how to fix it. I am forming them as follows:

For X, I am prepending [None] to single_sample_ix.

For Y, I am appending [ix_newline] to single_sample_ix.

Here is the error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-31-725c093d6b91> in <module>
----> 1 parameters, last_name = model(data.split("\n"), ix_to_char, char_to_ix, 22001, verbose = True)
      2 
      3 assert last_name == 'Trodonosaurus\n', "Wrong expected output"
      4 print("\033[92mAll tests passed!")

<ipython-input-30-61c3d62593d9> in model(data_x, ix_to_char, char_to_ix, num_iterations, n_a, dino_names, vocab_size, verbose)
     61         # Perform one optimization step: Forward-prop -> Backward-prop -> Clip -> Update parameters
     62         # Choose a learning rate of 0.01
---> 63         curr_loss, gradients, a_prev = optimize(X, Y, a_prev, parameters, learning_rate = 0.01)
     64 
     65         ### END CODE HERE ###

<ipython-input-18-229d16b0dfd3> in optimize(X, Y, a_prev, parameters, learning_rate)
     32 
     33     # Forward propagate through time (≈1 line)
---> 34     loss, cache = rnn_forward(X, Y, a_prev, parameters)
     35 
     36     # Backpropagate through time (≈1 line)

~/work/W1A2/utils.py in rnn_forward(X, Y, a0, parameters, vocab_size)
    100 
    101         # Update the loss by substracting the cross-entropy term of this time-step from it.
--> 102         loss -= np.log(y_hat[t][Y[t],0])
    103 
    104     cache = (y_hat, a, x)

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Yes, this is frequently a problem for people. I’ve added some print statements in my model() code to show what it should look like:

initial loss 23.070858062030304
len(X) 12
type(Y) = <class 'list'>
Y = [20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19, 0]
len(Y) 12
 X =  [None, 20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19] 
 Y =        [20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19, 0] 

j =  0 idx =  0
single_example = turiasaurus
single_example_chars ['t', 'u', 'r', 'i', 'a', 's', 'a', 'u', 'r', 'u', 's']
single_example_ix [20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19]
 X =  [None, 20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19] 
 Y =        [20, 21, 18, 9, 1, 19, 1, 21, 18, 21, 19, 0] 

Iteration: 0, Loss: 23.087336

So you can see that X is Y shifted right by one position: I’ve added the value None as the first element of X, and Y ends with the newline index 0.
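As a rough sketch of how lists like the ones printed above can be built: the vocabulary below (newline at index 0, then 'a' to 'z' at 1 to 26) is my assumption chosen to match the printed indices, and build_xy is a hypothetical helper name, not something from the assignment or utils.py.

```python
# Assumed toy vocabulary: newline is index 0, 'a'..'z' are indices 1..26.
chars = ["\n"] + [chr(c) for c in range(ord("a"), ord("z") + 1)]
char_to_ix = {ch: i for i, ch in enumerate(chars)}


def build_xy(name, char_to_ix):
    """Return (X, Y) index lists for one training name (hypothetical helper)."""
    # List comprehension: one integer index per character of the name.
    single_example_ix = [char_to_ix[ch] for ch in name]
    # X starts with the None object, meaning "no previous character".
    X = [None] + single_example_ix
    # Y is the same sequence shifted left by one, ending with the newline index.
    Y = single_example_ix + [char_to_ix["\n"]]
    return X, Y


X, Y = build_xy("turiasaurus", char_to_ix)
print("X =", X)
print("Y =", Y)
```

Every entry of Y, and every entry of X after the leading None, is a plain integer. If the comprehension produces characters or nested lists instead, the lists will look similar when skimmed but will not work as indices.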

Note that None is a built-in Python object (the null value), which is a totally different thing from "None", which is a Python string.
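The IndexError in the traceback says that an index was not an integer (or one of the other listed kinds). One way to find the offending entry before calling optimize() is a small check like the sketch below. bad_positions is a hypothetical helper of my own, not part of the assignment; it uses operator.index from the standard library, which accepts only integer-like values.

```python
import operator


def bad_positions(seq, allow_none=False):
    """Return (position, value) pairs for entries that cannot be used as an integer index."""
    bad = []
    for t, value in enumerate(seq):
        # The None object is acceptable only where the caller says so (e.g. in X).
        if value is None and allow_none:
            continue
        try:
            operator.index(value)  # succeeds only for integer-like values
        except TypeError:
            bad.append((t, value))
    return bad


x_good = [None, 20, 21, 18]
x_string_none = ["None", 20, 21, 18]  # the string "None", not the None object
y_chars = ["t", "u", 0]               # characters that were never mapped to indices

print(bad_positions(x_good, allow_none=True))
print(bad_positions(x_string_none, allow_none=True))
print(bad_positions(y_chars))
```

An empty list means every entry is usable as an index. Anything else points at the position to fix.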

Got it! My construction of X and Y was OK; it was my syntax for single_sample_ix that was wrong. I had never used a Python list comprehension before. Thanks!

That’s great news that you were able to get to a solution based on that approximate hint!

Thanks! This final course is definitely more demanding. Three programming assignments this week with trickier programming techniques. But RNNs are very interesting!

C5 starts out difficult, and gets tougher from there.

Indeed, but it’s totally worth it. As you say, the material is really interesting and highly relevant. It culminates with the core tech that underlies all the current LLMs.