Could you please help me avoid this error (see the attachment)? What should I do to make sure that the probability vector y sums to 1?

This is how I computed the probabilities: y = softmax(z), and
how I sampled the index: idx = np.random.choice(range(len(z.ravel())), p = y.ravel()).
I raveled both y and z so that they have the same shape (otherwise I was getting a shape-mismatch error).

The instruction on line 56 specifies that the sampling should be done within the vocabulary, from the probability distribution y, so the first parameter of np.random.choice() is range(vocab_size) and not range(len(z.ravel())). z is the activation output.
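
In case it helps, here is a minimal sketch of that sampling step. It is not the assignment's code: the softmax below is my own stand-in, vocab_size = 27 and the random z are made-up values, and I am assuming z has shape (vocab_size, 1). It also shows one possible reason the probabilities might not sum to 1: if z accidentally ends up with shape (vocab_size, vocab_size), each column of y sums to 1, so the raveled vector sums to roughly vocab_size.

```python
import numpy as np

def softmax(z):
    # Stand-in softmax (an assumption, not the course helper):
    # subtract the max for numerical stability, normalize over axis 0.
    e = np.exp(z - np.max(z, axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

vocab_size = 27  # hypothetical vocabulary size
rng = np.random.default_rng(0)

# Assumed shape: one score per vocabulary entry.
z = rng.standard_normal((vocab_size, 1))
y = softmax(z)

# Check the shape and the sum before sampling.
assert y.shape == (vocab_size, 1)
assert np.isclose(y.sum(), 1.0)

# Sample an index from the vocabulary, using y as the distribution.
idx = np.random.choice(range(vocab_size), p=y.ravel())

# Possible failure mode: a wrongly shaped z. Each column sums to 1,
# so the raveled vector sums to about vocab_size instead of 1.
z_bad = rng.standard_normal((vocab_size, vocab_size))
bad_total = softmax(z_bad).ravel().sum()
```

If y.sum() is far from 1 in your notebook, printing z.shape and y.shape just before the sampling line is usually the quickest way to see where the shapes go wrong.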

Thank you @Kic, I have corrected the error, but now I am receiving another error. I don’t know what I did wrong, because my implementation of a looks correct: a = np.tanh(np.dot(Wax, x) + np.dot(Waa, a_prev) + b)