All my samples are identical, but when I test my np.random.choice call outside the loop it gives different values each time. I can’t find my bug.
Here is my random.choice function:
idx = np.random.choice(range(len(y.ravel())), p = y.ravel()/y.sum())
Please help.
It should not be necessary to divide y by its sum: y is the output of softmax, so the sum should already be 1 (or very close to it). But it should do no harm.
Have you actually looked at your y values? If all of them are zero except one, that would produce exactly this effect.
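To illustrate the point: here is a minimal sketch (the index and values are made up) of what happens when all the probability mass collapses onto a single entry — every draw from np.random.choice comes back identical, no matter how many times you sample.

```python
import numpy as np

# Hypothetical degenerate softmax output: all mass on index 4.
y = np.zeros((27, 1))
y[4] = 1.0

draws = [np.random.choice(range(len(y.ravel())), p=y.ravel() / y.sum())
         for _ in range(10)]
print(draws)  # every entry is 4
```

A healthy softmax output spreads mass over many indices, and repeated draws then differ.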
Also check the dimensions of things. I added some print statements and here’s what I see:
vocab_size = 27
Wax (100, 27) x (27, 1) Waa (100, 100) a_prev (100, 1)
Wya (27, 100) x (100, 1) + by (27, 1)
y.shape (27, 1)
len(y) 27
len(y.ravel()) 27
type(y.ravel()) <class 'numpy.ndarray'>
One common error I’ve seen people make is that y ends up being 27 x 100, which will definitely send things off the rails.
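As a quick hedged sketch of that failure mode: if y came out as 27 x 100 instead of 27 x 1, then len(y) is still 27 but y.ravel() has 2700 entries, so the candidate list and the probability vector no longer match and numpy refuses the call.

```python
import numpy as np

# Hypothetical wrong shape: y as (27, 100) instead of (27, 1).
y_bad = np.full((27, 100), 1.0 / (27 * 100))

try:
    # len(y_bad) == 27, but p has 2700 entries.
    np.random.choice(range(len(y_bad)), p=y_bad.ravel())
except ValueError as e:
    print(e)  # numpy complains that 'a' and 'p' differ in size
```

So a shape bug of that kind usually announces itself with an error rather than silently producing identical samples.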
Thank you for the quick reply.
I did check the dimensions - they match your print statements. Also my y values are non-zero in many coordinates.
I’m not sure what else to check…
By the way, I divided by y.sum() because without it an error was raised saying the probabilities don't sum to 1.
Found it!!
Thank you for helping.
Just curious: was the fact that the y values did not add up to 1 part of the problem? In other words, had you applied softmax incorrectly?
No. I just forgot to add one of the terms in the tanh() function in the calculation of a.
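For anyone else hitting this: the hidden-state update needs all of its terms inside tanh. A minimal sketch, using the shapes printed earlier in the thread (the hidden bias b being (100, 1) is my assumption — only the output bias by was printed above, and the weight values here are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, n_a = 27, 100  # matching the shapes printed above

Wax = rng.standard_normal((n_a, vocab_size)) * 0.01  # (100, 27)
Waa = rng.standard_normal((n_a, n_a)) * 0.01         # (100, 100)
b = rng.standard_normal((n_a, 1))                    # (100, 1), assumed shape
x = np.zeros((vocab_size, 1))                        # (27, 1)
a_prev = np.zeros((n_a, 1))                          # (100, 1)

# All three terms belong inside tanh; dropping one (e.g. Waa @ a_prev)
# breaks the recurrence even though every shape still checks out.
a = np.tanh(Wax @ x + Waa @ a_prev + b)
print(a.shape)  # (100, 1)
```

Note the shapes are all compatible either way, which is why this bug produces bad samples rather than an error.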