I need some help in understanding Week 1, Exercise 2 sampling
When doing the forward pass, the shape of y is (27, 100). Why should y be a 2D array? If y holds the probabilities over the 27 character indices, I would expect a 1D array of shape (27,).
I have the same question. Just to make sure I understand: at each time step, y should still be a single vector of 27 values, correct? The RNN outputs one softmax probability distribution over characters at each time step?
So if y is 27 x 100, does the 100 correspond to the number of time steps (T_y)?
@paulinpaloalto Thank you very much.
My issue was incorrectly initializing a_prev and x as 1-D arrays instead of 2-D arrays, e.g. (27,) instead of (27, 1). Adding the 1-D result to the (100, 1) bias made numpy broadcast it to (100, 100), which resulted in y.shape = (27, 100) in each time step! I have since fixed it, but I think @vijayst might have faced a similar issue if the shape of y within the while loop is showing (27, 100).
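To make the broadcasting effect described above concrete, here is a minimal sketch. It is not the assignment's code: the `step` helper, the random weights, and the zero biases are assumptions chosen only to reproduce the shapes mentioned in this thread (hidden size 100, vocabulary size 27).

```python
import numpy as np

n_a, vocab_size = 100, 27          # sizes taken from the thread
rng = np.random.default_rng(0)

# Hypothetical parameters with the shapes discussed in this thread.
Wax = rng.standard_normal((n_a, vocab_size))
Waa = rng.standard_normal((n_a, n_a))
Wya = rng.standard_normal((vocab_size, n_a))
b = np.zeros((n_a, 1))
by = np.zeros((vocab_size, 1))

def step(x, a_prev):
    """One forward step of a basic RNN cell (illustrative helper)."""
    a = np.tanh(Wax @ x + Waa @ a_prev + b)
    z = Wya @ a + by
    # Column-wise softmax, shifted by the max for numerical stability.
    e = np.exp(z - z.max(axis=0, keepdims=True))
    y = e / e.sum(axis=0, keepdims=True)
    return a, y

# Buggy initialization: 1-D arrays.
# Wax @ x has shape (100,); adding b of shape (100, 1) broadcasts to (100, 100).
a_bad, y_bad = step(np.zeros(vocab_size), np.zeros(n_a))
print(a_bad.shape, y_bad.shape)    # (100, 100) (27, 100)

# Correct initialization: 2-D column vectors.
a_good, y_good = step(np.zeros((vocab_size, 1)), np.zeros((n_a, 1)))
print(a_good.shape, y_good.shape)  # (100, 1) (27, 1)
```

Note that numpy raises no error in the buggy case; the only symptom is the unexpected shape.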
My apologies: I misunderstood what the original question was about. I thought you were talking about y values in general. In the specific case of exercise 2 of the Dinosaurus Island Assignment, the sample function, you are doing one step at a time. The 100 value there is the size of the “hidden state”, not the number of steps. The output y value in each iteration of the loop should be 27 x 1. If you end up with 27 x 100, it means your code is wrong. Check the “dimensional analysis” on the main formulas being implemented there. I added print statements to show the dimensions:
Wax (100, 27) x (27, 1) Waa (100, 100) a_prev (100, 1)
Wya (27, 100) x (100, 1) + by (27, 1)
y.shape (27, 1)
len(y) 27
len(y.ravel()) 27
type(y.ravel()) <class 'numpy.ndarray'>
So if you dot 100 x 27 with 27 x 1, you get 100 x 1.
Then you dot 100 x 100 with 100 x 1 and get 100 x 1 again and adding 100 x 1 with 100 x 1 stays 100 x 1.
Then you dot 27 x 100 with 100 x 1 and end up with 27 x 1 which is the shape of the answer.
So you need to figure out where in the above sequence you go off the rails if you don’t end up with 27 x 1.
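One way to find where the shapes go off the rails is to assert the expected shape after each term of the dimensional analysis above. The following is only a sketch under assumed names: the zero-valued parameters and the intermediate variables `term_x` and `term_a` are illustrative, not taken from the assignment.

```python
import numpy as np

n_a, vocab_size = 100, 27

# Placeholder parameters; only their shapes matter for this check.
Wax = np.zeros((n_a, vocab_size))
Waa = np.zeros((n_a, n_a))
Wya = np.zeros((vocab_size, n_a))
b = np.zeros((n_a, 1))
by = np.zeros((vocab_size, 1))

x = np.zeros((vocab_size, 1))      # 2-D column vector, not shape (27,)
a_prev = np.zeros((n_a, 1))        # 2-D column vector, not shape (100,)

term_x = Wax @ x                   # (100, 27) . (27, 1)  -> (100, 1)
assert term_x.shape == (n_a, 1)

term_a = Waa @ a_prev              # (100, 100) . (100, 1) -> (100, 1)
assert term_a.shape == (n_a, 1)

a = np.tanh(term_x + term_a + b)   # (100, 1) + (100, 1) + (100, 1)
assert a.shape == (n_a, 1)

z = Wya @ a + by                   # (27, 100) . (100, 1) + (27, 1) -> (27, 1)
assert z.shape == (vocab_size, 1)

y = np.exp(z) / np.sum(np.exp(z))  # softmax over the 27 entries
assert y.shape == (vocab_size, 1)

# ravel() flattens y to a 1-D array of length 27, which is the form
# a sampling call expects for its probability argument.
rng = np.random.default_rng(0)
idx = rng.choice(vocab_size, p=y.ravel())
```

If x or a_prev is changed to a 1-D array here, the first two assertions fail immediately, which points at the initialization rather than at the formulas.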
Thank you @paulinpaloalto and @nvarma, I understand it better now. And yes, the problem was initialising with a 1D vector instead of a 2D array.