Hi guys,

I am having trouble understanding why the dimensions of Ypred are (Ny, m).

I understand the explanation says we need to feed in a batch of examples, and hence the second dimension is always equal to m for Ypred, Aprevious and Xt. But shouldn't Ypred be a single value after running the softmax?

I mean, each Ypred should give us a single prediction for that time step rather than a batch of values, no?

Both X and Y have a “samples” dimension, right? It’s just that for convenience they choose to orient Y with the samples dimension last.

Also please note that you filed this question under “General Discussion”. I have moved it for you to DLS Course 5 by using the little “edit pencil” tool by the title.

OK, softmax normally outputs a vector with a probability for every possible class, for each input example. However, here there is just one value in Ypred for every example in the batch, so is it an average value here?

Where does it say that there is only one value per sample? The first dimension is n_y, right?
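
For anyone following along, here is a minimal NumPy sketch of how the shapes flow through one RNN cell step. The parameter names (Wax, Waa, Wya, ba, by) and all the sizes are assumptions made for illustration, not taken from this thread. The point is only that the pre-softmax output has n_y rows and one column per example in the batch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumed): input features, hidden units, classes, batch size
n_x, n_a, n_y, m = 3, 5, 2, 10

xt = rng.standard_normal((n_x, m))      # input at one time step, one column per example
a_prev = rng.standard_normal((n_a, m))  # previous hidden state, one column per example

# Hypothetical parameters for this sketch
Wax = rng.standard_normal((n_a, n_x))
Waa = rng.standard_normal((n_a, n_a))
Wya = rng.standard_normal((n_y, n_a))
ba = np.zeros((n_a, 1))  # broadcasts across the m columns
by = np.zeros((n_y, 1))

a_next = np.tanh(Wax @ xt + Waa @ a_prev + ba)  # shape (n_a, m)
z = Wya @ a_next + by                           # shape (n_y, m), softmax is applied to this

print(a_next.shape, z.shape)
```

Each column of z belongs to one example, and each of its n_y rows is the score for one class, so there are n_y values per example rather than one.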

OK, got it. Ny is 2, and when I sum Ypred down each column, the values add up to 1, as a softmax output should. Thanks.
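
In case it helps someone else, the check described above can be sketched roughly like this. The softmax helper here is my own column-wise version and may differ from the one used in the course. The batch size m = 10 and the random scores are just placeholders.

```python
import numpy as np

def softmax(z):
    """Column-wise softmax: normalizes over axis 0, the class axis."""
    # Subtracting the column max is only for numerical stability
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

n_y, m = 2, 10  # n_y = 2 as in this thread; m is an assumed batch size
rng = np.random.default_rng(1)
z = rng.standard_normal((n_y, m))  # placeholder pre-softmax scores

y_pred = softmax(z)            # shape (n_y, m): one probability vector per column
col_sums = y_pred.sum(axis=0)  # one sum per example

print(y_pred.shape)
print(np.allclose(col_sums, 1.0))
```

So each column of Ypred is the full probability distribution over the n_y classes for one example, not an average over the batch.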