Not able to understand

I am not able to understand why this shape is used in the RNN: *So, the shape of a mini-batch is (n_x, m, T_x)*.

First of all, please clarify what you are referring to. A video? An assignment, and from which week?

Assuming that you are referring to the W2A1 assignment, here is a way to think about the dimensions.
Of course, if you design the system yourself, you can choose whichever definition you prefer.

In this assignment, we handle time-series data of length T_x. Each input x is a one-dimensional vector of length n_x. We also want to process multiple samples at once, and the size of this mini-batch is m. So the input data is naturally a 3-dimensional array.
The design question is then which quantity goes on which dimension.
As this is time-series data, we want to process all the data for time step t at once. In this sense, it is very natural to make "time" the last dimension.

When we take a slice at a time step t, the last axis is dropped and we get a 2D matrix. I think everyone agrees on this point.
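As a quick check of the slicing point above, here is a minimal NumPy sketch. The sizes (n_x = 3, m = 10, T_x = 7) and the random data are made up for illustration and are not taken from the assignment.

```python
import numpy as np

# Illustrative sizes only (assumed, not from the assignment).
n_x, m, T_x = 3, 10, 7

# Mini-batch with time on the last axis: (n_x, m, T_x).
x = np.random.randn(n_x, m, T_x)

# Slicing one time step drops the last axis and leaves a 2D matrix.
t = 2
x_t = x[:, :, t]
print(x_t.shape)  # (3, 10), i.e. (n_x, m)
```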

The next design question is which of n_x or m should be the first dimension. This is really up to your design, but one consideration is how to perform all the calculations efficiently.
In an RNN cell, there are several matrix operations (dot products). If we look at rnn_cell_forward(), a time slice of the input data, x_t, is used there: it's np.dot(W_{ax}, x_t).

From a dot-product viewpoint, it is better for x_t to have one sample per column, not per row. (Otherwise, we would need to transpose it.)
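The dot-product point can be sketched as follows. This assumes W_{ax} has shape (n_a, n_x), where n_a is the hidden-state size. The name n_a and all the sizes here are assumptions for illustration.

```python
import numpy as np

# Illustrative sizes only; n_a (hidden-state size) is an assumed name.
n_x, m, n_a = 3, 10, 5

Wax = np.random.randn(n_a, n_x)   # assumed shape (n_a, n_x)
x_t = np.random.randn(n_x, m)     # one sample per column

# With samples in columns, the product works directly: (n_a, n_x) @ (n_x, m).
z = np.dot(Wax, x_t)
print(z.shape)  # (5, 10), i.e. (n_a, m)

# If samples were stored one per row, shape (m, n_x), a transpose is needed first.
x_t_rows = x_t.T                  # (m, n_x)
z_from_rows = np.dot(Wax, x_t_rows.T)
```

Both layouts give the same result, but the column layout avoids the extra transpose at every time step.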

So, the shape of each time slice x_t is (n_x, m).

Overall, the input shape becomes (n_x, m, T_x).
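To show how the (n_x, m, T_x) layout is used end to end, here is a simplified forward loop. It is only a sketch, not the assignment's code: the function name simple_rnn_forward is hypothetical, the update is a plain tanh cell without an output layer, and the weight shapes are assumptions.

```python
import numpy as np

def simple_rnn_forward(x, a0, Wax, Waa, ba):
    """Hypothetical helper: run a plain tanh RNN over x of shape (n_x, m, T_x)."""
    n_x, m, T_x = x.shape
    n_a = Waa.shape[0]
    a = np.zeros((n_a, m, T_x))   # hidden states for every time step
    a_prev = a0
    for t in range(T_x):
        x_t = x[:, :, t]          # 2D time slice, shape (n_x, m)
        a_prev = np.tanh(np.dot(Wax, x_t) + np.dot(Waa, a_prev) + ba)
        a[:, :, t] = a_prev
    return a

# Illustrative sizes and random parameters (assumed).
rng = np.random.default_rng(0)
n_x, m, T_x, n_a = 3, 10, 7, 5
x = rng.standard_normal((n_x, m, T_x))
a = simple_rnn_forward(
    x,
    np.zeros((n_a, m)),                 # initial hidden state
    rng.standard_normal((n_a, n_x)),    # Wax
    rng.standard_normal((n_a, n_a)),    # Waa
    np.zeros((n_a, 1)),                 # ba, broadcast over the m columns
)
print(a.shape)  # (5, 10, 7), i.e. (n_a, m, T_x)
```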

Of course, this dimension definition is specific to this assignment. You can design any input shape, taking into account the data format, slicing, the definitions of other variables (their shapes, etc.), and performance.
