W1 A1 dimensions of n_a and n_y?

jakhon77 · December 12, 2024, 11:10pm

I understand how to determine the size of n_x. If the input x is a 5000-dimensional vector, then n_x is also 5000. However, how do we determine the sizes of n_a and n_y for both the basic RNN and the LSTM? Is it just an arbitrary choice? Can I select any number I want?
Thank you so much in advance!

paulinpaloalto · December 13, 2024, 12:05am

It depends on understanding the meaning of those dimensions. n_a is the size of the “hidden state” of the RNN cell. In the case of an LSTM, the state has more than one component (there is also the memory cell c, of the same size), but the base hidden state is a. Choosing that size is a hyperparameter choice, analogous to choosing the number and sizes of layers in a fully connected network or the filter sizes in a CNN. You want the hidden state to be big enough to capture whatever the RNN needs to remember in order to perform its task. But if it’s too big, that just makes training more expensive. As with other hyperparameter choices, a good place to start is to study worked examples that have been successful on problems at least somewhat similar to the one your new RNN needs to solve.

Then n_y depends on what the output of your RNN is. Prof Ng shows us lots of different types of RNNs in Week 1, and we’ll see even more as we go through the rest of C5. For example, if you are predicting words from a vocabulary, then each label is a one-hot vector and each prediction is a softmax vector, both with dimension equal to the size of the vocabulary. But it all depends on what your output looks like. You’ll see several examples in the exercises in W1. In the Dinosaur Names exercise, we are predicting letters plus a few delimiters, so it’s 26 or 28, I think. In the Jazz Improvisation assignment, the output is musical notes chosen from a set of 90, if I’m remembering correctly.
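
To make the roles of the three sizes concrete, here is a minimal sketch of the parameter shapes in a basic RNN cell with an output layer. The parameter names (Wax, Waa, Wya, ba, by), the helper functions, and the value n_y = 10 are assumptions chosen for illustration, not something stated in this thread. The point is only that n_x and n_y are fixed by your data, while n_a is free and drives the parameter count.

```python
def rnn_param_shapes(n_x, n_a, n_y):
    """Return the parameter shapes of a basic RNN cell with an output layer.

    The names Wax, Waa, Wya, ba, by are assumed here for illustration.
    """
    return {
        "Wax": (n_a, n_x),  # input x -> hidden state a
        "Waa": (n_a, n_a),  # previous hidden state -> hidden state
        "ba": (n_a, 1),     # hidden-state bias
        "Wya": (n_y, n_a),  # hidden state -> output y
        "by": (n_y, 1),     # output bias
    }


def count_params(shapes):
    """Total number of scalar parameters across all shapes."""
    return sum(rows * cols for rows, cols in shapes.values())


n_x = 5000  # fixed by the input: a 5000-dimensional x
n_y = 10    # hypothetical: fixed by the task, e.g. 10 output classes

# n_a is the only size we are free to choose; try a few candidates.
for n_a in (32, 64, 128):
    shapes = rnn_param_shapes(n_x, n_a, n_y)
    print(f"n_a={n_a}: Wax={shapes['Wax']}, Wya={shapes['Wya']}, "
          f"total params={count_params(shapes)}")
```

Changing n_a changes every shape except the output bias, which is why it trades off capacity against training cost. For an LSTM the same reasoning applies, with several gate matrices that each also have n_a rows.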

jakhon77 · December 13, 2024, 12:40am

Awesome! Thank you for your clarification!
