For the W1A2 Dinosaur assignment, I have a lot of questions about the shapes of parameters (e.g. Waa, Wax, Wya) and activation a used in the function sample().
The activation a (i.e., the a_prev variable) has shape (100, 1). I know the number 100 is set in the sample_test() function (as the variable n_a). But what is the variable n_a?
Parameter Wax’s shape is (100, 27), parameter Waa’s shape is (100, 100), while parameter Wya’s shape is (27, 100). The number 27 comes from the 26 letters of the alphabet plus the newline character. But where does the number 100 come from? Or, a better question: how are the parameters’ shapes constrained?
Thank you in advance
I did this drawing to help me understand all the shape sizes in an RNN cell. Just remember that when you multiply two matrices, the number of columns of the first matrix must equal the number of rows of the second. Ex: A.shape (m, n), B.shape (n, t) => np.dot(A, B) will result in a shape (m, t).
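To make that concrete, here is a small numpy sketch (not the graded assignment code) of one RNN step with n_a = 100 and a vocabulary of 27 characters; each np.dot only works because the inner dimensions match:

```python
import numpy as np

# Sketch of the shapes in one RNN cell step.
# n_a = hidden state size, n_x = n_y = vocabulary size (26 letters + newline = 27).
n_a, n_x, n_y = 100, 27, 27

Wax = np.random.randn(n_a, n_x)   # maps the input x_t (n_x, 1) into the hidden space
Waa = np.random.randn(n_a, n_a)   # maps the previous hidden state a_prev (n_a, 1)
Wya = np.random.randn(n_y, n_a)   # maps the hidden state a_t back to vocabulary scores
ba  = np.zeros((n_a, 1))
by  = np.zeros((n_y, 1))

x_t    = np.zeros((n_x, 1))       # one-hot input character
a_prev = np.zeros((n_a, 1))       # previous hidden state

# (n_a, n_x) @ (n_x, 1) -> (n_a, 1)   and   (n_a, n_a) @ (n_a, 1) -> (n_a, 1)
a_t = np.tanh(np.dot(Wax, x_t) + np.dot(Waa, a_prev) + ba)
# (n_y, n_a) @ (n_a, 1) -> (n_y, 1)
z_t = np.dot(Wya, a_t) + by

print(a_t.shape, z_t.shape)       # (100, 1) (27, 1)
```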
Regarding the value n_a: it is a hyperparameter. ‘a’ is the hidden state, so its size in a Recurrent Neural Network (RNN) is an important hyperparameter that needs to be chosen carefully. This size, often referred to as the number of hidden units or the hidden dimension, depends on factors such as the complexity of the task and the size of the input data, among others.
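For example, a hypothetical initializer (not the assignment’s own helper) shows that the 100 is just the chosen value of n_a, and every parameter shape follows from that choice plus the vocabulary size:

```python
import numpy as np

# Sketch: parameter shapes follow directly from whatever n_a you choose.
def init_rnn_parameters(n_a, vocab_size, seed=0):
    rng = np.random.default_rng(seed)
    return {
        "Wax": rng.standard_normal((n_a, vocab_size)) * 0.01,
        "Waa": rng.standard_normal((n_a, n_a)) * 0.01,
        "Wya": rng.standard_normal((vocab_size, n_a)) * 0.01,
        "ba":  np.zeros((n_a, 1)),
        "by":  np.zeros((vocab_size, 1)),
    }

# n_a = 100 reproduces the shapes in the assignment; n_a = 64 would work just as well.
params = init_rnn_parameters(n_a=100, vocab_size=27)
print({k: v.shape for k, v in params.items()})
```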
I hope that answers your question.
Weberson.