How to compute RNN parameters?


Hello, given the attached picture of an RNN, how would we compute the number of parameters in the RNN cell?

Hi @Zineb_Attaoui,

I understand that you want to compute the number of parameters of an RNN, that is, the total count of trainable parameters in the network.

In a recurrent neural network (RNN), we have to add the number of parameters coming from the input layer to the number of parameters in the hidden layer.

In the input layer, the number of parameters comes from the size of the vocabulary, say ‘m’.

In the hidden layer, the number of parameters comes from the number of units of the layer, say ‘n’.

Since the input has size m and the hidden layer has n units, the W parameters connecting the input to the hidden layer would be m * n.

Input weights: m*n

Since the network is recurrent, meaning the hidden state feeds back into itself at every time step, the W parameters in the hidden layer will be n * n.

Recurrent weight: n*n
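
For reference, these two matrices are exactly what appears in the standard simple-RNN update (a sketch following, e.g., the Keras SimpleRNN convention; Wx, Wh, and b are just labels I am using here):

h_t = tanh(x_t · Wx + h_{t-1} · Wh + b)

where Wx is m × n, Wh is n × n, and b is a vector of length n.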

If we add biases, the hidden layer contributes one bias per unit, so there are ‘n’ biases (the input itself carries no bias term of its own).

To compute the total number of parameters we add (m * n) + (n * n) + n, which factors as n * (m + n + 1).
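
If you want to sanity-check this count in code, here is a minimal sketch assuming TensorFlow/Keras is available; m = 100 and n = 64 are arbitrary example values:

```python
import tensorflow as tf

m, n = 100, 64  # input size and number of hidden units (example values)

# Build a single SimpleRNN layer on inputs with feature size m.
layer = tf.keras.layers.SimpleRNN(n)
layer.build(input_shape=(None, None, m))  # (batch, timesteps, features)

expected = m * n + n * n + n  # input weights + recurrent weights + biases
print(layer.count_params(), expected)  # both give 10560
```

Note that some frameworks (e.g., PyTorch's nn.RNN) keep two separate bias vectors of length n, which adds another n to the total but does not change the weight counts.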


Juan has covered how to compute the total number of parameters in the RNN cell, given that you know the size of the inputs and the number of elements in the hidden state of the cell. If the question is how we determine the number of elements in the hidden state, the answer is that it is what Prof Ng calls a “hyperparameter”, meaning a value that you simply have to choose as the system designer. If you choose too small a value, the network may not perform very well. If you choose too large a value, training the network may be more expensive than it needs to be. So the goal is to pick a “Goldilocks” value that is just right. If you have to err, it is better to err on the side of a slightly larger value.


Hi,

I am trying to understand, pictorially, how these “Hidden SimpleRNN” units are connected.

Does the picture below represent the hidden units correctly? The top image shows the generic form, and below it the time-unrolled version.

Thanks,
Aravind