Hi,
I was working through week 1's programming exercises and I realized I wasn't sure how n_a is determined. The size of activations for RNNs is defined as (n_a, m, T_x), but how is n_a determined?
The size of the activations is a "hyperparameter", meaning that it is simply a choice you need to make as the system designer. You need to choose a value large enough to capture the complexity of the "state" that the nodes of your RNN have to track. Such choices are made by experience and intuition, and you then check whether they were good by looking at the performance of the resulting trained model. If you choose too small a value, the model may not perform very well ("underfitting"). If you set it too large, the network becomes more costly to train.
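To make the cost side of that trade-off concrete, here is a rough sketch that counts the trainable parameters of a basic RNN cell for a few candidate values of n_a. It assumes a vanilla cell of the form a_t = tanh(Waa a_prev + Wax x_t + ba), y_t = softmax(Wya a_t + by). The weight names, the helper functions, and the sizes n_x, n_y, m and T_x used below are illustrative assumptions, not values taken from the exercise.

```python
def rnn_parameter_count(n_a, n_x, n_y):
    """Count trainable parameters of an assumed vanilla RNN cell.

    n_a: hidden state size (the hyperparameter in question)
    n_x: size of each input vector
    n_y: size of each output vector
    """
    waa = n_a * n_a   # hidden-to-hidden weights, shape (n_a, n_a)
    wax = n_a * n_x   # input-to-hidden weights, shape (n_a, n_x)
    ba = n_a          # hidden bias, shape (n_a, 1)
    wya = n_y * n_a   # hidden-to-output weights, shape (n_y, n_a)
    by = n_y          # output bias, shape (n_y, 1)
    return waa + wax + ba + wya + by


def activation_shape(n_a, m, t_x):
    """Shape of the stacked activations for m examples over t_x time steps."""
    return (n_a, m, t_x)


# Illustrative sizes only (assumptions, not from the exercise).
n_x, n_y, m, t_x = 3, 2, 10, 4

for n_a in (5, 50, 500):
    shape = activation_shape(n_a, m, t_x)
    params = rnn_parameter_count(n_a, n_x, n_y)
    print(f"n_a={n_a:>3}  activations={shape}  parameters={params}")
```

Because of the Waa term, the parameter count grows roughly with the square of n_a, which is why an oversized hidden state gets expensive quickly. In practice you would train with a few candidate values and compare performance on a validation set.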