Emojifier-V2 Hyperparameter Tuning

In the function Emojify_V2(), we set the LSTM's number of units to 128 and the Dense layer's number of units to 5.

How is the number 128 derived? How is the number 5 derived? Is the number 5 a consequence of the 5 emoji types used?

Thanks in advance.

The size of the hidden state for any kind of sequence model (RNN, GRU, LSTM) is simply a hyperparameter you choose; 128 is not derived from anything. If you pick too big a number it may slow down your training a bit, but a bit of "overkill" is probably not a big problem.
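One way to see why a larger hidden size mainly costs training time is to count LSTM parameters. A minimal sketch, assuming the standard LSTM parameterization (four gates, each with weights over the concatenated previous hidden state and input, plus a bias) and an input dimension of 50, which matches the 50-dimensional GloVe vectors this assignment uses:

```python
def lstm_param_count(units, input_dim):
    # Each of the 4 LSTM gates has a weight matrix over [h_prev, x]
    # of shape (units, units + input_dim), plus a bias of length units.
    return 4 * (units * (units + input_dim) + units)

# Parameter count grows roughly quadratically with the hidden size.
for units in (32, 64, 128, 256):
    print(units, lstm_param_count(units, input_dim=50))
```

Doubling the hidden size from 128 to 256 more than triples the parameter count here, which is why oversizing costs time rather than correctness.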

For the Dense layer output, yes: they tell you in the instructions for that section that it's a 5-class softmax output, one class per emoji:

* The model outputs a softmax probability vector of shape (m, C = 5).
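To make that shape concrete, here is a minimal NumPy sketch of a 5-class softmax applied to some hypothetical Dense-layer outputs (the logits here are random placeholders, not the model's real activations):

```python
import numpy as np

def softmax(z):
    # Subtract the row max for numerical stability; each row sums to 1.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

m, C = 3, 5                        # C = 5 because there are 5 emoji classes
logits = np.random.randn(m, C)     # stand-in for the Dense layer's raw outputs
probs = softmax(logits)            # shape (m, C) = (3, 5)
prediction = probs.argmax(axis=1)  # index of the predicted emoji per example
```

So the 5 is fixed by the number of emoji classes, while the 128 is a free choice.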