How is n_a decided for a_next?

How is the dimension of the activation a_next (the "n_a" one) decided?


n_a is a hyperparameter, so it is usually chosen by trial and error, for example with a grid search over a few candidate values. Have a look at this blog post.
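As a rough illustration of what a grid search over n_a could look like, here is a minimal sketch. The names `grid_search_n_a` and `toy_validation_loss` are made up for this example, and the toy loss is only a placeholder: in practice you would replace it with a function that trains your RNN with the given n_a and returns the loss on a held-out validation set.

```python
from typing import Callable, Dict, Iterable, Tuple


def grid_search_n_a(
    candidates: Iterable[int],
    train_and_validate: Callable[[int], float],
) -> Tuple[int, Dict[int, float]]:
    """Evaluate every candidate hidden size and return the one with the lowest loss."""
    # One training/validation run per candidate value of n_a.
    results = {n_a: train_and_validate(n_a) for n_a in candidates}
    # Pick the candidate whose validation loss is smallest.
    best = min(results, key=results.get)
    return best, results


def toy_validation_loss(n_a: int) -> float:
    """Hypothetical stand-in for 'train an RNN with this n_a, return validation loss'.

    The first term mimics underfitting when n_a is too small, the second
    mimics overfitting when n_a is too large. It is NOT a real model.
    """
    return 1.0 / n_a + 0.001 * n_a


best_n_a, losses = grid_search_n_a([8, 16, 32, 64, 128], toy_validation_loss)
print("best n_a:", best_n_a)
for n_a, loss in losses.items():
    print(f"n_a={n_a:4d}  validation loss={loss:.4f}")
```

Powers of two are just a common convention for the candidate list, not a requirement. If the best value sits at the edge of the grid, it is usually worth extending the grid in that direction and searching again.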

If you are looking for a more analytical approach, you can have a look at this paper.