how the dimension for activation a_next is decided …like the “n_a” one ?
Hi ADITYA_GANESH_KHEDEK,
Usually this is done based on trial-and-error using grid search. Have a look at this blog post.
If you are looking for a more analytical approach, you can have a look at this paper.