In the above screenshot, an alpha value is picked as
α = 9.9e-7
But how do we even know where to begin with an “alpha”? Any pointers?
Thank you
Hi there,
I am assuming the alpha here is the learning rate. The learning rate is a hyperparameter of your model. There are no fixed guidelines for choosing a particular value; it is something you tune as you train the model and check it against validation data.
It is advisable to start with a low value or with a value that similar models use. The smaller the value, the lower the risk of overshooting an optimum of your model, although a very small value also makes training slow. Similar models that have performed well have already found a good learning rate, which is why they are a great starting point too.
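To make the overshooting point concrete, here is a minimal sketch in plain Python. It is not the model from the screenshot: the loss f(w) = w**2, the function name gradient_descent, and the step count are all assumptions chosen only for illustration.

```python
def gradient_descent(alpha, steps=50, w0=1.0):
    """Minimise the toy loss f(w) = w**2 (gradient 2*w) with a fixed learning rate."""
    w = w0
    for _ in range(steps):
        w = w - alpha * 2.0 * w  # standard update: w <- w - alpha * gradient
    return w


# The optimum is w = 0. Compare a tiny, a moderate, and a too-large alpha.
for alpha in (9.9e-7, 0.1, 1.1):
    print(f"alpha={alpha:g}: final w = {gradient_descent(alpha):.6g}")
```

On this toy problem the tiny value barely moves away from the starting point in 50 steps, the moderate value gets close to the optimum, and the large value jumps over the optimum on every step and moves further away each time.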
TensorFlow provides callbacks that can change the learning rate while training is running (for example ReduceLROnPlateau in Keras), so a suitable value is used throughout a long training run. Callbacks such as EarlyStopping can also be used to stop training early. There are also search methods for finding a good learning rate, such as random search, grid search, or Bayesian search; you can read more about them on the web.
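As a rough sketch of what random search over the learning rate can look like, the snippet below samples candidates on a log scale and keeps the one with the lowest loss. The names final_loss and random_search, the search range, and the toy loss f(w) = w**2 are assumptions for illustration; in practice final_loss would train your real model and return its validation loss.

```python
import math
import random


def final_loss(alpha, steps=50, w0=1.0):
    """Hypothetical stand-in for 'train the model and return validation loss'."""
    w = w0
    for _ in range(steps):
        w -= alpha * 2.0 * w  # gradient step on the toy loss f(w) = w**2
    return w * w


def random_search(n_trials=20, low=1e-6, high=1.0, seed=0):
    """Try n_trials learning rates drawn log-uniformly from [low, high]."""
    rng = random.Random(seed)
    best_alpha, best_loss = None, math.inf
    for _ in range(n_trials):
        # Sample the exponent uniformly so every order of magnitude is covered.
        alpha = 10 ** rng.uniform(math.log10(low), math.log10(high))
        loss = final_loss(alpha)
        if loss < best_loss:
            best_alpha, best_loss = alpha, loss
    return best_alpha, best_loss


best_alpha, best_loss = random_search()
print(f"best alpha = {best_alpha:.3g}, loss = {best_loss:.3g}")
```

Sampling the exponent rather than alpha itself is a common choice because useful learning rates often span several orders of magnitude. Grid search would replace the random draw with a fixed list such as 1e-6, 1e-5, and so on.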
Thank you for the detailed explanation @gent.spah