If you think β (the hyperparameter for momentum) is between 0.9 and 0.99, which of the following is the recommended way to sample a value for beta?
I am confused between two of the options:
beta = 1 - 10**(-r - 1)
beta = 0.09*r + 0.9
The correct answer is the 1st option, but I am unable to understand what is wrong with the 2nd option.
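Approach 1 samples on a log scale, and approach 2 on a linear scale. Andrew introduced both of them and recommended approach 1 in the C2 W3 lecture video titled “Using an Appropriate Scale to pick Hyperparameters”. Assuming r is drawn uniformly from [0, 1], both approaches cover the same range, 0.9 to 0.99; the difference is where the samples land within it. Momentum averages over roughly 1/(1 − β) recent gradients, which is about 10 at β = 0.9 and about 100 at β = 0.99, so a small change in β matters far more near 0.99 than near 0.9. The log scale therefore tries values that differ more from one another (in terms of their effect on the model training process) than the linear scale does. That said, I would recommend going through that video again first.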
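To make the difference concrete, here is a minimal sketch that compares the two options. It assumes r is drawn uniformly from [0, 1) (the question does not say how r is generated), and the helper names and the window threshold of 50 are made up for illustration. It measures how often each option produces a β whose averaging window 1/(1 − β) is longer than 50 steps.

```python
import random

def sample_beta_log(r):
    # Option 1: 1 - beta is spread evenly on a log scale over [0.01, 0.1].
    return 1 - 10 ** (-r - 1)

def sample_beta_linear(r):
    # Option 2: beta is spread evenly on a linear scale over [0.9, 0.99].
    return 0.09 * r + 0.9

def averaging_window(beta):
    # Momentum averages over roughly 1 / (1 - beta) recent gradients.
    return 1 / (1 - beta)

def fraction_with_long_window(sampler, rs, threshold=50):
    # Share of sampled betas whose averaging window exceeds the threshold.
    windows = [averaging_window(sampler(r)) for r in rs]
    return sum(w > threshold for w in windows) / len(windows)

random.seed(0)
# Assumption: r ~ Uniform[0, 1); the original question leaves r undefined.
rs = [random.random() for _ in range(10000)]

frac_log = fraction_with_long_window(sample_beta_log, rs)
frac_linear = fraction_with_long_window(sample_beta_linear, rs)

print(f"log scale:    fraction with window > 50 = {frac_log:.2f}")
print(f"linear scale: fraction with window > 50 = {frac_linear:.2f}")
```

Under these assumptions, the log-scale option should put roughly 30% of its samples in the window-above-50 region, while the linear option puts only about 11% there. In other words, nothing is wrong with the range of option 2; it just spends most of its samples where β behaves almost the same.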