I don’t understand why we do uniform sampling on a logarithmic scale instead of the standard scale. Is there any mathematical basis for this? Thanks in advance!
Hi, @bgoyal.
As explained in this lecture, computing the exponentially weighted average is approximately equivalent to taking the average of the last 1/(1 - Beta) days.
As you can see, 1/(1 - Beta) is very sensitive to small changes in Beta when Beta is close to 1.
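To put numbers on that sensitivity, here is a minimal sketch. The helper name `effective_window` and the specific Beta values are just my own choices for illustration. It applies the same step of 0.0005 in Beta at two places in the range: near 0.9 the window barely moves, while near 0.999 it doubles from about 1000 to about 2000 days.

```python
def effective_window(beta):
    """Approximate number of past days averaged over: 1 / (1 - beta)."""
    return 1.0 / (1.0 - beta)


# The same step in beta (0.0005) taken at two different places in the range.
for low, high in [(0.9000, 0.9005), (0.9990, 0.9995)]:
    print(
        f"beta {low} -> {high}: "
        f"window {effective_window(low):.1f} -> {effective_window(high):.1f} days"
    )
```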
If you sample uniformly, more often than not you’ll end up exploring a small subset of the range of 1/(1 - Beta).
Instead, you want to sample more densely (the formulas were removed, but I hope the point is clear) in the regions where Beta is closer to 1.
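Here is a rough sketch of the two sampling strategies, assuming a search range of 0.9 to 0.999 for Beta (that range, the function names, and the 100-day threshold are my own assumptions, not something fixed by the lecture). The log-scale version samples an exponent r uniformly and sets Beta = 1 - 10^r. With uniform sampling only roughly 9% of the samples should give a window longer than 100 days, whereas with log-scale sampling it should be about half.

```python
import math
import random


def sample_beta_uniform(rng, low=0.9, high=0.999):
    """Sample beta uniformly on the linear scale."""
    return rng.uniform(low, high)


def sample_beta_log(rng, low=0.9, high=0.999):
    """Sample r = log10(1 - beta) uniformly, then map back to beta."""
    r = rng.uniform(math.log10(1 - high), math.log10(1 - low))
    return 1 - 10 ** r


def fraction_long_windows(betas, threshold=100):
    """Fraction of samples whose window 1 / (1 - beta) exceeds the threshold."""
    return sum(1 / (1 - b) > threshold for b in betas) / len(betas)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
n = 10_000
uniform_betas = [sample_beta_uniform(rng) for _ in range(n)]
log_betas = [sample_beta_log(rng) for _ in range(n)]

print("uniform scale:", fraction_long_windows(uniform_betas))
print("log scale:    ", fraction_long_windows(log_betas))
```

In other words, uniform sampling spends most of its trials on windows between 10 and 100 days, while log-scale sampling splits them roughly evenly between 10 to 100 and 100 to 1000.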
Let me know if that helped.
Thank you for your explanation. This is helpful. I wonder, what is the incentive for sampling more densely in that narrower region? If we choose Beta closer to 1, does that mean we take the weighted average over many previous values?