I don’t understand why we do uniform sampling on a logarithmic scale instead of the standard scale. Is there any mathematical basis for this? Thanks in advance!

Hi, @bgoyal.

As explained in this lecture, computing the exponentially weighted average is approximately equivalent to taking the average of the last `1/(1 - Beta)` days.

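`1/(1 - Beta)` is very sensitive to small changes in `Beta` when `Beta` is close to 1. For example, going from `Beta = 0.9` to `Beta = 0.99` moves it from 10 to 100, and going from `Beta = 0.999` to `Beta = 0.9995` moves it from 1000 to 2000.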
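To make the sensitivity concrete, here is a minimal sketch that evaluates `1/(1 - Beta)` for a few values. The helper name `effective_window` and the chosen values of `Beta` are just mine for illustration; they are not from the lecture.

```python
def effective_window(beta: float) -> float:
    """Approximate number of past days being averaged: 1 / (1 - beta)."""
    return 1.0 / (1.0 - beta)


# Illustrative values only; each step moves beta closer to 1.
betas = [0.5, 0.9, 0.99, 0.999]
windows = [effective_window(b) for b in betas]

for b, w in zip(betas, windows):
    print(f"beta = {b:<6} -> averages roughly the last {w:,.0f} days")
```

Each step toward 1 changes `Beta` by less than the previous one, yet the window grows by about a factor of 10 over each of the last two steps.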

If you sample uniformly, more often than not you’ll end up exploring a small subset of the range of `1/(1 - Beta)`.

Instead, you want to sample more densely in the regions where `Beta` is closer to 1, which is what sampling uniformly on a logarithmic scale does (the formulas were removed, but I hope the point is clear).
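Here is a rough sketch comparing the two sampling strategies. I am assuming a search range of 0.9 to 0.999 for `Beta` and a cutoff of 100 days for what counts as a "long" window; both are illustrative choices, and the function name `share_with_long_window` is hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the comparison is repeatable
N = 10_000
LOW, HIGH = 0.9, 0.999  # assumed search range for beta

# Strategy 1: sample beta uniformly on the linear scale.
linear = [random.uniform(LOW, HIGH) for _ in range(N)]

# Strategy 2: sample r uniformly between log10(1 - HIGH) and log10(1 - LOW),
# i.e. roughly between -3 and -1, then set beta = 1 - 10**r.
r_low, r_high = math.log10(1 - HIGH), math.log10(1 - LOW)
log_scale = [1 - 10 ** random.uniform(r_low, r_high) for _ in range(N)]


def share_with_long_window(betas, min_days=100):
    """Fraction of samples whose 1 / (1 - beta) is at least min_days."""
    return sum(1 / (1 - b) >= min_days for b in betas) / len(betas)


linear_share = share_with_long_window(linear)
log_share = share_with_long_window(log_scale)

print(f"linear scale: {linear_share:.1%} of samples average 100+ days")
print(f"log scale:    {log_share:.1%} of samples average 100+ days")
```

Under these assumptions, windows of 100 to 1000 days correspond to `Beta` between 0.99 and 0.999, which is only about 9% of the linear range but half of the log-scale range. So linear sampling spends most of its samples on windows between 10 and 100 days, while log-scale sampling splits them roughly evenly.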

Let me know if that helped.

Thank you for your explanation. This is helpful. I wonder what the incentive is for sampling more densely in that narrow region. If we choose `Beta` closer to 1, does that mean we are taking the weighted average over many more previous values?