Question about the initialization of Exponentially Weighted Averages

In the course, the EWA is initialized to v(0)=0, and v(i)=0.9*v(i-1)+0.1*theta(i). The problem is that the first few values of v will be very close to zero.

So in a later video, there is a bias correction formula: v(t)/(1-beta**t), with beta=0.9 here.
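To make the bias concrete, here is a minimal sketch, assuming beta=0.9 and a constant signal theta(i)=10. Both values and the function name `ewa_zero_init` are made up for illustration and are not from the course.

```python
def ewa_zero_init(thetas, beta=0.9):
    """Return (raw, corrected) EWA sequences, starting from v(0) = 0."""
    v = 0.0
    raw, corrected = [], []
    for t, theta in enumerate(thetas, start=1):
        # Standard update: v(t) = beta * v(t-1) + (1 - beta) * theta(t)
        v = beta * v + (1 - beta) * theta
        raw.append(v)
        # Bias correction: divide by 1 - beta**t
        corrected.append(v / (1 - beta ** t))
    return raw, corrected


# Illustrative input: a constant signal of 10 for five steps.
raw, corrected = ewa_zero_init([10.0] * 5)
print([round(x, 3) for x in raw])
print([round(x, 3) for x in corrected])
```

For this constant input the raw values start near 1 and climb slowly towards 10, while the corrected values sit at 10 from the first step.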

My question is: why not set v(1)=theta(1), and v(i)=0.9*v(i-1)+0.1*theta(i) for i>1?

What’s the difference between these two kinds of initialization? Why introduce such a complex correction formula instead of just setting v(1)=theta(1)?

Hi, @qiu:

Sorry for the late reply :sweat:

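The alternative you propose is perfectly fine, but whether you initialize to 0 or to theta(1), the first values of the moving average will be biased towards that initial value (this may be acceptable for your use case). With v(1)=theta(1), the first sample still carries a weight of beta**(t-1) at step t, so the early averages lean heavily on theta(1). The formula from the later video corrects this initial bias by dividing by 1-beta**t, which is exactly the sum of the weights applied to theta(1)..theta(t).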
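As a rough sketch of the difference, assuming beta=0.9, you can compute how much weight each theta(k) gets in the average at step t under the two schemes. The helper name `effective_weights` and the scheme labels are hypothetical, chosen only for this illustration.

```python
def effective_weights(t, beta=0.9, scheme="corrected"):
    """Weights given to theta(1)..theta(t) in the average at step t.

    scheme="corrected": v(0) = 0, then divide v(t) by (1 - beta**t).
    scheme="first":     v(1) = theta(1), no correction.
    """
    if scheme == "corrected":
        norm = 1 - beta ** t
        return [(1 - beta) * beta ** (t - k) / norm for k in range(1, t + 1)]
    if scheme == "first":
        weights = [(1 - beta) * beta ** (t - k) for k in range(1, t + 1)]
        # theta(1) enters unscaled, so it only decays by beta each step.
        weights[0] = beta ** (t - 1)
        return weights
    raise ValueError(f"unknown scheme: {scheme}")


for scheme in ("corrected", "first"):
    w = effective_weights(5, scheme=scheme)
    print(scheme, [round(x, 4) for x in w], "sum =", round(sum(w), 4))
```

Both sets of weights sum to 1, so neither scheme is pulled towards zero. The difference is in the shape: at t=5 with v(1)=theta(1), the first sample holds about 0.66 of the total weight, versus about 0.16 with bias correction. If theta(1) happens to be noisy, that is the bias you would be accepting.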

Did that make sense?

Hope you’re enjoying the specialization :slight_smile: