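In the course, the exponentially weighted average (EWA) is initialized with v(0) = 0 and updated as v(i) = 0.9*v(i-1) + 0.1*theta(i). The problem is that the first few values of v are biased toward zero, so they are much smaller than the data they are supposed to average.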
So in a later video, a bias correction formula is introduced: v(t)/(1 - beta**t), where beta = 0.9 here.
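To make the bias concrete, here is a minimal sketch of the zero-initialized average with and without the correction. The function name `ewa_zero_init` and the constant input of 10.0 are made up for illustration; they are not from the course.

```python
def ewa_zero_init(thetas, beta=0.9):
    """Return (raw, corrected) EWA values, starting from v(0) = 0."""
    v = 0.0
    raw, corrected = [], []
    for t, theta in enumerate(thetas, start=1):
        v = beta * v + (1 - beta) * theta      # standard EWA update
        raw.append(v)
        corrected.append(v / (1 - beta ** t))  # bias correction
    return raw, corrected


# Hypothetical data: a constant signal, so the "true" average is 10.0.
raw, corrected = ewa_zero_init([10.0] * 5)
for t, (r, c) in enumerate(zip(raw, corrected), start=1):
    print(f"t={t}  raw={r:.3f}  corrected={c:.3f}")
```

On this constant input the raw values start at 1.0 and climb slowly, while the corrected values are 10.0 at every step.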
My question is: why not set v(1) = theta(1), and then use v(i) = 0.9*v(i-1) + 0.1*theta(i) for i >= 2?
What is the difference between these two kinds of initialization? Why introduce such a complex correction formula instead of just setting v(1) = theta(1)?
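To make the comparison concrete, here is a small sketch of both schemes side by side. The function names and the sample data are hypothetical, chosen only so the difference is easy to see.

```python
def ewa_bias_corrected(thetas, beta=0.9):
    """v(0) = 0, then report v(t) / (1 - beta**t) at each step."""
    v, out = 0.0, []
    for t, theta in enumerate(thetas, start=1):
        v = beta * v + (1 - beta) * theta
        out.append(v / (1 - beta ** t))
    return out


def ewa_first_sample_init(thetas, beta=0.9):
    """v(1) = theta(1), then the plain update with no correction."""
    v, out = None, []
    for theta in thetas:
        v = theta if v is None else beta * v + (1 - beta) * theta
        out.append(v)
    return out


# Hypothetical data: the first sample differs from the rest.
thetas = [10.0, 20.0, 20.0, 20.0, 20.0]
corrected = ewa_bias_corrected(thetas)
first_init = ewa_first_sample_init(thetas)
for t, (a, b) in enumerate(zip(corrected, first_init), start=1):
    print(f"t={t}  bias_corrected={a:.3f}  first_sample_init={b:.3f}")
```

With this data both schemes give 10.0 at t=1, but at t=2 the bias-corrected value is about 15.26 while the theta(1)-initialized value is 11.0. In the second scheme theta(1) still carries a weight of 0.9 at t=2, whereas the correction renormalizes the weights 0.09 and 0.1 so that they sum to 1.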