What is an Exponentially Weighted Average actually doing?

I don’t understand what the exponentially weighted average is doing. I have watched the lecture video ten times but still don’t understand it.
Is it fitting a model to the values, or is it predicting the current value from the average of the previous values using the formula mentioned in the lecture,
v_t = beta * v_{t-1} + (1 - beta) * theta_t (e.g. v_t = 0.9 * v_{t-1} + 0.1 * theta_t for beta = 0.9)?

Please explain what it is actually doing.
I have implemented the concept, but when I plot the values, the line does not fit the data correctly. Prof Ng uses the same formula and gets a good fit (the red line in his graph), so why is mine wrong?


Please tell me what is wrong with my implementation.


Hi @ajaykumar3456,

You are right:

It is a technique to smooth out your values, i.e., it is an average. It is an exponentially weighted average in this case since the effect of previous values on the current value declines exponentially.
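As a minimal sketch (function and variable names are my own), the recurrence v_t = beta * v_{t-1} + (1 - beta) * theta_t looks like this:

```python
def ewa(values, beta=0.9):
    """Exponentially weighted average: v_t = beta * v_{t-1} + (1 - beta) * theta_t."""
    v = 0.0
    smoothed = []
    for theta in values:
        v = beta * v + (1 - beta) * theta
        smoothed.append(v)
    return smoothed

# Even on a constant series of 10s, the average only creeps toward 10
# because it starts from v_0 = 0 -- this is the bias we correct later.
print(ewa([10, 10, 10], beta=0.9))  # approx. [1.0, 1.9, 2.71]
```

Each new value only contributes a (1 - beta) share, and every older value's share shrinks by another factor of beta at each step, which is what "exponentially weighted" means.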

It also helps that we completed your bias correction implementation in your last post, Bias Correction Implementation. :slight_smile:

On Andrew’s slides, the red line uses bias correction.

Back to your example:

Check out the green line, which is a smoothed version of the blue line (the temperature). Notice that it doesn’t have the same variance: lower highs and higher lows.

If you use even more values, you will see that bias correction matters less and less, since the uncorrected average catches up over time. The orange line will catch up with the green line as more data arrives.
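A minimal sketch of bias correction (my own variable names): dividing by 1 - beta^t rescales the early values, and since that factor tends to 1 as t grows, the corrected and uncorrected curves merge:

```python
def ewa(values, beta=0.9, bias_correction=False):
    """EWA with optional bias correction: divide v_t by (1 - beta**t)."""
    v = 0.0
    out = []
    for t, theta in enumerate(values, start=1):
        v = beta * v + (1 - beta) * theta
        out.append(v / (1 - beta ** t) if bias_correction else v)
    return out

data = [10] * 50
plain = ewa(data)
corrected = ewa(data, bias_correction=True)
print(plain[0], corrected[0])    # 1.0 vs approx. 10.0: correction fixes the cold start
print(plain[-1], corrected[-1])  # nearly identical after 50 steps
```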


Seriously, thanks for your explanation @jonaslalin. But when I reduce the beta value to 0.2, both lines match even without bias correction.
My doubt is: is our main goal for the exponentially weighted average line and the line through the actual points to be similar, or very close, to each other?

Also, how is this concept related to updating the weights in backpropagation?
What is the exponentially weighted average doing to my gradients?
Please explain.

Beta = 0.2 means the weight on the current temperature is 1 - 0.2 = 0.8, which is why the orange line catches up faster: the weight on the current value is close to 1. However, when updating your weights with momentum, RMSprop, or Adam, you usually want a smoothing factor closer to 0.8 or 0.9. I recommend you watch the remaining week 2 videos to better understand how it can be used in deep learning.

Start with the next video, where Andrew will explain how the exponentially weighted average improves the weight updates.

One more thing to keep in mind: an exponentially weighted moving average will always lag the data source. The lag is even more visible if you apply it to a graph of stock prices:
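As a quick sketch of that lag (toy data, not real prices): on a steadily rising series the average trails the latest value by roughly beta / (1 - beta) times the per-step increase, so about 9 steps for beta = 0.9:

```python
def ewa(values, beta=0.9):
    """Exponentially weighted average of a sequence."""
    v = 0.0
    out = []
    for theta in values:
        v = beta * v + (1 - beta) * theta
        out.append(v)
    return out

prices = list(range(1, 101))          # toy "price" rising by 1 per step
smoothed = ewa(prices, beta=0.9)
# The average trails the data by about beta / (1 - beta) = 9 steps.
print(prices[-1] - smoothed[-1])      # approx. 9.0
```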