Question about the exponentially weighted average (Week 2)

nash · August 2, 2021, 7:10pm

Hi everyone,

This is a question about the intuition behind exponentially weighted averages.

From what I understand, an exponentially weighted average models a trend by choosing how much weight goes to older data versus newer data. If you give full weight (i.e., 1.0) to the old data, and consequently no weight to the new data, you get a model that doesn’t change at all (v1 = v2 = v3 = v4 = v5 = … = vn). Similarly, if you give full weight to the new data, you get a model that, using the term perhaps a bit liberally, is ‘overfitting’ the data (v1 = theta_1, v2 = theta_2, v3 = theta_3, etc.). Finding a good balance of weight between old and new data is therefore key to obtaining an accurate and general model of your data.
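The two extremes described above can be checked with a small sketch. It assumes the usual recurrence v_t = beta * v_{t-1} + (1 - beta) * theta_t with v_0 = 0 and no bias correction; the function name `ewa` and the sample values are made up for illustration.

```python
def ewa(thetas, beta):
    """Return the running exponentially weighted averages v_1..v_n.

    Assumes v_t = beta * v_{t-1} + (1 - beta) * theta_t with v_0 = 0
    and no bias correction.
    """
    v = 0.0
    out = []
    for theta in thetas:
        v = beta * v + (1.0 - beta) * theta
        out.append(v)
    return out


temps = [40.0, 49.0, 45.0, 44.0, 50.0]  # made-up daily values

print(ewa(temps, 1.0))  # all weight on old data: v never moves away from v_0
print(ewa(temps, 0.0))  # all weight on new data: v_t equals theta_t
print(ewa(temps, 0.9))  # in between: a smoothed trend
```

With beta = 1.0 every v_t stays at the starting value, and with beta = 0.0 the output simply reproduces the input, which matches the two cases in the paragraph above.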

To explain my confusion, I’ll stick to the example Andrew went through in his video. I got confused when Andrew mentioned ‘averaging over x days’, and specifically by the choice of 1/e as the threshold below which the exponential weight (0.9^n, for example) makes the datum from that day no longer significant. To me, 1/e seems like an arbitrary choice, which makes it hard to pin down for how long the data stays relevant. Or at least the choice of (1 - epsilon)^(1/epsilon), since this is what is actually used to obtain approximately 1/e, seems arbitrary. I understand how we obtain the number of days from this formula; what I don’t understand is why it gives a good estimate of the number of days we’re averaging over. I think the best way to phrase my question is: where did this formula come from? Maybe I missed something, but there didn’t seem to be much of an explanation of where it came from.
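To make the quantity in question concrete, here is a rough numerical sketch of (1 - epsilon)^(1/epsilon) for a few values of beta = 1 - epsilon. The helper name `decay_at_horizon` is hypothetical, and the beta values are just examples.

```python
import math


def decay_at_horizon(beta):
    """Return beta ** (1 / (1 - beta)).

    This is the weight of a datum that is 1 / (1 - beta) days old,
    relative to the weight of today's datum.
    """
    eps = 1.0 - beta
    return beta ** (1.0 / eps)


for beta in (0.5, 0.9, 0.98, 0.999):
    days = 1.0 / (1.0 - beta)
    print(f"beta={beta}: 1/(1-beta)={days:.0f}, "
          f"relative weight={decay_at_horizon(beta):.4f}")

print(f"1/e = {1 / math.e:.4f}")
```

For beta = 0.9 the relative weight after 10 days is about 0.35, and for beta = 0.98 it is about 0.36 after 50 days; both approach 1/e (about 0.37) as beta gets closer to 1.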

Thanks!

Raphael

nramon · August 4, 2021, 6:27pm

Hi, @nash.

As you already suspect, the choice of 1/e is slightly arbitrary, but convenient. Here’s a more detailed explanation.
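As a sketch of why 1/e is convenient (a standard argument, assuming the recurrence $v_t = \beta v_{t-1} + (1-\beta)\theta_t$ with $v_0 = 0$, and not necessarily the explanation referred to above): unrolling the recurrence gives

$$v_t = (1-\beta)\sum_{k=0}^{t-1} \beta^{k}\,\theta_{t-k}$$

so the datum from $k$ days ago carries a weight proportional to $\beta^{k}$. Writing $\varepsilon = 1-\beta$,

$$\beta^{k} = e^{k\ln(1-\varepsilon)} \approx e^{-k\varepsilon} \quad \text{for small } \varepsilon$$

which equals $e^{-1}$ at $k = 1/\varepsilon = 1/(1-\beta)$. In other words, $1/(1-\beta)$ is the time constant of the exponential decay, the same convention used for any quantity that decays exponentially. A different cutoff (say 1/2 or 1/10) would give a different but equally defensible "number of days". The same scale also appears if you compute the mean age of the weights over an infinite history:

$$\sum_{k=0}^{\infty} k\,(1-\beta)\,\beta^{k} = \frac{\beta}{1-\beta}$$

which is close to $1/(1-\beta)$ when $\beta$ is close to 1.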

Let me know if that answers your question.
