Centered Moving Average

Francisco_Pereira · January 10, 2023, 4:11pm

Hi,

I would like to know whether using a centered moving average to predict new values counts as data leakage. Since we average each point of the series with values from both the past and the future, we are using information that is not available at prediction time.

Thank you

alvaroramajo · January 10, 2023, 4:22pm

Hi, @Francisco_Pereira !

That’s true. At inference time you just can’t do that. You can only use your past values for averaging your current point.
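To make the difference concrete, here is a minimal sketch in plain Python. The function names, the toy series and the window size of 3 are my own assumptions for illustration; they are not taken from the course notebooks. It compares a trailing (past-only) average with a centered one, then changes a single future value to show which of the two reacts.

```python
def trailing_moving_average(series, window):
    """Average of the current point and the (window - 1) points before it."""
    out = []
    for t in range(len(series)):
        if t < window - 1:
            out.append(None)  # not enough history yet
        else:
            chunk = series[t - window + 1 : t + 1]
            out.append(sum(chunk) / window)
    return out


def centered_moving_average(series, window):
    """Average of the current point and window // 2 points on each side.

    Assumes an odd window so the current point sits exactly in the middle.
    """
    half = window // 2
    out = []
    for t in range(len(series)):
        if t < half or t + half >= len(series):
            out.append(None)  # window would run past one end of the series
        else:
            chunk = series[t - half : t + half + 1]
            out.append(sum(chunk) / window)
    return out


# Hypothetical toy data, chosen only to keep the arithmetic easy to follow.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
trailing = trailing_moving_average(series, window=3)
centered = centered_moving_average(series, window=3)

# Change only the last value, which lies in the "future" of index 4.
changed = series[:-1] + [60.0]
trailing_changed = trailing_moving_average(changed, window=3)
centered_changed = centered_moving_average(changed, window=3)

# The trailing average at index 4 is unaffected by the future value,
# while the centered average at index 4 shifts because it reads index 5.
print(trailing[4], trailing_changed[4])
print(centered[4], centered_changed[4])
```

The trailing version can be computed at time t with what you already have. The centered version at time t needs the value at t + 1, so it only works on data that has already been fully observed.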

Christian_Simonis · January 10, 2023, 4:33pm

Hi @Francisco_Pereira,

welcome to the community and thanks for your question!

A moving average filter is called causal if its output depends only on past or present inputs, see Section 8.4.3.
Forecasting algorithms usually satisfy this condition.
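As a rough sketch of the distinction, using notation I am assuming here (input series $x_t$, window length $k$ for the trailing filter, half-width $m$ for the centered one), the two averages can be written as:

$$
y_t^{\text{trailing}} = \frac{1}{k} \sum_{i=0}^{k-1} x_{t-i},
\qquad
y_t^{\text{centered}} = \frac{1}{2m+1} \sum_{i=-m}^{m} x_{t+i}
$$

The trailing form only touches indices up to $t$, so it is causal. The centered form includes $x_{t+1}, \dots, x_{t+m}$, which are future inputs, so it is not.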

Let’s take an ARIMA approach for example. Here it’s fair to think in the following way for the time series prediction:

- The prediction horizon matters. For example, if you make a prediction today for July 2023, the prediction horizon is about 6 months. [A fair benchmark for this prediction horizon should be considered; see also this thread.]
- You can only evaluate your model performance once those 6 months have passed (let’s call this period our test set), preferably also against the benchmarks. At that point there is no data leakage, since no information from the test labels was available when the prediction was made.

Best regards
Christian

Francisco_Pereira · January 11, 2023, 4:13pm

Thank you for the answer. From what I’ve seen in the first week of the Sequence Models course, the instructor uses a centered rolling window. So we can assume that in a real-world application this would not be possible, since we do not have access to future data. Tell me if I’m missing something. Thanks

alvaroramajo · January 12, 2023, 1:11pm

Exactly. If you were to deploy that model, you simply could not use future data, as it does not exist yet.
