Using `padding=causal` for time series prediction

dsfasfuqwjoasjsad · October 31, 2024, 4:53pm

Hello mentor, the following part of Week 4 Lab 1 states that setting padding="causal" is necessary to prevent data leakage.

My question is whether we must do it here. To illustrate: if we want to predict the value at t_{30} with window_size=30, we pass the sequence x_0 = [y_{t=t_0}, y_{t=t_1}, ..., y_{t=t_{29}}] to the NN model. Since all the timestamps in x_0 are before t_{30}, no matter how the CNN layer pads the values in x_0, there is no data leakage. Could you comment on my reasoning? Thanks.

lukmanaj · November 15, 2024, 4:52pm

Unless there’s a compelling performance or computational reason to omit it, I suggest using padding="causal". It provides an additional layer of protection against potential issues and ensures the model adheres to best practices for time series forecasting.

In your specific example, you could look at it that way, but omitting the argument means the layer falls back to padding="valid" (the default), which is not ideal for time series.
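To make the difference between the padding modes concrete, here is a minimal pure-Python sketch of the output length and the zero padding for a 1D convolution with stride 1 and no dilation. The helper names are hypothetical, and kernel_size=3 is only an assumed value for illustration; window_size=30 comes from the question above.

```python
def conv1d_padding(kernel_size, padding):
    """Return (left, right) zero padding for a stride-1, undilated 1D convolution."""
    total = kernel_size - 1
    if padding == "valid":
        return (0, 0)              # no padding at all
    if padding == "causal":
        return (total, 0)          # pad only at the start of the sequence
    if padding == "same":
        left = total // 2          # padding is split between both ends
        return (left, total - left)
    raise ValueError(f"unknown padding: {padding}")


def conv1d_output_length(n_steps, kernel_size, padding):
    """Number of output time steps for the given padding mode."""
    left, right = conv1d_padding(kernel_size, padding)
    return n_steps + left + right - kernel_size + 1


window_size = 30   # from the question above
kernel_size = 3    # assumed value, for illustration only

for mode in ("valid", "same", "causal"):
    print(mode, conv1d_padding(kernel_size, mode),
          conv1d_output_length(window_size, kernel_size, mode))
```

With these assumed values, "valid" shortens the sequence by kernel_size - 1 steps, while "same" and "causal" keep all 30 steps; only "causal" does so without padding the end of the window.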

Deepti_Prasad · November 15, 2024, 4:55pm

Hi @dsfasfuqwjoasjsad

The significance of causal padding here is not limited to data leakage at t_{30}; it applies at every time step. Causal padding adds elements to the start of the sequence, so the output at each time step depends only on the current and earlier inputs. This prevents data leakage at each time step, helps when forecasting values at the early time steps, and preserves the sequential flow of data through the time series.

Regards
DP
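The per-time-step point can be checked with a small pure-Python sketch. It is not the lab's code: the `conv1d` helper, the kernel weights, and the toy series are all assumptions made for illustration. The idea is to change only the last input value and see which output positions move under each padding mode.

```python
def conv1d(x, kernel, padding):
    """Stride-1 1D convolution (cross-correlation) with zero padding."""
    k = len(kernel)
    if padding == "causal":
        left, right = k - 1, 0           # pad only before the first time step
    elif padding == "same":
        left = (k - 1) // 2              # pad on both ends
        right = (k - 1) - left
    else:
        raise ValueError(f"unsupported padding: {padding}")
    padded = [0.0] * left + list(x) + [0.0] * right
    return [
        sum(w * padded[t + i] for i, w in enumerate(kernel))
        for t in range(len(x))
    ]


def changed_positions(padding):
    """Output indices that change when only the last input value is altered."""
    kernel = [0.5, 0.3, 0.2]                 # arbitrary illustrative weights
    x = [float(v) for v in range(10)]        # toy series with 10 time steps
    bumped = x[:]
    bumped[-1] += 100.0                      # alter only the value at index 9
    before = conv1d(x, kernel, padding)
    after = conv1d(bumped, kernel, padding)
    return [t for t, (a, b) in enumerate(zip(before, after)) if a != b]


print("causal:", changed_positions("causal"))
print("same:  ", changed_positions("same"))
```

Under "causal" only the output at index 9 changes, so no earlier output has seen the later value. Under "same" the output at index 8 changes as well, meaning that step's output depends on an input from the following step.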

balaji.ambresh · November 16, 2024, 5:11am

@dsfasfuqwjoasjsad

You are correct in pointing out that we only care about the prediction at the target timestep. So, no matter how we use the training data, we’ll end up with the prediction for the final timestep. That said, causal padding helps the model shine at inference time. Since we pad at the start, even with fewer / no available values, the model tries to mimic the training data when generating values.

Please read this as well.

dsfasfuqwjoasjsad · March 4, 2025, 8:17pm

thank you !

dsfasfuqwjoasjsad · March 4, 2025, 8:17pm

thank you so much !

dsfasfuqwjoasjsad · March 4, 2025, 8:20pm

very good clarification !
