Scaling multivariate LSTM time-series forecasting

Hello everyone, I have just started learning about multivariate time series forecasting with LSTMs. I am confused about the data preprocessing step where we scale the data. Do we need to scale all of the data (inputs and target) or just the inputs?

In some cases, there is a distinction between the input scaler (usually named scaler_X) and the target scaler (scaler_Y). Can’t we just use one scaler if all the data needs to be scaled?

In general (this is not unique to LSTMs), scaling the features allows the optimizer (which finds the weights that minimize the cost) to work more efficiently.

Usually scaling the outputs is not very useful.

Depending on the dataset (and the range of magnitudes of the outputs), you might also scale the output labels.
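To make the scaler_X / scaler_Y distinction concrete, here is a minimal sketch using plain NumPy standardization. The data, shapes, split point, and variable names are all made up for illustration, and the zero "prediction" is only a stand-in for what a trained LSTM would output. One practical reason to keep the target statistics separate is that only the target column has to be converted back to original units after prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 time steps, 3 input features, 1 target column.
X = rng.normal(loc=[10.0, 500.0, 0.1], scale=[2.0, 50.0, 0.01], size=(200, 3))
y = rng.normal(loc=20000.0, scale=3000.0, size=(200, 1))

# Chronological split; scaling statistics come from the training part only.
split = 160
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Role of "scaler_X": per-feature mean and standard deviation of the inputs.
x_mean, x_std = X_train.mean(axis=0), X_train.std(axis=0)
# Role of "scaler_Y": separate statistics for the target.
y_mean, y_std = y_train.mean(axis=0), y_train.std(axis=0)

X_train_s = (X_train - x_mean) / x_std
X_test_s = (X_test - x_mean) / x_std   # reuse the training statistics
y_train_s = (y_train - y_mean) / y_std

# Stand-in for model output in scaled units (a real LSTM would produce this).
y_pred_s = np.zeros((len(X_test_s), 1))

# Undo the target scaling to report predictions in original units.
y_pred = y_pred_s * y_std + y_mean
```

If you decide not to scale the target, the last step simply disappears and the model predicts in original units directly.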

Thank you for your answer, but could you explain what you mean by “depending on the dataset”?
What kind of dataset requires us to scale the output labels?

Scaling the output labels is worth doing only if the output values vary over a large range, or if they are numerically very large.
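A small sketch of why numerically large targets can be a problem, assuming a mean-squared-error loss and a network whose initial outputs are near zero (both are assumptions for illustration, and the target values are made up). The loss and its gradient with respect to the predictions grow with the magnitude of the targets, which can make training unstable unless the learning rate is adjusted.

```python
import numpy as np

# Hypothetical targets that are numerically large (values near 1e5).
y = np.array([95000.0, 102000.0, 99000.0, 110000.0, 94000.0])

# Assumed starting point: an untrained network outputting values near zero.
pred = np.zeros_like(y)

# MSE and its gradient with respect to each prediction, on raw targets.
mse_raw = np.mean((pred - y) ** 2)       # on the order of 1e10
grad_raw = 2 * (pred - y) / len(y)       # entries in the tens of thousands

# The same quantities after standardizing the targets.
y_s = (y - y.mean()) / y.std()
mse_scaled = np.mean((pred - y_s) ** 2)  # on the order of 1
grad_scaled = 2 * (pred - y_s) / len(y)  # entries below 1 in magnitude

print(f"raw:    mse={mse_raw:.3e}, max |grad|={np.abs(grad_raw).max():.3e}")
print(f"scaled: mse={mse_scaled:.3e}, max |grad|={np.abs(grad_scaled).max():.3e}")
```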

Okay, I understand.
Thank you very much @TMosh
