Hello, @Bot001,
Thank you for the detailed work and analysis.
Let me start by explaining what my last message meant, and it’s simple: reconstruct the maths that produces the forecast result. The message is simple, but the work can take some time. For example, the paper says they used “Seasonal and Trend decomposition using Loess” (STL) to extract the seasonal (and trend) components, so the question becomes “can we reproduce this algorithm?” Your Markdown shows that you downloaded some parameters (e.g. p, d, q), but STL is a non-parametric algorithm, which means you may not be able to simply download it — you may have to implement it.
However, from your work, I can see you were trying to get straight to your goal, so let’s set the maths aside for the time being, and let me share one thing I found with the help of Google AI Mode.
It told me that we can export the ARIMA+ model in TensorFlow format. If so, you can run the model anywhere, including on your own computer. You will need to check the model’s input signature for what it accepts, but from what I understand, you will have to provide three things:
- the horizon: the number of future points to forecast
- the time series data
- an ISO-format timestamp representing the start time of your time series
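To make the “check the input signature” step concrete, here is a hedged sketch. The model below is a toy stand-in I made up, NOT the real ARIMA_PLUS export; the input names (`horizon`, `time_series_data`, `start_timestamp`) are my assumptions, so inspect `structured_input_signature` (or use `saved_model_cli`) on the real export first:

```python
# Toy stand-in for an exported model: it only demonstrates how to save,
# load, inspect, and call a SavedModel with the three assumed inputs.
import tempfile
import tensorflow as tf

class ToyForecaster(tf.Module):
    @tf.function(input_signature=[
        tf.TensorSpec([], tf.int32, name="horizon"),
        tf.TensorSpec([None], tf.float64, name="time_series_data"),
        tf.TensorSpec([], tf.string, name="start_timestamp"),
    ])
    def serve(self, horizon, time_series_data, start_timestamp):
        # Stand-in "forecast": repeat the last observed value.
        return {"forecast": tf.fill([horizon], time_series_data[-1])}

model = ToyForecaster()
export_dir = tempfile.mkdtemp()
tf.saved_model.save(model, export_dir,
                    signatures={"serving_default": model.serve})

loaded = tf.saved_model.load(export_dir)
fn = loaded.signatures["serving_default"]
print(fn.structured_input_signature)  # always inspect before calling

out = fn(horizon=tf.constant(3),
         time_series_data=tf.constant([1.0, 2.0, 5.0], tf.float64),
         start_timestamp=tf.constant("2024-01-01T00:00:00"))
print(out["forecast"].numpy())  # [5. 5. 5.]
```

The real export’s signature may use different names, shapes, or extra inputs — the inspection step is the part that carries over.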
The second item is the most interesting part because, in principle, you can include post-training data, and I believe this is your goal. My understanding is that the exported model should contain everything, including the STL algorithm and the trained parameters in a frozen state, so I believe the p, d, q values will be embedded there as well. However, I haven’t tested any of this because I have not used ARIMA+, but I believe it won’t be too difficult for you to find the instructions and test these assumptions — after all, you can also ask Google AI Mode.
If all of this works out, then we can evaluate how well an “old” model performs and decide how often to retrain.
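As a sketch of what that evaluation could look like, here is a toy backtest. The “frozen model” below is a naive seasonal repeat standing in for your exported ARIMA+ model, and the data is synthetic — both are my illustrative assumptions:

```python
# Toy backtest: track the error of a "frozen" model as real time moves
# past its training cutoff, to decide when retraining pays off.
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(364)
series = np.sin(2 * np.pi * t / 7) + 0.01 * t + rng.normal(0, 0.1, t.size)

cutoff, period = 182, 7
last_week = series[cutoff - period:cutoff]  # "model" frozen at the cutoff

weekly_mae = []
for w in range((len(series) - cutoff) // period):
    actual = series[cutoff + w * period : cutoff + (w + 1) * period]
    weekly_mae.append(float(np.abs(actual - last_week).mean()))

# Error drifts upward as the trend leaves the frozen week behind;
# retrain once it crosses the threshold your application tolerates.
print([round(m, 2) for m in weekly_mae])
```

With the real model, you would replace the naive repeat with forecasts from an export trained only on data before the cutoff.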
Now I think we can briefly go back to the maths, though I should say that, if all of the above works out, there is no strict need to care about it at all. Still, a good understanding of the maths will definitely help you make decisions, including, at the very least, how to use the components as input to your other model (RNN or GBDT).
I will quickly go through your questions at the end of the Markdown:
Q1: I believe exporting it to TensorFlow format is worth a try, but you will need to find out whether it is possible and, if so, how. Regarding the API: in my conversation with Google AI Mode, I found one that does not show up in your Markdown, but I will leave it to you to decide how useful it is.
Q2: I think you can use them as input for a model (not necessarily a GBDT). For example, ARIMA+ does clean your data by removing outliers, spikes, and so on (for more, see the paper). My thinking is that it is very likely easier to train a model on cleaner data. That’s it.
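A minimal sketch of that idea, with synthetic stand-ins for the per-timestamp components — in reality you would pull trend/seasonal from ML.EXPLAIN_FORECAST, which I have not run here:

```python
# Sketch: feed decomposition components to a downstream model. The
# trend/seasonal arrays are synthetic stand-ins (an assumption) for the
# per-timestamp components ML.EXPLAIN_FORECAST would give you.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
t = np.arange(200)
trend = 0.1 * t
seasonal = np.sin(2 * np.pi * t / 7)
y = trend + seasonal + rng.normal(0, 0.1, t.size)

# Features: the previous raw value plus the (cleaner) components.
X = np.column_stack([y[:-1], trend[1:], seasonal[1:]])
target = y[1:]

X_tr, X_te, y_tr, y_te = train_test_split(X, target, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print(round(model.score(X_te, y_te), 3))
```

Whether the components help on your real data is exactly the empirical question I describe below — this only shows the plumbing.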
For the usage examples in your question, I don’t intend to just say yes or no to them. The idea is that any contribution from ARIMA+ to your RNN/GBDT may make the model easier to train, but the degree to which it helps cannot be told in advance, nor can the trade-off against the extra cost of running ARIMA+. Therefore, whether and how you use ARIMA+ for your RNN/GBDT is really an iterative process that takes performance and cost into account, and I can’t predict the outcome here. Another reason I can’t predict it is that I know nothing about the behaviour of your data. I hope this makes sense to you.
Q3: Reliability has to be judged through a good understanding of the system that generates the data, through experiment, and through evaluation. For the first part, you need to consult the domain expert for the system: do the seasonal patterns have any reason to be stable? If there is a reason they are unstable, it is unlikely that you can reliably use the components from ML.EXPLAIN_FORECAST deep into the future.
Q4: I think this leads back to my answer to your Q1. Besides, I haven’t read the whole paper, so you can’t rely on me to give you everything that is public about ARIMA+. However, I think a good understanding of the paper can only be useful, and we can read it with the help of Google AI Mode.
Lastly, you mentioned an ensemble model and also LightGBM. Comparing LightGBM with an RNN, the “same” part is that ARIMA+ can contribute to both, but the “different” part is that you will need to engineer features for LightGBM, and that part can be challenging.
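To show the kind of feature engineering I mean, here is a small pandas sketch of lag and calendar features a GBDT typically needs — the column choices are illustrative, not a recommendation for your data:

```python
# Sketch: the explicit lag/rolling/calendar features a GBDT needs,
# which an RNN largely learns from the raw sequence on its own.
import numpy as np
import pandas as pd

ts = pd.Series(np.arange(30, dtype=float),
               index=pd.date_range("2024-01-01", periods=30, freq="D"))

feats = pd.DataFrame({
    "lag_1": ts.shift(1),                           # yesterday's value
    "lag_7": ts.shift(7),                           # same weekday last week
    "rolling_mean_7": ts.shift(1).rolling(7).mean(),  # trailing weekly mean
    "day_of_week": ts.index.dayofweek,              # calendar feature
})
feats = feats.dropna()  # rows without a full history are dropped
print(feats.head())
```

Each of these choices (which lags, which windows) is a modelling decision you will have to tune, which is where the challenge lies.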
Cheers,
Raymond