Need guidance on selecting the right architecture for log anomaly detection

Hi Friends,

I am creating an anomaly detection model for metric and log data. From my research so far, an LSTM with autoencoders seems to be a good choice. However, I am a newbie and not sure if I am heading in the right direction with this combination of architectures/technologies.

Below are my requirement constraints:

  1. Multivariate time series
  2. Unsupervised learning
  3. Stationary data
  4. Batch processing of log/metric data

With these requirements in mind, can I go with “LSTM + autoencoders” for anomaly detection? Could the experts here guide me?
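
To make the question more concrete, here is a rough sketch of the kind of model I had in mind: a minimal Keras LSTM autoencoder. The window length, layer sizes, feature count, and threshold percentile are all placeholder values I picked, not tuned choices.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

TIMESTEPS, N_FEATURES = 30, 8  # assumed window length and metric count

# Encoder compresses each window into a fixed-size latent vector;
# the decoder tries to reconstruct the original window from it.
model = keras.Sequential([
    keras.Input(shape=(TIMESTEPS, N_FEATURES)),
    layers.LSTM(64),                                   # encoder
    layers.RepeatVector(TIMESTEPS),                    # bridge to decoder
    layers.LSTM(64, return_sequences=True),            # decoder
    layers.TimeDistributed(layers.Dense(N_FEATURES)),  # per-step reconstruction
])
model.compile(optimizer="adam", loss="mse")

# Train on (presumed) normal data only: the model learns to reconstruct
# normal behavior, so anomalous windows should reconstruct poorly.
X_train = np.random.rand(1000, TIMESTEPS, N_FEATURES).astype("float32")  # placeholder data
model.fit(X_train, X_train, epochs=10, batch_size=64, verbose=0)

# Flag windows whose reconstruction error exceeds a high percentile
# of the training errors.
errors = np.mean((model.predict(X_train) - X_train) ** 2, axis=(1, 2))
threshold = np.quantile(errors, 0.99)
```

Does this general shape make sense for my constraints?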

I really appreciate the help in advance.

  1. An LSTM is a good match for a time series.
  2. I have not used an LSTM autoencoder, so have no comment on that.
  3. Not sure what “stationary data” means in the context of your project.
  4. Whether you use batch processing is a decision about how to implement and train the model; I’m not sure why that would be a “requirement constraint”. Can you say more about this?

Hi @TMosh ,

Thanks for your reply.

  1. LSTM - Considering that I need to work with a multivariate time series, I also feel an LSTM would be a better fit. The others I considered were OCSVM and Isolation Forest. Are any Transformer-based models better than an LSTM (BERT/RoBERTa, etc.)?

  2. Stationary data - I mean the log time series are relatively constant in nature (a quick statistical check for this is sketched at the end of this post).

Real-world example: daily temperature readings in a city, where the temperature remains relatively constant over time with minor fluctuations.

  3. Batch processing - I mentioned it as a constraint because we need to select the appropriate algorithm based on the nature of the input (batch/stream, etc.).

We get multiple log statements together in a batch (consider 5 minutes of logs at a time) rather than as a stream, where we would receive a single line or a couple of lines at a time. A sketch of how I plan to window these batches is below.
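
Here is roughly how I plan to slice each 5-minute batch into fixed-length windows for the model. The interval granularity, window size, stride, and the batch_to_windows helper name are all my own assumptions:

```python
import numpy as np

def batch_to_windows(metric_matrix, window=30, stride=1):
    """Slice one 5-minute batch of per-interval features
    (shape: [n_intervals, n_features]) into overlapping windows
    that an LSTM autoencoder can consume."""
    windows = [
        metric_matrix[i : i + window]
        for i in range(0, len(metric_matrix) - window + 1, stride)
    ]
    return np.stack(windows)

# Placeholder: 300 one-second intervals x 8 features from one 5-minute batch.
batch = np.random.rand(300, 8).astype("float32")
X = batch_to_windows(batch)
print(X.shape)  # (271, 30, 8)
```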
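
And since “stationary” has a precise statistical meaning (roughly, mean and variance that do not drift over time), I was planning to double-check that assumption with an Augmented Dickey-Fuller test. A minimal sketch using statsmodels, on a placeholder series:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

series = np.random.rand(500)  # placeholder: one metric channel over time

# A small p-value (e.g. < 0.05) rejects the unit-root hypothesis,
# i.e. the series looks stationary.
adf_stat, p_value = adfuller(series)[:2]
print(f"ADF statistic = {adf_stat:.3f}, p-value = {p_value:.3f}")
```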