Hi all, I am working on an outlier detection problem (anomaly detection). So far we have been using Isolation Forest in Python for this. We are now moving to much larger datasets, on the order of gigabytes to terabytes.
Which framework would be the best fit here: Spark, Dask, or distributed TensorFlow? And which ML model would be compatible with that framework?
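For context, here is a minimal sketch of the kind of single-machine setup described above. It assumes scikit-learn's `IsolationForest`. The synthetic data, the `contamination` value, and the variable names are illustrative placeholders, not the real pipeline.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in data (assumption): 1000 inliers near the origin
# plus 20 points placed far away to act as outliers.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))
outliers = rng.uniform(low=8.0, high=10.0, size=(20, 2))
X = np.vstack([inliers, outliers])

# contamination is the expected fraction of outliers; 0.02 is a guess
# chosen for this toy data, not a recommended value.
model = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)

# fit_predict returns -1 for points flagged as outliers and 1 for inliers.
labels = model.fit_predict(X)

# decision_function returns an anomaly score; lower means more anomalous.
scores = model.decision_function(X)

n_flagged = int((labels == -1).sum())
print(f"flagged {n_flagged} of {len(X)} points as outliers")
```

This fits comfortably in memory on one machine, which is exactly what stops working once the data reaches the sizes mentioned above.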
Do let me know if you need any more information from me.
Thanks