I’m looking for anyone who has experience using this model (TabTransformer) on supervised or semi-supervised applications. I plan to run both scenarios and am curious whether its accuracy is superior to boosted trees for tabular data. I’m also interested in whether the model is interpretable. I understand how transformer models work, but I haven’t built one yet and haven’t tried to interpret one. Any insight is appreciated. I’ll include more info on my project below.
I’m currently working on a personal project on substance abuse treatment, analyzing about 400 features for roughly 2,000 patients receiving treatment over 24 weeks.
My goal is to build a model that uses baseline categorical data to predict treatment outcomes early in the course of care. By detecting negative outcomes (defined as patient dropout) early, the model could become clinically useful, helping healthcare providers spot risk signals in time to adjust care, improve treatment response, and reduce dropout.
I’m going to compare three models: Random Forest, an XGBoost classifier, and Keras structured-data classification with TabTransformer.
I’m looking to show incremental improvement as model complexity increases; a rough sketch of the comparison setup is below.
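For context, here’s a minimal sketch of how I’m thinking about the head-to-head for the first two models. The real data isn’t shown here, so a toy stand-in of roughly the same shape (about 2,000 rows and 400 features) is generated with `make_classification`; the hyperparameters are placeholders, not tuned values.

```python
# Toy comparison of Random Forest vs. XGBoost with cross-validated AUC.
# The generated data is only a stand-in with the same rough shape as my
# real dataset (~2000 patients, 400 features, imbalanced binary outcome).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=400, n_informative=25,
                           weights=[0.7, 0.3], random_state=42)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
models = {
    "random_forest": RandomForestClassifier(n_estimators=500, random_state=42),
    "xgboost": XGBClassifier(n_estimators=500, learning_rate=0.05,
                             eval_metric="logloss", random_state=42),
}

for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: AUC = {auc.mean():.3f} +/- {auc.std():.3f}")
```

The TabTransformer model would then be scored on the same folds and the same metric so the comparison stays apples-to-apples.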
Thanks for chiming in. I see where you’re coming from.
For XGBoost, I think I will be fine. I’ve worked with it before on wide datasets and it does well at finding signal. I’m currently experimenting with different data models; I’ll start with a large feature set and iterate down to the best predictors.
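One simple way to do that iterate-down step is to fit XGBoost on everything, rank features by importance, and re-score on progressively smaller subsets. This is only a sketch on toy data; the "f0".."f399" column names are placeholders.

```python
# Rank features by XGBoost importance, then re-evaluate on smaller subsets.
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

X_arr, y = make_classification(n_samples=2000, n_features=400,
                               n_informative=25, random_state=42)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(X_arr.shape[1])])

model = XGBClassifier(n_estimators=300, eval_metric="logloss", random_state=42)
model.fit(X, y)
ranked = X.columns[np.argsort(model.feature_importances_)[::-1]]

for k in (400, 200, 100, 50, 25):
    cols = list(ranked[:k])
    auc = cross_val_score(model, X[cols], y, cv=5, scoring="roc_auc").mean()
    print(f"top {k:>3} features: mean AUC = {auc:.3f}")
```

One caveat: ranking features on the full data and then cross-validating on the same data leaks information, so in the real project the selection step should happen inside the CV loop or on a separate split.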
You may be right about TabTransformer. I went through the TabTransformer tutorial, and they used 50k examples.
I’ve considered creating a synthetic dataset. I’m not familiar with how that works, but I’m going to do some research soon.
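From the little reading I’ve done so far, one option looks like the SDV library’s single-table synthesizers. I haven’t used it yet, so treat this as a sketch assuming the SDV 1.x API; the tiny frame below is a placeholder for the real patient data.

```python
# Sketch: fit a copula-based synthesizer on real rows, then sample more rows.
import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.single_table import GaussianCopulaSynthesizer

df = pd.DataFrame({
    "age_group": ["18-25", "26-35", "36-50", "26-35"],  # placeholder columns
    "prior_episodes": [0, 2, 1, 3],
    "dropout": [0, 1, 0, 1],
})

metadata = SingleTableMetadata()
metadata.detect_from_dataframe(data=df)   # infer column types from the frame

synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(df)
synthetic = synthesizer.sample(num_rows=10_000)  # expand toward TabTransformer scale
print(synthetic.head())
```

If I go this route, I’d only fit the synthesizer on the training split and keep a real held-out test set untouched, so the evaluation isn’t inflated by synthetic rows.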
Mr TMosh, what I learned in this scenario is that each individual model determines its own requirements for features and sample size.
After speaking with an industry expert on XGBoost, I came away with the following guidelines.
The number of observations should exceed the number of features.
Plan on a minimum of roughly N=100 training samples.
It does not work well for computer vision or NLP; those are better suited to deep learning.
For tabular data, XGBoost is portable to distributed systems, scales to datasets with several hundred million instances, and typically delivers superior accuracy and performance compared with most other models.
Keras offers structured-data classification with FeatureSpace and shows good accuracy with a low number of samples. However, you are right to point out that this may not hold for TabTransformer.
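For anyone finding this thread later, the FeatureSpace workflow from the Keras structured-data classification tutorial looks roughly like the sketch below. The feature names, toy rows, and the small MLP head are made up for illustration, not my real schema or model.

```python
# Minimal Keras FeatureSpace sketch: declare per-column encodings, adapt on
# the training data, then train a small classifier on the encoded features.
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.utils import FeatureSpace

# Toy placeholder rows; the real baseline features and "dropout" label differ.
train_df = pd.DataFrame({
    "age_group": ["18-25", "26-35", "36-50", "26-35"],
    "prior_episodes": [0, 2, 1, 3],
    "severity": [4.0, 7.5, 6.0, 8.2],
    "dropout": [0, 1, 0, 1],
})
labels = train_df.pop("dropout")
train_ds = tf.data.Dataset.from_tensor_slices((dict(train_df), labels)).batch(2)

# Declare how each column should be encoded.
feature_space = FeatureSpace(
    features={
        "age_group": FeatureSpace.string_categorical(num_oov_indices=1),
        "prior_episodes": FeatureSpace.integer_categorical(),
        "severity": FeatureSpace.float_normalized(),
    },
    output_mode="concat",
)
feature_space.adapt(train_ds.map(lambda x, _: x))  # learn vocabularies and stats

# Small dense head on top of the encoded features.
encoded = feature_space.get_encoded_features()
x = keras.layers.Dense(32, activation="relu")(encoded)
x = keras.layers.Dropout(0.5)(x)
output = keras.layers.Dense(1, activation="sigmoid")(x)

model = keras.Model(inputs=encoded, outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[keras.metrics.AUC(name="auc")])
model.fit(train_ds.map(lambda x, y: (feature_space(x), y)), epochs=5)
```

Swapping the dense head for a TabTransformer block is where the sample-size concern comes in, which is why I’m still weighing that option.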
That’s my analysis from the short research I just completed. Hopefully it’s helpful to anyone with similar questions when forming their ML strategy.