What should be the LSTM model architecture in order to forecast disease probability?

I am working on disease (sepsis) forecasting using deep learning (an LSTM). The sepsis data is EHR time-series data, where the target variable is SepsisLabel: 0 represents no sepsis and 1 represents sepsis. Each patient's data is converted to a fixed-length tensor. I want to build an LSTM model that takes these tensors, trains on them, and forecasts the sepsis probability. The threshold is 0.5: patients with probability > 0.5 are classified as sepsis and patients with probability < 0.5 as no sepsis.

The data is in the form of fixed-length tensors (sample not shown).
The model architecture:

# imports (assuming TensorFlow/Keras)
import numpy as np
from tensorflow.keras.layers import Input, Masking, LSTM, BatchNormalization, Dense, TimeDistributed
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import RMSprop

# construct inputs
x = Input((None, x_train.shape[-1]), name='input')
mask = Masking(mask_value=0.0, name='input_masked')(x)  # mask post-padded zero timesteps

# stack LSTMs
lstm_kwargs = {'dropout': 0.20, 'recurrent_dropout': 0.1, 'return_sequences': True, 'implementation': 2}
lstm1 = LSTM(200, name='lstm1', **lstm_kwargs)(mask)
lstm2 = LSTM(200, name='lstm2', **lstm_kwargs)(lstm1)
lstm3 = LSTM(200, name='lstm3', **lstm_kwargs)(lstm2)

btch = BatchNormalization()(lstm3)

dns = Dense(50, name='dense')(btch)  # note: no activation specified, so this is a linear projection

# output: sigmoid layer
output = TimeDistributed(Dense(1, activation='sigmoid'), name='output')(dns)
model = Model(inputs=x, outputs=output)

# compile model
optimizer = RMSprop(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='binary_crossentropy')

# uniform sample weights (currently a no-op)
sw = np.ones(shape=(len(y_train),))

history = model.fit(x_train, y_train, sample_weight=sw, batch_size=128, epochs=500, verbose=1)

What model architecture should I use? What optimizer and loss function should I use? I am considering the architecture above but am unsure about the choice of loss function and optimizer. The model trained with the current architecture gives AUROC = 0.75. How can I achieve a higher AUROC?
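For context, a minimal sketch of how I understand AUROC and the 0.5 threshold interact for per-timestep predictions (the toy arrays below are hypothetical stand-ins for the output of `model.predict` on a held-out set; AUROC itself is computed with scikit-learn's `roc_auc_score` and does not depend on the threshold):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# toy per-timestep probabilities for 2 patients x 3 timesteps
# (stand-in for model.predict(x_val), shape (patients, timesteps, 1))
y_prob = np.array([[[0.1], [0.2], [0.9]],
                   [[0.8], [0.3], [0.7]]])
y_true = np.array([[[0], [0], [1]],
                   [[1], [0], [1]]])

# AUROC is threshold-free: it measures how well probabilities rank
# positive timesteps above negative ones
auroc = roc_auc_score(y_true.ravel(), y_prob.ravel())

# the 0.5 threshold only matters for the final class decision
y_pred = (y_prob.ravel() > 0.5).astype(int)
```

Here every sepsis timestep is ranked above every non-sepsis timestep, so the toy AUROC is 1.0; improving AUROC means improving that ranking, independent of where the decision threshold sits.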

Need suggestions.

Hi, @huzaifa_arshad!

There are always a couple of settings you can play with to improve the metrics:

  1. First of all, figure out where you are failing. Is your model underfitting or overfitting? Are your train metrics higher than, or approximately the same as, your test metrics? If underfitting, try a more complex model, say, with more layers, more neurons, etc.
  2. Binary cross-entropy seems just right for this binary classification task. Regarding the optimizer, I’ve always had good performance with Adam, so it’s my first choice, but RMSprop could work well too.
  3. Transformers have been doing really well with time series. They were introduced for NLP but it might be worth giving them a try.
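For points 1 and 2, here is a minimal sketch (assuming TensorFlow/Keras; the toy data shapes, the tiny LSTM, and the 20% validation split are arbitrary choices for illustration) of compiling with Adam and using a validation split so you can compare train vs. validation curves and diagnose over/underfitting:

```python
import numpy as np
from tensorflow.keras.layers import Input, Masking, LSTM, TimeDistributed, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import AUC

# toy padded sequences: 20 patients, 10 timesteps, 4 features
x_train = np.random.rand(20, 10, 4)
y_train = np.random.randint(0, 2, size=(20, 10, 1)).astype('float32')

inp = Input((None, x_train.shape[-1]))
h = Masking(0.0)(inp)                               # skip zero-padded timesteps
h = LSTM(16, return_sequences=True)(h)
out = TimeDistributed(Dense(1, activation='sigmoid'))(h)
model = Model(inp, out)

model.compile(optimizer=Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=[AUC(name='auc')])

# validation_split holds out the last 20% of samples; comparing
# history.history['auc'] vs. history.history['val_auc'] shows whether
# the model is under- or overfitting
history = model.fit(x_train, y_train, validation_split=0.2,
                    batch_size=8, epochs=2, verbose=0)
```

If train metrics keep improving while validation metrics stall or degrade, you are overfitting (regularize, or get more data); if both stay poor, you are underfitting (add capacity).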

Thanks. I will try the Adam optimizer. Regarding Transformers, I have heard of them but never used them; I will explore them. Thanks 🙂