I am working on disease (sepsis) forecasting using deep learning (LSTM). The sepsis data is EHR time-series data, where the target variable is SepsisLabel: 0 represents no sepsis and 1 represents sepsis. Each patient's data is converted to a fixed-length tensor. I want to build an LSTM model that trains on these tensors and forecasts the sepsis probability. The threshold is 0.5: patients with probability > 0.5 are classified as sepsis, and patients with probability < 0.5 as no sepsis.
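The thresholding rule described above can be sketched as follows. This is a minimal illustration rather than part of the original pipeline: the helper name `classify` and the example probabilities are hypothetical, and because only the > 0.5 and < 0.5 cases are defined, the sketch assumes a probability of exactly 0.5 maps to no sepsis.

```python
import numpy as np

THRESHOLD = 0.5  # decision threshold stated in the question


def classify(probabilities, threshold=THRESHOLD):
    """Return 1 (sepsis) where probability > threshold, else 0 (no sepsis)."""
    probabilities = np.asarray(probabilities, dtype=float)
    # Assumption: a probability exactly equal to the threshold maps to 0,
    # since only the > 0.5 and < 0.5 cases are defined in the text.
    return (probabilities > threshold).astype(int)


# Illustrative probabilities, not real model output
example_probs = np.array([0.10, 0.49, 0.50, 0.51, 0.93])
example_labels = classify(example_probs)
print(example_labels)
```

The same function works on arrays of any shape, so it could also be applied to per-timestep probabilities of shape (patients, timesteps, 1).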
The data in the form of tensors:
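For readers who want to reproduce the input format, here is a minimal sketch of how variable-length patient records might be post-padded with zeros into one fixed-length tensor. It is an assumption about the preprocessing, not the actual code or data used here: the helper `pad_patients`, the feature count of 4, and the random values are all hypothetical.

```python
import numpy as np


def pad_patients(records, max_len, pad_value=0.0):
    """Post-pad (or truncate) each (timesteps, features) array to max_len rows."""
    n_features = records[0].shape[1]
    batch = np.full((len(records), max_len, n_features), pad_value, dtype=np.float32)
    for i, rec in enumerate(records):
        length = min(len(rec), max_len)
        # Real measurements go first; remaining rows keep the pad value
        batch[i, :length] = rec[:length]
    return batch


# Synthetic stand-in for three patients with 3, 5, and 2 hourly records
rng = np.random.default_rng(0)
patients = [rng.normal(size=(t, 4)) for t in (3, 5, 2)]
x_demo = pad_patients(patients, max_len=5)
print(x_demo.shape)  # (3, 5, 4)
```

Note that a Keras `Masking` layer with mask value 0 skips a timestep only when every feature at that timestep equals 0, so a genuine all-zero measurement row would be masked as well.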
The model architecture:
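# Assumption: tf.keras (TensorFlow 2.x); adjust the imports for standalone Keras
import numpy as np
from tensorflow.keras.layers import Input, Masking, LSTM, BatchNormalization, Dense, TimeDistributed
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import RMSprop

# construct inputs: variable number of timesteps, x_train.shape[-1] features per step
x = Input((None, x_train.shape[-1]), name='input')
mask = Masking(0, name='input_masked')(x)  # Masking layer because data is post-padded with zeros
# stack LSTMs (return_sequences=True keeps one output per timestep)
lstm_kwargs = {'dropout': 0.20, 'recurrent_dropout': 0.1, 'return_sequences': True, 'implementation': 2}
lstm1 = LSTM(200, name='lstm1', **lstm_kwargs)(mask)
lstm2 = LSTM(200, name='lstm2', **lstm_kwargs)(lstm1)
lstm3 = LSTM(200, name='lstm3', **lstm_kwargs)(lstm2)
btch = BatchNormalization()(lstm3)
dns = Dense(50, name='Dense')(btch)  # no activation given, so this layer is linear
# output: sigmoid layer, one probability per timestep
output = TimeDistributed(Dense(1, activation='sigmoid'), name='output')(dns)
model = Model(inputs=x, outputs=output)
# compile model
optimizer = RMSprop(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='binary_crossentropy')
# uniform sample weights (all ones), equivalent to no weighting
sw = np.ones(shape=(len(y_train),))
history = model.fit(x_train, y_train, sample_weight=sw, batch_size=128, epochs=500, verbose=1)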
What model architecture, optimizer, and loss function should I use? I am considering the architecture above but am unsure about the choice of loss function and optimizer. The model trained with the current architecture gives an AUROC of 0.75. How can I achieve a higher AUROC?
Any suggestions are welcome.
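For reference, a minimal sketch of how an AUROC figure can be computed from labels and predicted probabilities. It uses the pairwise definition (the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half). The helper name `auroc` and the example arrays are hypothetical and are not the actual evaluation code or data.

```python
import numpy as np


def auroc(y_true, y_score):
    """AUROC as the probability that a positive outranks a negative (ties = 0.5)."""
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true]
    neg = y_score[~y_true]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    # Compare every positive score against every negative score
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


# Illustrative labels and probabilities, not real model output
y_demo = np.array([0, 0, 1, 1])
p_demo = np.array([0.1, 0.4, 0.35, 0.8])
print(auroc(y_demo, p_demo))  # 0.75
```

Because AUROC depends only on how the scores rank patients, it does not change with the 0.5 decision threshold. The pairwise comparison needs memory proportional to positives times negatives, so for a large dataset a library routine such as scikit-learn's `roc_auc_score` is likely more practical.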