I have the following four curves and am looking for advice on which one is the best. They all give a similar accuracy of about 50-53% (yes, I know that is not very high).
1. https://github.com/bluetail14/MyCourserapractice/blob/main/lstm_3264.jpg
2. https://github.com/bluetail14/MyCourserapractice/blob/main/lstm_6432.jpg
3. https://github.com/bluetail14/MyCourserapractice/blob/main/epochs_120.jpg
4. https://github.com/bluetail14/MyCourserapractice/blob/main/dropout_0.4.jpg
My guess is that no. 4 is the best because the loss seems to be decreasing.
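One way to make this comparison less dependent on eyeballing the plots is to rank the runs by their lowest validation loss. A minimal sketch, assuming each run's `History.history` dict from `model.fit` was kept and that validation data was passed so a `val_loss` key exists. The run names and numbers below are made up for illustration and are not read from the plots above:

```python
# Hypothetical per-epoch histories in the shape Keras stores in History.history.
# The values are placeholders, not results from the runs linked above.
histories = {
    "lstm_32_64": {"loss": [0.70, 0.69, 0.69], "val_loss": [0.70, 0.70, 0.71]},
    "dropout_0.4": {"loss": [0.70, 0.68, 0.66], "val_loss": [0.69, 0.68, 0.69]},
}


def best_epoch(history):
    """Return (epoch_index, val_loss) for the epoch with the lowest validation loss."""
    val_loss = history["val_loss"]
    epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    return epoch, val_loss[epoch]


def rank_runs(histories):
    """Sort run names by their best validation loss, lowest first."""
    return sorted(histories, key=lambda name: best_epoch(histories[name])[1])


for name in rank_runs(histories):
    epoch, loss = best_epoch(histories[name])
    print(f"{name}: best val_loss={loss:.3f} at epoch {epoch + 1}")
```

A training loss that keeps falling while `val_loss` rises would point to overfitting rather than to a better model.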
I have tried changing the layer dimensions, adding and removing dense and dropout layers, and varying maxlen and the embeddings, but I don't see any improvement.
As a beginner, I would appreciate any suggestions! Thank you.
I used the following LSTM architecture on a movie reviews dataset from here (Sentiment Analysis).
import tensorflow as tf

# Parameters
EMBEDDING_DIM = 100  # from 100d.glove.6B.100d.txt
MAXLEN = 500
VOCAB_SIZE = 33713
DENSE1_DIM = 64
DENSE2_DIM = 32
LSTM1_DIM = 64  # also tried 32
LSTM2_DIM = 32  # also tried 64
FILTERS = 64
KERNEL_SIZE = 5

# Model Definition
# EMBEDDINGS_MATRIX is the pre-built GloVe matrix of shape (VOCAB_SIZE+1, EMBEDDING_DIM)
model_lstm = tf.keras.Sequential([
    # Frozen pretrained embeddings
    tf.keras.layers.Embedding(VOCAB_SIZE+1, EMBEDDING_DIM, input_length=MAXLEN,
                              weights=[EMBEDDINGS_MATRIX], trainable=False),
    # First Bi-LSTM returns the full sequence so the second one can consume it
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(LSTM1_DIM, dropout=0.2, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(LSTM2_DIM, dropout=0.2)),
    tf.keras.layers.Dense(DENSE1_DIM, activation='relu'),
    tf.keras.layers.Dense(DENSE2_DIM, activation='relu'),
    # Single sigmoid unit for binary sentiment
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Set the training parameters
model_lstm.compile(loss='binary_crossentropy',
                   optimizer=tf.keras.optimizers.Adam(),
                   metrics=[tf.keras.metrics.BinaryAccuracy()])

# Print the model summary
model_lstm.summary()

Layer (type)                       Output Shape         Param #
=================================================================
embedding_37 (Embedding)           (None, 500, 100)     3371400
bidirectional_61 (Bidirectional)   (None, 500, 128)     84480
bidirectional_62 (Bidirectional)   (None, 64)           41216
dense_109 (Dense)                  (None, 64)           4160
dense_110 (Dense)                  (None, 32)           2080
dense_111 (Dense)                  (None, 1)            33
=================================================================
Total params: 3,503,369
Trainable params: 131,969
Non-trainable params: 3,371,400
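
The parameter counts in the summary can be reproduced by hand, which is a quick check that the layers are wired the way the code intends. A sketch using the standard formulas (an LSTM has four gates, each with input weights, recurrent weights and a bias, and `Bidirectional` doubles the count). The helper function names are made up for this illustration:

```python
def embedding_params(rows, dim):
    # One dim-sized vector per vocabulary row
    return rows * dim


def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (input_dim + units + 1) * units


def bidirectional_lstm_params(input_dim, units):
    # Forward and backward LSTMs have separate weights
    return 2 * lstm_params(input_dim, units)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per unit
    return input_dim * units + units


embedding = embedding_params(33713 + 1, 100)
bi_lstm1 = bidirectional_lstm_params(100, 64)      # input: 100-d embeddings
bi_lstm2 = bidirectional_lstm_params(2 * 64, 32)   # input: 128-d Bi-LSTM output
dense1 = dense_params(2 * 32, 64)                  # input: 64-d Bi-LSTM output
dense2 = dense_params(64, 32)
output = dense_params(32, 1)

trainable = bi_lstm1 + bi_lstm2 + dense1 + dense2 + output
total = embedding + trainable
print(embedding, bi_lstm1, bi_lstm2, dense1, dense2, output)
print("trainable:", trainable, "total:", total)
```

These values match the summary, so the frozen embedding accounts for all 3,371,400 non-trainable parameters.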
P.S. A simple Naive Bayes classifier gives 84% accuracy on this dataset!
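
For context, a Naive Bayes baseline of this kind is a bag-of-words classifier that ignores word order. A toy sketch in plain Python, assuming whitespace tokenisation and add-one smoothing. The reviews are made up, and this is not the code or data behind the 84% figure:

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model on whitespace-tokenised documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up toy reviews; 1 = positive, 0 = negative.
train_docs = [
    "a great and moving film",
    "great acting great story",
    "a boring and terrible film",
    "terrible plot boring acting",
]
train_labels = [1, 1, 0, 0]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "great story"))
print(predict(model, "boring plot"))
```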