In this week's assignment, I have used:
Conv1D(128, 6, activation='relu'),
GlobalAveragePooling1D(),
three Dense() layers, e.g. Dense(128, activation='relu') and Dense(64, activation='relu'),
and two Dropout(0.2) layers.
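Put together, the model looks roughly like this (the Embedding input and the exact Dropout positions here are just my sketch, not verbatim from my notebook):

import tensorflow as tf

vocab_size = 10000  # placeholder; use the tokenizer's vocabulary size
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),  # assumed input embedding
    tf.keras.layers.Conv1D(128, 6, activation='relu'),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation='sigmoid')  # assumed binary output head
])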
I am getting very erratic validation curves; see the picture here.
How can I smooth them out to look more like the examples? What do I need to change?
Thank you.
Which assignment is this for?
The Week 3 assignment, C3W3_Assignment.
I think you should be using LSTMs with dropout. The great thing about LSTM is that you can add dropout right in the layer's parameters instead of adding a whole separate layer. You are probably using too many layers as well; two LSTM layers with dropout plus two Dense layers should be plenty. Just remember to set return_sequences=True on the first LSTM so it feeds a full sequence to the second one. If you have everything else in place it should run fine. Tell me how it goes. Good luck!
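Something along these lines, as a sketch (the layer sizes are just examples, and I'm assuming the usual Embedding layer in front):

import tensorflow as tf

vocab_size = 10000  # placeholder; match your tokenizer
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),  # assumed text input
    tf.keras.layers.LSTM(64, dropout=0.2, return_sequences=True),  # passes the full sequence on
    tf.keras.layers.LSTM(32, dropout=0.2),  # final LSTM returns a single vector
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])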
I have tried two LSTM layers and two Dense layers:
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, dropout=0.2, return_sequences=True)),
# tf.keras.layers.LSTM(32),
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
tf.keras.layers.Dense(8, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
It still looks jagged, nothing like the examples.
Did you set up an embedding? Mine had one Embedding layer, one LSTM, one Dropout, and two Dense layers.
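For reference, my stack was roughly this (the unit sizes here are placeholders, not the exact ones I used):

import tensorflow as tf

vocab_size = 10000  # placeholder; take this from your tokenizer
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dropout(0.2),  # standalone dropout layer
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])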
Yes, it works now. Yet my validation loss curve still looks jagged.
What does yours look like? Does it look like any of the examples?
Hey bluetail!
I might try using a slightly simpler model architecture here, and avoid using the built-in dropout in the LSTM layer.
I was able to get suitable results with that approach. I hope that helps you!
Thanks,
Chris
Jagged wouldn't be my first concern. Flat would be, by which I mean it doesn't look like the training loss decreases, and that suggests underfitting or not enough learning. If your validation loss were smooth but the model weren't learning, that would still be undesirable. So I would focus on the model and the training accuracy/loss first. See if @CSAlexiuk's advice helps you there.
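To compare the two curves directly, you can plot them from the History object that model.fit returns; a minimal sketch, assuming model, train_data, and val_data already exist in your notebook:

import matplotlib.pyplot as plt

# model, train_data, and val_data are assumed to come from your own notebook
history = model.fit(train_data, validation_data=val_data, epochs=10)

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()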