Struggling with Passing C3W3 Assignment

GitHub link to the course’s assignment: tensorflow-1-public/C3/W3/assignment/C3W3_Assignment.ipynb at main · https-deeplearning-ai/tensorflow-1-public · GitHub

I have been struggling with this assignment. My validation loss keeps increasing, and I can’t find a way to make it plateau or decrease. I feel like I’m not understanding how to create the layers in my model, and which combination would work best. Is this just a matter of trial and error, or is there a systematic way of doing it? I’m asking this because the model takes over 1.5 hours to train, and if it’s trial and error, I will not be able to finish the assignment at all.

Here are some details of my model without going into the code:

  • A bidirectional LSTM layer
  • A one-dimensional convolution layer
  • A max pooling layer
  • A dropout layer with rate 0.5
  • A dense layer
  • A dropout layer with rate 0.5
  • A final dense layer with 1 neuron
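
For readers who want to see the list above concretely, here is a rough Keras sketch of that stack. It is not the code from my notebook: the embedding layer in front, all layer sizes, the kernel size, the choice of global max pooling, and the sigmoid output are assumptions made only so the example runs.

```python
import tensorflow as tf

# Placeholder sizes (assumptions, not the assignment's actual values).
VOCAB_SIZE = 1000
EMBEDDING_DIM = 16
MAX_LENGTH = 16

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LENGTH,)),
    # Assumed: token ids are embedded before the recurrent layer.
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    # return_sequences=True keeps the time axis so Conv1D can follow.
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(32, return_sequences=True)
    ),
    tf.keras.layers.Conv1D(64, 5, activation="relu"),
    # Assumed pooling variant: collapses the time axis to one vector.
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    # One output neuron; sigmoid assumes a binary label.
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]
)
model.summary()
```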

For some context, here is a summary of the training:

And here are the loss and validation_loss:

Here is the slope of my validation_loss:
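
One way to get such a slope is a least-squares line fit over the per-epoch validation losses. The sketch below uses `np.polyfit` and made-up loss values. The assignment's own grading code may compute the slope differently, and in a real run the list would come from `history.history["val_loss"]`.

```python
import numpy as np

# Hypothetical per-epoch validation losses (not my actual results).
val_loss = [0.50, 0.52, 0.55, 0.59, 0.64]
epochs = np.arange(len(val_loss))

# Fit a straight line val_loss ~ slope * epoch + intercept.
slope, intercept = np.polyfit(epochs, val_loss, 1)

# A positive slope means validation loss is still rising on average.
print(f"val_loss slope per epoch: {slope:.5f}")
```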

I would appreciate any tips regarding how I can understand how to better tweak my model. Thank you!

Does this help?

Do use a GPU for training.
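
A quick way to check whether TensorFlow actually sees a GPU in the current runtime (a minimal sketch; how you enable a GPU depends on your environment):

```python
import tensorflow as tf

# Lists the GPUs visible to TensorFlow; an empty list means CPU only.
gpus = tf.config.list_physical_devices("GPU")
print(f"GPUs visible to TensorFlow: {len(gpus)}")
```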

Yeah, I was able to get it working. Thanks!

Note on how I fixed it:
By going back to the lectures and researching a bit, I found that convolution layers usually add more complexity to the model, which showed in how long the model took to train. More complex models are also more prone to overfitting. So the fix was to simplify the model by removing the convolution layer from the sequence described above. That also cut the training time considerably, and the model passed the grading criteria.
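
As a rough illustration of the simplified stack (again a sketch, not my notebook code: the embedding layer, the layer sizes, the global max pooling, and the sigmoid output are assumptions):

```python
import tensorflow as tf

# Placeholder sizes (assumptions, not the assignment's actual values).
VOCAB_SIZE = 1000
EMBEDDING_DIM = 16
MAX_LENGTH = 16

# Same sequence as before, with the Conv1D layer removed.
simple_model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LENGTH,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(32, return_sequences=True)
    ),
    # Assumed pooling variant: collapses the time axis to one vector.
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

simple_model.compile(
    loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]
)
# The trainable-parameter count is one simple proxy for model complexity.
print("parameters:", simple_model.count_params())
```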