C3W3 Assignment - Can't Get Validation Loss Flat or Declining

I am having trouble completing the Week 3 Graded assignment.

I have tried several different architectures for the RNN. I’ve added dropout layers, tried two bidirectional LSTMs, and tried a simple convolution and a GRU. My validation loss always increases no matter what.

I don’t know what else to try, considering that the validation loss also increases in all of the examples for the different types of layers.

What hyperparameters should I focus on?

Hello Patrick,

Can I know what kind of trouble you are having?

Can you explain your model without sharing any code?

Kindly explain whether you have used all the points mentioned in the image, and with what parameters.

An increase in validation loss indicates you need to look into your model as well as parse_data and the train-val split.

Please do not post any code here. Only share your output alongside the expected output, so we can both learn and discuss how to improve your model.

Regards
DP

Hi.

I actually figured out the issue. Thanks!!

Great! Do let everyone know how you debugged your issue; it can help future learners facing a similar issue.

Happy Learning!!

Regards
DP

Hi,

Are we allowed to edit the train-val split?

And the hyperparameters? Since the assignment says that we are “welcome to change these after submitting”, I thought we had to reach the slope only via the model design.

Could you confirm that? :smiley:

Best,
Kalle

Hello @Karl-Heinz_Wallwitz,

Are you asking about editing the split for your own personal practice or for the assignment?

Regards
DP

For the assignment.

Hello @Karl-Heinz_Wallwitz,

You can edit anything between the ##START AND END CODE HERE markers in the train-val split grader cell, but you need to make sure you still get the expected output shown in the assignment.

Regards
DP

Thanks for the really fast reply again.

I thought so. So the numbers have to stay as they are. I guess the solution is within the model then. :slight_smile:

Best,
Kalle

Hello @Karl-Heinz_Wallwitz,

I cannot give you the solution, as I do not know what the problem with your model or assignment is. For a better suggestion or response, kindly let me know where you are stuck or whether you are getting any error.

Regards
DP

In case you are not getting the expected output, the parameters you can change to see different results are your loss and optimizer. Also check the previous comment I shared, which mentions all the instructions to follow for model creation.

Review whether you have passed all the previous tests and matched the expected output.

If you are still stuck, let me know.

Regards
DP

I am just stuck with getting an acceptable slope. I tried many different layers and architectures.

Also: if I get frustrated, I’ll write here again. Thanks for your help :slight_smile:

@Karl-Heinz_Wallwitz

Can you share an image of the slope you are getting?

I’ll do that once my next attempt has run through. I usually end up at 0.6. The first 10 epochs are often relatively constant, and after that the model starts to overfit.

One problem is that, even if I use small nets, it takes really long to get to the point where the overfitting starts. Therefore I almost never let it run through to the end, so I could not plot it. :wink:
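
For anyone who wants to check the trend numerically rather than by eye, here is a minimal sketch. It assumes that the "slope" is the slope of a least-squares line fitted to the validation loss against the epoch index; the thread does not say how the assignment actually measures it. The function name `loss_slope` and the sample numbers are made up for illustration.

```python
def loss_slope(val_loss):
    """Slope of a least-squares line through (epoch index, validation loss).

    Assumption: this is only one plausible definition of "slope";
    the assignment's own check may differ.
    """
    n = len(val_loss)
    if n < 2:
        raise ValueError("need at least two epochs to fit a line")
    epochs = range(n)
    mean_x = sum(epochs) / n
    mean_y = sum(val_loss) / n
    # Covariance of epoch and loss divided by the variance of epoch.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(epochs, val_loss))
    var = sum((x - mean_x) ** 2 for x in epochs)
    return cov / var


# Hypothetical loss curves, not taken from any real run.
rising = [0.50, 0.52, 0.55, 0.60, 0.66]
falling = [0.60, 0.55, 0.52, 0.50, 0.49]
print(loss_slope(rising) > 0)   # positive slope: validation loss is climbing
print(loss_slope(falling) < 0)  # negative slope: validation loss is declining
```

A positive value over the later epochs is the pattern described above, where the loss stays flat for a while and then starts to rise.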

Got it. My first attempts were just too complicated. :smiley:

Let everyone know what changes you made to get the expected slope, so other learners can also learn.

Regards
DP

By too complicated, I mean that I had too many layers with too many units. Stepping down and checking helped me figure it out.

Sorry for the delay. The key to solving this for me was simplifying my model by doing the following:

  • I used only one LSTM layer
  • I used one max-pooling layer
  • I increased the rate of my dropout layer to prevent overfitting
  • I also used a dropout layer between my two Dense layers, with the same rate of 0.5

The key here was to keep the model simple. Complicated models can lead to overfitting.
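
To make the shape of such a simplified model concrete, here is a rough Keras sketch. It is not the assignment solution. The vocabulary size, embedding dimension, sequence length, unit counts, layer order, loss and optimizer are all assumptions picked for illustration; only "one LSTM, one max-pooling layer, two Dense layers, dropout at 0.5" comes from the list above.

```python
import tensorflow as tf

# Hypothetical settings; the assignment's real values are not given in this thread.
VOCAB_SIZE = 1000
EMBEDDING_DIM = 16
MAX_LEN = 16


def build_simple_model():
    """A deliberately small text classifier along the lines described above."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(MAX_LEN,)),
        tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
        # A single LSTM layer; return_sequences=True so pooling sees every timestep.
        tf.keras.layers.LSTM(32, return_sequences=True),
        # One max-pooling layer (global pooling is an assumption).
        tf.keras.layers.GlobalMaxPooling1D(),
        # Dropout at 0.5 before the dense layers (placement is an assumption).
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(16, activation="relu"),
        # Dropout at 0.5 between the two Dense layers, as in the list above.
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model


model = build_simple_model()
model.summary()
```

Starting from something this small and adding capacity only when the validation loss allows it matches the "stepping down and checking" approach mentioned earlier in the thread.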