C3W3 Assignment - Can't Get Validation Loss Flat or Declining

Patrick_Bentley · December 26, 2023, 5:12pm

I am having trouble completing the Week 3 Graded assignment.

I have tried several different architectures for the RNN. I’ve added dropout layers, tried two bidirectional LSTMs, and tried a simple convolution and a GRU. My validation loss always increases no matter what.

I don’t know what else to try, considering that in all of the examples for the different layer types the validation loss also increases.

Which hyperparameters should I focus on?

Deepti_Prasad · December 26, 2023, 5:48pm

Hello Patrick,

Can I know what kind of trouble you are having?

Can you explain your model without sharing any code?

Kindly explain whether you followed all the points mentioned in the image, and with what parameters.

An increase in validation loss indicates that you need to look into your model as well as parse_data and the train-val split.

Please do not post any code here. Only share your output alongside the expected output, so we can both learn and discuss how to improve your model.

Regards
DP

Patrick_Bentley · January 7, 2024, 11:37pm

Hi.

I actually figured out the issue. Thanks!!

Deepti_Prasad · January 8, 2024, 6:51am

Great! Do let everyone know how you debugged your issue; it can help future learners facing a similar issue.

Happy Learning!!

Regards
DP

Karl-Heinz_Wallwitz · January 17, 2024, 1:15pm

Hi,

Are we allowed to edit the train-val split?

And the hyperparameters? Since the assignment says that we are “welcome to change these after submitting”, I thought we had to reach the slope only via the model design.

Could you confirm that?

Best,
Kalle

Deepti_Prasad · January 17, 2024, 1:18pm

Hello @Karl-Heinz_Wallwitz,

Are you asking about editing the split for your own personal practice or for the assignment?

Regards
DP

Karl-Heinz_Wallwitz · January 17, 2024, 1:20pm

For the assignment.

Deepti_Prasad · January 17, 2024, 1:28pm

Hello @Karl-Heinz_Wallwitz,

You can edit anything between the ##START and ##END CODE HERE markers in the train-val split grader cell, but you need to make sure you still get the expected output shown in the assignment.

Regards
DP

Karl-Heinz_Wallwitz · January 17, 2024, 1:31pm

Thanks for the really fast reply again.

I thought so. So the numbers have to stay as they are. I guess the solution lies within the model then.

Best,
Kalle

Deepti_Prasad · January 17, 2024, 1:34pm

Hello @Karl-Heinz_Wallwitz,

I cannot state the solution, as I do not know what the problem is with your model or assignment. So for a better suggestion or response, kindly let me know where you are stuck or whether you are getting any error.

Regards
DP

Deepti_Prasad · January 17, 2024, 1:45pm

In case you are not getting the expected output, the parameters you can change to see different results are your loss or optimizer. Also check the previous comment I shared, which mentions all the instructions to follow for model creation.

Also review whether you have passed all the previous tests and matched the expected output.

If you are still stuck, let me know.

Regards
DP

Karl-Heinz_Wallwitz · January 17, 2024, 1:47pm

I am just stuck with getting an acceptable slope. I tried many different layers and architectures.

Also: if I get frustrated, I’ll write here again. Thanks for your help.

Deepti_Prasad · January 17, 2024, 1:49pm

@Karl-Heinz_Wallwitz

Can you show an image of the slope you are getting?

Karl-Heinz_Wallwitz · January 17, 2024, 1:52pm

I’ll do that once my next try has run through. I usually end up at 0.6. The first 10 epochs are often relatively constant, and after that it starts to overfit.

One problem is that even if I use small nets, it takes a really long time to get to the point where the overfitting starts. Therefore I almost never let it run through so that I could plot it.
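
A minimal sketch of how one could check whether the validation loss is flat, declining or rising without reading it off a plot: fit a straight line to the per-epoch validation losses and look at the sign of the slope. The loss values and the helper name `loss_slope` are made up for illustration, and the assignment's own slope check and threshold may differ.

```python
def loss_slope(losses):
    """Least-squares slope of the losses against epoch index 0, 1, 2, ..."""
    n = len(losses)
    if n < 2:
        raise ValueError("need at least two epochs to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(losses) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(losses))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var

# Hypothetical validation losses: flat at first, then rising (overfitting).
overfit_val_loss = [0.60, 0.60, 0.61, 0.63, 0.66, 0.70]
# Hypothetical validation losses from a simpler model: slowly declining.
healthy_val_loss = [0.60, 0.58, 0.57, 0.56, 0.56, 0.55]

print(loss_slope(overfit_val_loss))  # positive: loss is trending up
print(loss_slope(healthy_val_loss))  # negative: loss is trending down
```

In a Keras workflow the list would typically come from `history.history["val_loss"]` after training.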

Karl-Heinz_Wallwitz · January 17, 2024, 2:18pm

Got it. My first attempts were just too complicated.

Deepti_Prasad · January 17, 2024, 2:21pm

Let everyone know what changes you made to get the expected output of slope, so other learners could also learn.

Regards
DP

Karl-Heinz_Wallwitz · January 18, 2024, 4:03pm

By too complicated, I mean that I had too many layers with too many units. Stepping down and checking helped me figure it out.
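
To give a rough sense of how quickly layers and units add up, here is a small sketch that counts the trainable weights of a standard LSTM layer (four gates, each with input weights, recurrent weights and a bias). The embedding size and the two layer configurations are assumptions chosen for illustration, not the ones used in the thread.

```python
def lstm_params(input_dim, units):
    """Trainable weights in one standard LSTM layer (4 gates, with biases)."""
    return 4 * (units * (input_dim + units) + units)

def bidirectional_lstm_params(input_dim, units):
    """A bidirectional wrapper trains one LSTM per direction."""
    return 2 * lstm_params(input_dim, units)

EMBEDDING_DIM = 100  # assumed size of the embedding vectors feeding the LSTM

# Hypothetical "too complicated" stack: two bidirectional LSTMs of 128 units.
# The second layer sees the concatenated outputs (2 * 128) of the first.
big = (bidirectional_lstm_params(EMBEDDING_DIM, 128)
       + bidirectional_lstm_params(2 * 128, 128))

# Hypothetical simpler alternative: a single LSTM of 32 units.
small = lstm_params(EMBEDDING_DIM, 32)

print(big, small)  # 628736 17024
```

With these assumed sizes the stacked model has roughly 37 times as many recurrent weights to fit, which leaves much more room to memorize the training set.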

Patrick_Bentley · January 18, 2024, 8:58pm

Sorry for the delay. The key to solving this for me was simplifying my model by doing the following:

I used only one LSTM layer
I used one MaxPooling layer
I increased the rate of my dropout layer to prevent overfitting
I also used a dropout layer between my two Dense layers, with the same rate of 0.5

The key here was to keep the model simple. Complicated models can lead to overfitting.

AchintyaGahalaut · August 13, 2024, 10:34pm

When you say you used a Maxpooling layer, is that after using a convolution? I did not know you can use maxpooling layers without using convolutions. Thanks!
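
On the general question: max pooling only needs a sequence of feature vectors, so it does not depend on a convolution coming first. In Keras, `MaxPooling1D` and `GlobalMaxPooling1D` accept any 3D (batch, steps, features) input, so they can follow an embedding layer or a recurrent layer that returns sequences. Whether that is how the layer was placed in the model described above is not stated in the thread. The plain-Python sketch below, with made-up numbers and hypothetical helper names, shows what the operation does to a single sequence.

```python
def max_pool_1d(sequence, pool_size=2):
    """Max over non-overlapping windows along the time axis, per feature."""
    pooled = []
    for start in range(0, len(sequence) - pool_size + 1, pool_size):
        window = sequence[start:start + pool_size]
        pooled.append([max(column) for column in zip(*window)])
    return pooled

def global_max_pool_1d(sequence):
    """Max over the whole time axis, per feature."""
    return [max(column) for column in zip(*sequence)]

# Hypothetical output of a recurrent layer: 4 timesteps, 3 features each.
rnn_output = [
    [0.1, 0.9, 0.3],
    [0.4, 0.2, 0.8],
    [0.7, 0.5, 0.6],
    [0.2, 0.3, 0.1],
]

print(max_pool_1d(rnn_output))         # [[0.4, 0.9, 0.8], [0.7, 0.5, 0.6]]
print(global_max_pool_1d(rnn_output))  # [0.7, 0.9, 0.8]
```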
