C3W3_Assignment High training/validation accuracy after one epoch

Shahd_Al_Hares · December 26, 2023, 1:10am

Hi everyone,

I’m not sure if I built the perfect model for this assignment, but I am getting around 100% accuracy after only one epoch for both the training and validation sets. My loss and validation loss are also very small at the beginning of training.

Please take a look at the following plots. Am I missing something?

[Image: plots of training/validation accuracy and loss]

TMosh · December 26, 2023, 1:18am

I’ve seen this issue before, but don’t recall the cause (I’m not a mentor for that course).

Maybe you can read back through the thread history for this course forum area. Or use the forum Search tool for “accuracy”.

Deepti_Prasad · December 26, 2023, 5:44pm

So what does the graph indicate, overfitting? The assignment name also suggests it: Exploring Overfitting in NLP.

Things to look into:

Check that your code for parsing the data from the file follows the instructions below correctly:
csv.reader returns an iterable that returns each row in every iteration. So the label can be accessed via row[0] and the text via row[5].
The labels are originally encoded as strings (‘0’ representing negative and ‘4’ representing positive). You need to change this so that the labels are integers and 0 is used for representing negative, while 1 should represent positive.
Next, in the training/validation split, check that the length of the sentences is defined correctly, so that the value is an integer.
Can you explain, based on the screenshot below, how you created the model? Do not share code. You could just explain how many layers you used, which activations you used, how many Dense layers, etc. Also, in model.compile, which loss, optimizer and accuracy metric did you use?
Another important point for this model is the line below:
This is how you need to set the Embedding layer when using pre-trained embeddings
What vocab size and weights did you use?
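
As a minimal sketch of the parsing step described above: the two sample rows and the name parse_rows are made up for illustration, and the assignment's own function and data file are not shown in this thread. The only facts taken from the instructions are that the label sits in row[0], the text in row[5], and that '0' maps to 0 and '4' maps to 1.

```python
import csv
import io

# Hypothetical two-row sample in a six-column layout
# (label in column 0, text in column 5); not the real data file.
SAMPLE = (
    '"0","1","Mon Apr 06","NO_QUERY","user_a","feeling sad today"\n'
    '"4","2","Mon Apr 06","NO_QUERY","user_b","what a great day"\n'
)

def parse_rows(file_obj):
    """Return (sentences, labels) with labels mapped '0' -> 0 and '4' -> 1."""
    sentences, labels = [], []
    for row in csv.reader(file_obj, delimiter=","):
        # csv.reader yields every field as a string, so compare with "0".
        labels.append(0 if row[0] == "0" else 1)
        sentences.append(row[5])
    return sentences, labels

sentences, labels = parse_rows(io.StringIO(SAMPLE))
print(labels)
```

The point to notice is that the comparison is against the string "0", because every field that csv.reader returns is a string.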

Regards
DP

Shahd_Al_Hares · December 26, 2023, 10:25pm

I’m not sure if this is overfitting? I recall that overfitting occurs on the training set but in my case it’s overfitting on both sets?

This part of my code is correct. The test function works and I can see the expected output.
The variable train_size is an integer. Also here, the test function works.
I tried two different architectures:
- The first one is a model with Embedding, Conv1D, GlobalMaxPooling1D, and two Dense layers with a Dropout layer between them.
- The second one is a model with Embedding, Dropout, Bidirectional LSTM and two Dense layers.
  The activation of the output layer is sigmoid, all other layers have a ReLU activation function. The loss function is binary_crossentropy and the optimizer is adam.
The Embedding layer is already provided and I haven’t changed the code here. It has the following arguments:
- Input dimension is vocabulary size +1
- Output dimension is embedding dimension
- Input length is maximum length of all sequences
- The weights are equal to the embedding matrix
- trainable is set to false

I’ve already submitted my code and it did pass

Deepti_Prasad · December 27, 2023, 8:54am

OK, great. If you have already solved it, then kindly close the thread by choosing the comment that solved your issue or by explaining how you solved the issue yourself.

Shahd_Al_Hares · December 27, 2023, 2:39pm

I meant that my code, which produces these two graphs, did pass with 100/100 after my submission. The slope of the val_loss is 0 and that’s why all tests are fine. But I still think that the trained model is not correct.

Deepti_Prasad · December 27, 2023, 2:47pm

this is correct

This is correct, but how many units did you use in the last two Dense layers?

The second one is a model with Embedding, Dropout, Bidirectional LSTM and two Dense layers.
The activation of the output layer is sigmoid, all other layers have a ReLU activation function. The loss function is binary_crossentropy and the optimizer is Adam.

In model.compile, which accuracy metric did you use?

Also, did you notice the statement below, which appears after model training?

To pass this assignment your val_loss (validation loss) should either be flat or decreasing.

Although a flat val_loss and a lowering train_loss (or just loss) also indicate some overfitting, what you really want to avoid is having a lowering train_loss and an increasing val_loss.

Probably that’s why you cleared the assignment submission.
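
To make "flat or decreasing" concrete, here is a small sketch that fits a straight line to a val_loss curve and looks at the sign of its slope. It assumes a least-squares slope as the measure; how the grader actually tests val_loss is not shown in this thread, and the function name and the loss values are made up for illustration.

```python
def least_squares_slope(values):
    """Slope of the best-fit line through (epoch_index, value) points."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two epochs")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator

# Made-up loss curves, not taken from the thread.
flat_val_loss = [0.50, 0.50, 0.50, 0.50]
rising_val_loss = [0.50, 0.55, 0.62, 0.70]

print(least_squares_slope(flat_val_loss))    # zero slope: flat
print(least_squares_slope(rising_val_loss))  # positive slope: val_loss is increasing
```

A slope of zero only says the curve is flat. It says nothing about whether the model learned anything useful, so a degenerate model can still produce it.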

Explore your model with this pointer,

Try simpler architectures first to avoid long training times. Architectures that are able to solve this problem usually have around 3-4 layers (excluding the last two Dense ones)

Regards
DP

Shahd_Al_Hares · December 29, 2023, 11:41pm

I found the mistake in my code. When changing the labels from 0 or 4 to 0 or 1, I was checking for integer values instead of strings. As a result, all my labels ended up with the value 1. Now I check for string values and the results seem to be reasonable.

Thank you!
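
For anyone who runs into the same symptom, the mistake described above can be sketched as follows. The label list is made up, and the exact code in the assignment notebook is not shown in this thread. Counting the label values before training is one quick way to catch the problem.

```python
from collections import Counter

# Made-up labels, as strings, the way they come out of a CSV file.
raw_labels = ["0", "4", "4", "0"]

# Buggy: "0" == 0 is False in Python, so every label falls into the else branch.
buggy = [0 if label == 0 else 1 for label in raw_labels]

# Fixed: compare against the string "0".
fixed = [0 if label == "0" else 1 for label in raw_labels]

# Sanity check: a healthy binary dataset should contain both classes.
print(Counter(buggy))
print(Counter(fixed))
```

With only one class present, a model can reach about 100% training and validation accuracy in the first epoch simply by always predicting that class, which matches the plots in the first post.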
