Hi. In the Course 3 Week 3 assignment, I am getting a very bizarre accuracy when training the neural network. Overfitting would look like a decent train accuracy with a low val accuracy, but my train accuracy in the very first epoch itself, although it starts at a positive value, drops to about -300 million. Here is my model for reference:
# GRADED FUNCTION: create_model
# moderator edit: code removed
No matter how I tune the architecture of the model, the same thing keeps occurring. I checked for errors in my earlier code, but the parsing code, shown below, gives me the right output:
# GRADED FUNCTION: parse_data_from_file
# moderator edit: code removed
The train/val split also gives me the right output:
# GRADED FUNCTION: train_val_split
# moderator edit: code removed
I thought tokenization might have been a problem, but that seems right too:
# GRADED FUNCTION: fit_tokenizer
# moderator edit: code removed
The seq_pad_and_trunc function also seems to be working fine:
# GRADED FUNCTION: seq_pad_and_trunc
# moderator edit: code removed
Overall, I am not sure whether it is the model itself or the data fed into it that is not right, but an accuracy of -300 million on the very first epoch seems pretty bizarre and looks like a sign of underfitting. Is there anything I am missing in the earlier code or in the model creation? Thank you so much.
You are not supposed to share or post any of the assignment code; it is against the community guidelines. You can share your epoch training screenshots here, since those are not part of the code. Send code only via DM when a mentor asks for it.
You are hard-coding the rows and sentences.

The instructions mention that the label can be accessed via row[0] and the text via row[5] (you have used the wrong row for this). While labelling, append each row's label using an if-else statement inside the loop, so that every row is handled in each iteration. Once you do that, append the text for each sentence using row[5].

Do not hard-code sentence = row[5] outside the labelling loop; that is incorrect.
Here it is mentioned that you need to compute the number of sentences, so use the length function on sentences, not on the training split.
You are also slicing the sentences and labels from index 0, which you do not need to do.
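To make the points above concrete, here is a minimal sketch of the parsing step, assuming a Sentiment140-style CSV layout (label in column 0, text in column 5) and that labels "0"/"4" map to 0/1. The function name and the tiny inline sample are illustrative, not the graded solution:

```python
import csv
import io

def parse_data_from_file(csv_source):
    sentences = []
    labels = []
    reader = csv.reader(csv_source, delimiter=',')
    for row in reader:
        # Label comes from row[0]; an if-else maps the raw value each iteration
        labels.append(0 if row[0] == '0' else 1)
        # Text comes from row[5]; appended inside the loop, never hard-coded once
        sentences.append(row[5])
    return sentences, labels

# Tiny inline sample standing in for the real file
sample = io.StringIO('0,a,b,c,d,"this is bad"\n4,a,b,c,d,"this is great"\n')
sentences, labels = parse_data_from_file(sample)
# The number of sentences is then simply len(sentences)
```

The key point is that both appends happen inside the loop, one pair per row, so sentences and labels stay aligned.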
Try using a Dropout layer after the Embedding layer, and then a Conv1D. Because you have used a very high unit count and the epochs are only 20, you are getting low accuracy. Use only one Dropout layer, and follow the instructions given above the graded cell for the model architecture. Your Dense layer's unit count is also too high; remember that a higher unit count will affect how the model trains within the given epochs. Your choice of activation is also questionable, since you are using binary crossentropy: softmax is used for multi-class classification. So, do you remember which activation you need to use in the last layer for binary classification?
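The shape of model the advice above points at can be sketched as follows. All layer sizes and the vocab/length constants here are illustrative placeholders, not the graded answer; the assignment derives them from the tokenizer and padding step:

```python
import tensorflow as tf

VOCAB_SIZE = 1000    # placeholder; comes from the tokenizer in the assignment
EMBEDDING_DIM = 16   # placeholder
MAX_LENGTH = 16      # placeholder; comes from the padding step

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LENGTH,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    tf.keras.layers.Dropout(0.2),                      # a single Dropout layer
    tf.keras.layers.Conv1D(32, 5, activation='relu'),  # Conv1D after the dropout
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(16, activation='relu'),      # keep the unit count modest
    tf.keras.layers.Dense(1, activation='sigmoid'),    # sigmoid, not softmax, for binary
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
```

The last line is the crux: with binary crossentropy the output layer should be a single unit with a sigmoid activation, so the model emits one probability per example rather than a multi-class distribution.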