Hi. In the Course 3 Week 3 assignment, I am getting a very bizarre accuracy when training the neural network. Overfitting would look like a decent train accuracy with a low val accuracy, but my train accuracy in the very first epoch itself, although it starts at a positive value, drops to about -300 million. Here is my model for reference:
# GRADED FUNCTION: create_model
# moderator edit: code removed
No matter how I tune the architecture of the model, the same thing keeps occurring. I checked for errors in my earlier code, but the parsing code, shown below, gives me the right output:
# GRADED FUNCTION: parse_data_from_file
# moderator edit: code removed
The train/val split also gives me the right output:
# GRADED FUNCTION: train_val_split
# moderator edit: code removed
I thought tokenization might have been a problem, but that seems right too:
# GRADED FUNCTION: fit_tokenizer
# moderator edit: code removed
The seq_pad_and_trunc function also seems to be working fine:
# GRADED FUNCTION: seq_pad_and_trunc
# moderator edit: code removed
Overall, I am not sure whether it is the model itself or the data fed into it that is not right, but an accuracy of -300 million on the very first epoch seems pretty bizarre and looks like a sign of underfitting. Is there anything I am missing in the earlier code or in the model creation? Thank you so much.
You are not supposed to share or post any of the assignment code; it is against the community guidelines. You can share your epoch training screenshots here, since those are not part of the code. Send code only via DM when a mentor asks for it.
You are hard-coding the rows and sentences.

The instructions mention that the label can be accessed via row[0] and the text via row[5] (you have used the wrong row for this). While labelling, append each row's label using an if-else statement inside the loop, so that every row is handled in each iteration. Once you do that, append the text for each sentence using row[5].

Do not hard-code sentence = row[5] outside the labelling loop; that is incorrect.
Here it is mentioned that you need to compute the number of sentences, so use the length function on sentences, not on the training split.
You are also slicing the sentences and labels from index 0, which you do not need to do.
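To make the points above concrete, here is a minimal sketch of the parsing step, assuming a Sentiment140-style CSV layout (label in column 0, text in column 5) and that labels "0"/"4" map to 0/1. The function name and the tiny inline sample are illustrative, not the graded solution:

```python
import csv
import io

def parse_data_from_file(csv_source):
    sentences = []
    labels = []
    reader = csv.reader(csv_source, delimiter=',')
    for row in reader:
        # Label comes from row[0]; an if-else maps the raw value each iteration
        labels.append(0 if row[0] == '0' else 1)
        # Text comes from row[5]; appended inside the loop, never hard-coded once
        sentences.append(row[5])
    return sentences, labels

# Tiny inline sample standing in for the real file
sample = io.StringIO('0,a,b,c,d,"this is bad"\n4,a,b,c,d,"this is great"\n')
sentences, labels = parse_data_from_file(sample)
# The number of sentences is then simply len(sentences)
```

The key point is that both appends happen inside the loop, one pair per row, so sentences and labels stay aligned.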
Try using a Dropout layer after the Embedding layer, and then a Conv1D. Because you have used a very high unit count and the epochs are only 20, you are getting low accuracy. Use only one Dropout layer, and follow the instructions given above the graded cell for the model architecture. Your Dense layer's unit count is also too high; remember that a higher unit count will affect how the model trains within the given epochs. Your choice of activation is also questionable, since you are using binary crossentropy: softmax is used for multi-class classification. So, do you remember which activation you need to use in the last layer for binary classification?
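The shape of model the advice above points at can be sketched as follows. All layer sizes and the vocab/length constants here are illustrative placeholders, not the graded answer; the assignment derives them from the tokenizer and padding step:

```python
import tensorflow as tf

VOCAB_SIZE = 1000    # placeholder; comes from the tokenizer in the assignment
EMBEDDING_DIM = 16   # placeholder
MAX_LENGTH = 16      # placeholder; comes from the padding step

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LENGTH,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    tf.keras.layers.Dropout(0.2),                      # a single Dropout layer
    tf.keras.layers.Conv1D(32, 5, activation='relu'),  # Conv1D after the dropout
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(16, activation='relu'),      # keep the unit count modest
    tf.keras.layers.Dense(1, activation='sigmoid'),    # sigmoid, not softmax, for binary
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
```

The last line is the crux: with binary crossentropy the output layer should be a single unit with a sigmoid activation, so the model emits one probability per example rather than a multi-class distribution.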