I recently completed my Week 3 assignment and tried experimenting with the model I used there. As a result, I got results that I'd say are kind of "too optimistic".
Namely, I built a model that comprises [code removed - moderator]. With this model I managed to get a training and validation accuracy of 100%, with both training and validation losses decreasing after each epoch.
However, I have some doubts about the model. This may sound silly, but am I missing something here? The model does not seem to overfit, and training took about an hour to complete, which is not bad for a model of this kind.
I am just curious about the unreasonable effectiveness of my model. Should more metrics (besides accuracy) be included in the evaluation to get a bigger picture? If someone could help me interpret these results, I would be really grateful. I have attached the plots below so you can take a closer look.
One of the factors to consider when training a model is the compute budget. The assignment starter code sets epochs to 20, so it's better to leave the default setting and build your model accordingly.
This assignment is relatively simple compared to real-world datasets, where the corpus and vocabulary dimensions are huge. So it's alright to get 100% accuracy on both the training and validation sets.
As far as metrics for this assignment are concerned, the training and validation set labels have an almost balanced distribution. Given that predictions for both classes are important to us, accuracy is a valid metric to use. Here's the label distribution:
Distribution of train_labels:
1 72278
0 71722
Distribution of val_labels:
0 8100
1 7900
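In case it's useful, here is a minimal sketch of how such a distribution can be checked. The array names `train_labels` and `val_labels` are just illustrative placeholders for whatever your notebook actually produces:

```python
import numpy as np

def show_distribution(name, labels):
    # Count how many examples fall into each class.
    values, counts = np.unique(labels, return_counts=True)
    print(f"Distribution of {name}:")
    for value, count in zip(values, counts):
        print(f"{value}    {count}")

# Placeholder arrays matching the counts above; in the notebook these
# would be the actual label arrays.
train_labels = np.array([1] * 72278 + [0] * 71722)
val_labels = np.array([0] * 8100 + [1] * 7900)

show_distribution("train_labels", train_labels)
show_distribution("val_labels", val_labels)
```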
Do look at the passing criteria for this assignment, which require a reasonable validation loss curve.
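That said, if you want a bigger picture than accuracy alone, nothing stops you from computing precision, recall and F1 on the validation set outside the graded cells. A minimal sketch, assuming scikit-learn is available and using `y_true`/`y_pred` as placeholder names for the true labels and the thresholded predictions:

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Placeholder data; in practice y_true would be the validation labels and
# y_pred the model's outputs thresholded at 0.5.
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([0, 1, 1, 0, 1, 1, 1, 0])

# Rows are true classes, columns are predicted classes.
print(confusion_matrix(y_true, y_pred))

# Per-class precision, recall and F1, plus overall accuracy.
print(classification_report(y_true, y_pred, digits=4))
```

With a nearly balanced label distribution like the one above, these numbers will usually tell the same story as accuracy, but they make per-class behaviour explicit.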
I completely forgot that for the purposes of the assignment we only used 10% of the data to speed up the computations. I was convinced that I had obtained these metrics on all 1.6M tweets, which is why I was a bit confused. It makes more sense now. I'll also analyse the validation loss curve in greater detail. Thank you for the clarification and all the hints!