As an exercise, I tried to follow and reproduce the ungraded lab 5 of C3W3. I used my own dataset of anime synopses and their review scores, which range from 2 to 10. I changed the last layer of the model to a softmax activation with 9 neurons. When I tried training the model, the accuracy was very low and eventually dropped to 0, and the loss just output nan. Attached is a zip file with my notebook and the JSON file of the dataset I used. Could someone check my code and see what I am doing wrong?
nlp exercise.zip (2.3 MB)
On checking your code: for a softmax activation with 9 neurons, your labels should be categorical, i.e. one-hot encoded vectors, where each score from 2 to 10 maps to a class from 0 to 8. Try this and see if it helps your model train better.
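As a minimal sketch of what I mean (the scores tensor and the commented compile call are illustrative, not taken from your notebook):
import tensorflow as tf

# Hypothetical integer review scores in the 2-10 range
scores = tf.constant([2, 5, 10, 7])

# Shift to classes 0-8, then one-hot encode to match a 9-unit softmax output
labels = tf.one_hot(scores - 2, depth=9)
print(labels.shape)  # (4, 9)

# With one-hot labels the loss must be categorical_crossentropy, not
# sparse_categorical_crossentropy (which expects integer class labels);
# feeding raw 2-10 scores to a 9-way softmax can produce exactly the
# nan loss you saw, since scores 9 and 10 fall outside classes 0-8
# model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])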
Okay, so I edited the notebook: in the preprocessing cell, I added a tf.one_hot encoding for my labels with a depth of 9, given the 9 possible classes. Now the model trains, and when I give it a test sentence to predict a score I get a softmax array with 9 indexes. With how I did it, are the score labels of 2-10 mapped exactly to the indexes of the softmax array, with a score of 2 mapped to the first index and 10 to the last? I also noticed that I get the same output for two different test sentences. And do you have any advice on how to improve the model's accuracy? Attached is my updated notebook.
nlp exercise (2).zip (2.4 MB)
Nice job getting it to work. While preprocessing, I suggest you do it this way, so you can easily map the predictions back to scores afterwards:
# Apply one-hot encoding to map the scores 2-10 to 0-8
dataset_sequences = dataset.map(
    lambda text, label: (vectorize_layer(text), tf.one_hot(label - 2, depth=9))  # Adjust label range
)
Then, after inference, you can recover the actual score label like this:
predicted_scores = tf.argmax(result1, axis=1) + 2 # Add 2 to shift from 0-8 to 2-10
print(predicted_scores.numpy()) # Convert to numpy array if needed
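This also answers your mapping question: with the label - 2 shift during preprocessing, a score of 2 corresponds to index 0 and a score of 10 to index 8, so taking the argmax and adding 2 recovers the original score.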
Regarding the fact that you got the same output for two different sentences: investigate the data again for class imbalances. Maybe some labels are featured far more often in the dataset than others. Improving the model will require a lot of experimentation to see what works. And sometimes a combination of model and dataset has a performance threshold that is hard to surpass; experimenting will let you know whether you have reached such a threshold for your model.
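As a quick sanity check for imbalance, something like this can show how the scores are distributed (a sketch, assuming you can pull the raw integer scores out of your JSON file into a plain list; raw_scores here is an illustrative name):
from collections import Counter

# Hypothetical list of the raw integer scores (2-10) from the JSON dataset
raw_scores = [7, 8, 7, 6, 9, 7, 8]

# Count how often each score appears; a heavily skewed distribution can
# push the model toward always predicting the majority class
for score, count in sorted(Counter(raw_scores).items()):
    print(f"score {score}: {count} examples")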