As an exercise, I tried to follow and reproduce the ungraded lab 5 of C3W3. I used my own dataset of anime synopses and their review scores, which range from 2 to 10. I changed the last layer of the model to a softmax activation with 9 neurons. When I tried training the model, the accuracy was very low and eventually dropped to 0, and the loss just output nan. Attached is a zip file with my notebook and the JSON file of the dataset I used. Could someone check my code and see what I am doing wrong?
nlp exercise.zip (2.3 MB)
On checking your code: for a softmax activation with 9 neurons, your labels should be categorical, i.e. one-hot encoded vectors, where each score from 2 to 10 maps to a class from 0 to 8. Try this and see if it helps your model train better.
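As a minimal sketch of what I mean (the scores tensor and the commented compile call are illustrative, not taken from your notebook):
import tensorflow as tf

# Hypothetical integer review scores in the 2-10 range
scores = tf.constant([2, 5, 10, 7])

# Shift to classes 0-8, then one-hot encode to match a 9-unit softmax output
labels = tf.one_hot(scores - 2, depth=9)
print(labels.shape)  # (4, 9)

# With one-hot labels the loss must be categorical_crossentropy, not
# sparse_categorical_crossentropy (which expects integer class labels);
# feeding raw 2-10 scores to a 9-way softmax can produce exactly the
# nan loss you saw, since scores 9 and 10 fall outside classes 0-8
# model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])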
Okay, so I edited the notebook: in the preprocessing cell, I added a tf.one_hot encoding for my labels with a depth of 9, given the 9 possible classes. Now the model trains, and when I give it a test sentence to predict a score I get a softmax array with 9 indexes. With how I did it, are the score labels of 2-10 mapped exactly to the indexes of the softmax array, with a score of 2 mapped to the first index and 10 to the last? I also noticed that I get the same output for two different test sentences. And do you have any advice on how to improve the model's accuracy? Attached is my updated notebook.
nlp exercise (2).zip (2.4 MB)
Nice job getting it to work. While preprocessing, I suggest you do it this way, so you can easily map the predictions back to scores afterwards:
# Apply one-hot encoding to map the scores 2-10 to 0-8
dataset_sequences = dataset.map(
    lambda text, label: (vectorize_layer(text), tf.one_hot(label - 2, depth=9))  # Adjust label range
)
Then, after inference, you can recover the actual score label like this:
predicted_scores = tf.argmax(result1, axis=1) + 2 # Add 2 to shift from 0-8 to 2-10
print(predicted_scores.numpy()) # Convert to numpy array if needed
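This also answers your mapping question: with the label - 2 shift during preprocessing, a score of 2 corresponds to index 0 and a score of 10 to index 8, so taking the argmax and adding 2 recovers the original score.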
Regarding the fact that you got the same output for two different sentences: investigate the data again for class imbalances. Maybe some labels are featured far more often in the dataset than others. Improving the model will require a lot of experimentation to see what works. And sometimes a combination of model and dataset has a performance threshold that is hard to surpass; experimenting will let you know whether you have reached such a threshold for your model.
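As a quick sanity check for imbalance, something like this can show how the scores are distributed (a sketch, assuming you can pull the raw integer scores out of your JSON file into a plain list; raw_scores here is an illustrative name):
from collections import Counter

# Hypothetical list of the raw integer scores (2-10) from the JSON dataset
raw_scores = [7, 8, 7, 6, 9, 7, 8]

# Count how often each score appears; a heavily skewed distribution can
# push the model toward always predicting the majority class
for score, count in sorted(Counter(raw_scores).items()):
    print(f"score {score}: {count} examples")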