W1 A2 How is the model able to learn?

In the “Dinosaurus Island Character Level Language Model” assignment, HOW is the model learning when the loss remains almost the same as the initial loss, around 23?

This lab is meant for learning purposes on sequence modeling (character level). As stated in the markdown at the end of the notebook, please experiment with different hyperparameters for better results.

That said, have you seen this?

Thank you! However, I’m not worried about the model’s performance, as it is performing fine with the current hyperparameters. What I’m curious about is how the model is learning this well even when the loss does not decrease. If you take a look at the loss during the training it stays around 23. Nevertheless, the model still learns well.

I believe the loss is decreasing, just not very much. Because if it didn’t decrease, the accuracy would not increase. In classification, Cost and Accuracy both measure essentially the same thing.

1 Like