At first I used Huber loss for this assignment, but later it was not accepted: one of the errors said I was supposed to use “MeanSquaredError”.
Anyway, with the Huber loss, the learning rate plot showed valid curves, but with the MeanSquaredError loss, the plot was empty! I went ahead with a learning rate anyway, and my assignment passed! I’m very surprised that it worked when the learning rate plot showed nothing!
Any ideas on this would be highly appreciated. Thank you.
The Huber loss section was only provided to help learners understand how the learning rate affects model training speed. The learning rate adjustment exercise comes with this note:
Notice that this is only changing the learning rate during the training process to give you an idea of what a reasonable learning rate is and should not be confused with selecting the best learning rate, this is known as hyperparameter optimization and it is outside the scope of this course.
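To make the note concrete, here is a minimal sketch of a learning rate sweep in plain Python. The schedule formula, the toy one-parameter loss, and every name in it are assumptions for illustration, not the assignment's code. In Keras, a schedule like this is typically attached with the `LearningRateScheduler` callback.

```python
def sweep_lr(epoch, base_lr=1e-3, epochs_per_decade=10):
    """Hypothetical schedule: the learning rate grows 10x every `epochs_per_decade` epochs."""
    return base_lr * 10 ** (epoch / epochs_per_decade)


def run_sweep(num_epochs=40):
    """Gradient descent on the toy loss L(w) = w**2, one update per epoch.

    Returns a list of (learning_rate, loss_after_update) pairs.
    """
    w = 5.0  # arbitrary starting weight
    history = []
    for epoch in range(num_epochs):
        lr = sweep_lr(epoch)
        grad = 2.0 * w          # dL/dw for L(w) = w**2
        w = w - lr * grad       # plain gradient descent step
        history.append((lr, w * w))
    return history


history = run_sweep()
# Print every fifth epoch: the loss falls while the learning rate is
# moderate and climbs again once the learning rate gets too large.
for lr, loss in history[::5]:
    print(f"lr={lr:.4g}  loss={loss:.4g}")
```

The sweep only shows the range of learning rates where the loss is still falling; as the note says, that gives a reasonable value, not a tuned one.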
However, there is another reason the assignment requires you to use MSE instead of Huber loss:
The “delta” value in Huber loss determines where the loss function transitions from quadratic to linear, requiring careful selection based on the data, which can be challenging in time series where patterns might be complex and evolving.
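As a rough illustration of the role of delta, the sketch below implements the standard per-sample Huber formula in plain Python and compares it with the squared error. The function names and the sample error values are made up for this example.

```python
def huber(error, delta=1.0):
    """Standard Huber loss for one error value.

    Quadratic (0.5 * error**2) while |error| <= delta,
    linear (delta * (|error| - 0.5 * delta)) beyond that.
    """
    abs_err = abs(error)
    if abs_err <= delta:
        return 0.5 * error ** 2
    return delta * (abs_err - 0.5 * delta)


def squared_error(error):
    """Per-sample squared error, the quantity that MSE averages."""
    return error ** 2


# Made-up error values: the same error is penalized very differently
# depending on which side of delta it falls.
for e in (0.5, 1.0, 3.0, 10.0):
    print(e, huber(e, delta=1.0), huber(e, delta=5.0), squared_error(e))
```

With a small delta, a large error contributes only linearly, so the choice of delta decides how much weight big deviations get.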
Huber loss also depends on deciding which errors count as outliers, and in a time series that is difficult because seasonality and trends can look like outliers.
Remember that a time series comes with seasonality as well as noise, which makes MSE or MAE a better loss choice when it comes to detecting changes in seasonality.
Thank you so much for explaining why MSE is the better choice here. My follow-up question: when I first applied Huber loss, I got a learning rate plot. But with the MSE loss, nothing was plotted for the learning rate (picture attached). Is this right? How can I decide on a learning rate when the plot shows no values? I went ahead with the learning rate I had found from the Huber loss plot and that worked, but given this empty plot that choice is essentially arbitrary! Any input on this would be a great help. Thank you.
I am glad that you noticed this and asked. But perhaps you missed that the loss must have become NaN (not a number). When the loss becomes NaN, the learning rate graph either spikes or becomes a flat line.
For my model training, the loss started at 274 in the first epoch and became NaN at epoch 44. This is consistent with the explanation above: on the learning rate graph, it appears as a sudden jump or a flat line at the point where the NaN loss occurs, because the learning rate can no longer be adjusted effectively.
I am sharing a picture of the final output of my training, with the graph: the loss versus learning rate graph shows a distinct, minute line at 30 and eventually becomes a flat line.
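If the plot is empty because the recorded losses are NaN or infinite, one way to check is to look at the loss history directly. The sketch below assumes that is the cause; the helper names and the sample values are made up for illustration and are not taken from the assignment.

```python
import math


def first_non_finite_epoch(losses):
    """Return the index of the first NaN/inf loss, or None if all are finite."""
    for epoch, loss in enumerate(losses):
        if not math.isfinite(loss):
            return epoch
    return None


def finite_points(lrs, losses):
    """Keep only the (learning_rate, loss) pairs that can actually be plotted."""
    return [(lr, loss) for lr, loss in zip(lrs, losses) if math.isfinite(loss)]


# Made-up history: the loss blows up once the learning rate gets too large.
lrs = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2]
losses = [250.0, 120.0, 95.0, float("inf"), float("nan")]

print("first non-finite epoch:", first_non_finite_epoch(losses))
print("plottable points:", finite_points(lrs, losses))
```

Plotting only the finite points shows the part of the sweep that is still usable for picking a learning rate, which is the range before the loss diverges.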