C4W3_Assignment.ipynb : unittest error – expecting MSE instead of Huber for loss

rocki · December 10, 2024, 10:01pm

Hello,

The unittest case is expecting MSE instead of the specified Huber. See error below:

There are a couple of issues with this unittest.

First, can you please clarify why the loss is forced to be MSE and nothing else?

Second, and most importantly, no combination of model architecture and hyperparameters, when used with the MSE loss function, is able to satisfy the plot bounds [1e-6, 1, 0, 30] as given in:

Only Huber is able to provide loss values that fall within the bounds, as below:

This makes me question if the unittest to expect only MSE is correct. Should it be Huber instead?

Deepti_Prasad · December 10, 2024, 11:05pm

MSE is the loss most commonly used in regression tasks, where we build a model of the relationship between independent variables and a continuous dependent variable.

Cross-entropy loss is often more interpretable in classification tasks, whereas MSE may not always have a straightforward interpretation in regression tasks. MSE is sensitive to the scaling of the target values, which can require data preprocessing, whereas cross-entropy is invariant to scaling.

Huber loss is used when there are outliers in the data. It basically combines MSE and MAE to give a loss function that is less sensitive to outliers.
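To make the "combines MSE and MAE" point concrete, here is a minimal pure-Python sketch of the standard Huber formula next to MSE. The function names and the sample values are illustrative assumptions, not taken from the assignment; delta=1.0 is assumed because it is the Keras default for the Huber loss.

```python
def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2              # MSE-like region
        else:
            total += delta * (err - 0.5 * delta)  # MAE-like region
    return total / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical example: two small errors and one outlier-sized error.
y_true = [0.0, 0.0, 0.0]
y_pred = [0.5, -0.5, 10.0]

huber_value = huber(y_true, y_pred)  # outlier contributes linearly
mse_value = mse(y_true, y_pred)      # outlier contributes quadratically
print(huber_value, mse_value)
```

Because the large error is squared under MSE but only scaled linearly under Huber, the two losses can sit on very different numeric scales for the same predictions.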

Did you use ‘mse’ in the graded create model cell?

Huber loss was for the optional exercise, meant to help learners understand the significance of the learning rate with respect to the loss function used, but in create model you are supposed to use ‘mse’.

rocki · December 11, 2024, 12:36am

Yes, because the unittest forces use of MSE.

Due to the unittest requirement, MSE also had to be used in adjust_learning_rate() to obtain the loss vs. learning rate plot.

An MSE-based adjust_learning_rate() produces the following or a similar outcome at best, where the loss very rarely falls within the (0, 30) range on the Y axis:

The (0, 30) range is the primary reason I think this assignment is intended for the Huber loss function and not MSE.
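For anyone who wants to check this on their own run, here is a rough sketch of the idea. The exponential schedule (start at 1e-6, multiply by 10 every 20 epochs) is an assumption about how the sweep is set up, and the helper names and loss values are hypothetical; only the bounds [1e-6, 1, 0, 30] come from the notebook. In Keras, a function like lr_for_epoch could be passed to tf.keras.callbacks.LearningRateScheduler.

```python
def lr_for_epoch(epoch, start=1e-6, epochs_per_decade=20):
    """Assumed sweep schedule: learning rate grows 10x every `epochs_per_decade` epochs."""
    return start * 10 ** (epoch / epochs_per_decade)


def points_in_view(lrs, losses, bounds=(1e-6, 1, 0, 30)):
    """Return the (lr, loss) pairs that would be visible inside the plot bounds."""
    x_min, x_max, y_min, y_max = bounds
    return [
        (lr, loss)
        for lr, loss in zip(lrs, losses)
        if x_min <= lr <= x_max and y_min <= loss <= y_max
    ]


# Hypothetical loss history, one value per learning rate tried.
lrs = [1e-6, 1e-3, 1e-1]
losses = [50.0, 12.0, 400.0]
visible = points_in_view(lrs, losses)
print(visible)
```

If almost none of the recorded points survive this filter, the curve is effectively off the chart for those axis limits, which is the situation described above for MSE.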

If someone can confirm getting a better loss curve based on MSE function to fall within the (0, 30) range, I’d appreciate if you can share those hyper parameters with me for testing. (Note: I have submitted and passed the assignment with 100%, so the information will only be for edification.)

Or, if there is a viewpoint that the loss functions in adjust_learning_rate() and create_model() can be different, please explain why you think the learning rate is not related to the loss function.

Deepti_Prasad · December 11, 2024, 1:11am

In adjusting the learning rate you are supposed to use Huber loss, and in create model you are supposed to use MSE, based on the hyperparameters and the data being trained. They are two different exercises.

rocki · December 11, 2024, 6:52am

The exercise clearly states “Based on this plot, which learning rate would you choose? You will get to use it on the next exercise”. Yet this concept of dynamic adjustment of learning rate seems to have been completely missed.

Deepti_Prasad · December 11, 2024, 7:11am

Are you using Adam or SGD?

Note that it did mention using the learning rate in the next exercise, but that is not mandatory; it was offered as something learners can experiment with.

So, based on the instructions, Adam with a learning rate seems to have been advised.

A learning rate of 0.09 is almost 0.1; rather use 0.001 or 0.0001.

But the learning rate hyperparameter is not mandatory in create model.

rocki · December 11, 2024, 8:58pm

We will never resolve this issue because the responses are not addressing the same side of the coin as the question.

Typical responses have been merely to satisfy the unittest, irrespective of whether that unittest requirement makes sense or not.

Asking to treat the two models as independent of each other just to meet the unittest requirement seems antithetical to the topics and lectures in this course.

No clarification has been provided to address the following queries:

Deepti_Prasad · December 11, 2024, 9:38pm

The learning rate is of course related to the loss function, depending on the data the model is built on.

I am not saying the learning rate cannot be used in create model, but the assignment you are talking about didn’t require it; more specifically, it is not a mandatory hyperparameter.

But you can experiment with the learning rate if you want and see how adjusting it plays out with different loss functions. That was the whole idea behind giving learners the optional exercise to explore.
