'Sticky' model not retraining after optimal result
Colab code
I tried running through a few parameters for fun using the Colab code provided, and my optimizer behaves differently across a range of thresholds than it does when a single threshold is provided on its own. See the attached screenshot. The model prediction for a threshold of 1.02 is [[18.767128]]. When I loop with arange, the outputs all ‘lock’ onto the best value [[18.999973]] once it is reached, regardless of the threshold passed in.
Just curious why this would be happening - I’m at a loss
Note that the reset_states() doesn’t do anything here; it’s just something I was trying.
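For context, the loop in the second cell is roughly this shape (a simplified sketch, not my exact Colab code; MyHuberLoss is the custom loss quoted further down, and the data/epoch numbers are illustrative):

import numpy as np

# Sketch only: note that the SAME model object is compiled and fitted
# on every pass of the loop.
for threshold in np.arange(1.0, 1.2, 0.02):
    model.reset_states()  # the no-op mentioned above
    model.compile(optimizer='sgd', loss=MyHuberLoss(threshold=threshold))
    model.fit(xs, ys, epochs=500, verbose=0)
    print(threshold, model.predict(np.array([[10.0]]))[0])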
I am trying to find this assignment; which one is it, exactly?
is_small_error = tf.abs(error) <= threshold
small_error_loss = tf.square(error) / 2
big_error_loss = threshold * (tf.abs(error) - (0.5 * threshold))
return tf.where(is_small_error, small_error_loss, big_error_loss)
The output calculation depends on the threshold; maybe you should trace it there to see the difference!
Thanks for looking at it. Yes, perhaps some good debugging is needed. It seems to me that the output for, say, a threshold of 1.02 should be the same regardless of whether it is run one-off or as part of a sequence, but that is clearly not the case. The loss calculation code is very straightforward, so my guess is that the predictions are reaching an optimal maximum; something along the lines of the predictions not being initialized. That doesn’t quite ring true, but it is some kind of unexpected internal behaviour. I shouldn’t be too obsessed with it, but it might be relevant if we start some kind of grid search: the results from a series of compile/fit iterations would not match the results once those optimized parameters are used on their own. For the benefit of others, the lines preceding the is_small_error… above are:
def call(self, y_true, y_pred):
    error = y_true - y_pred
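Putting the pieces together, the full loss class presumably looks something like this (the class name, the tf.keras.losses.Loss subclassing, and the __init__ are my assumptions about the rest of the lab code):

import tensorflow as tf

class MyHuberLoss(tf.keras.losses.Loss):
    def __init__(self, threshold=1.0):
        super().__init__()
        # the bare `threshold` in the snippet above presumably resolves to this
        self.threshold = threshold

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= self.threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = self.threshold * (tf.abs(error) - (0.5 * self.threshold))
        return tf.where(is_small_error, small_error_loss, big_error_loss)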
“error” may be a reserved keyword. Try using a different variable name.
Aaah, pretty much found it, if not quite understanding the exact behaviour. Compile and fit did not fully reinitialize the model; I needed to create a new model in each iteration of the loop. Something to keep in mind if manually creating some kind of custom grid search.
Good suggestion. I don’t think that was the root cause here, but it’s a good tip to keep in mind.
So somewhere around 2500 epochs of training, the model maxes out at 18.999973; my loop was in effect just running more epochs of training on the same model.
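For anyone building a manual grid search along these lines, the fix is to construct a fresh model inside the loop. A minimal sketch, assuming the lab’s usual y = 2x − 1 toy data (which would explain the predictions converging towards 19 for an input of 10; the data, optimizer, and epoch count are illustrative, not my exact code):

import numpy as np
import tensorflow as tf

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

def build_model():
    # a brand-new model, and therefore fresh random weights, on every call
    return tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1])])

for threshold in np.arange(1.0, 1.2, 0.02):
    model = build_model()  # recreate; compile/fit alone won't reinitialize the weights
    model.compile(optimizer='sgd', loss=MyHuberLoss(threshold=threshold))
    model.fit(xs, ys, epochs=500, verbose=0)
    print(threshold, model.predict(np.array([[10.0]]), verbose=0)[0])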
Did you try using a different loss function? As I can see, your model has a single Dense layer with one unit; using a different optimizer might also lead to a different outcome. Also, you have not mentioned what parameters you are using, though I can see your input shape is 1. Please share some details about the dataset you are using, the model you created, and all the different ways you have tried.
If your labelling and your model output don’t match, that is probably the reason for your error.
Regards
DP
Thanks Deepti. Actually, this is a simple lab to try a custom loss function. I do think the behaviour is caused by not recreating the model, so in effect we just keep training and improving. I note that, pretty much regardless of other parameters, the output in the first loop of the second cell will improve upon the first training cycle done in the first cell before the looping. It was really just a misunderstanding on my part of how to create and test a new model; compile/fit simply do not reinitialize everything.