I’m using keras-tuner to find the optimal learning rate and optimizer for my use case. Here are the details:
I run 5 trials with 3 epochs each. While training, I noticed something unusual. Theoretically, in each trial a new pair of hyperparameters is picked from the search space that I defined and the model is trained from scratch. But notice the anomaly in the 2nd trial below:
The training loss during the 1st epoch of the 1st trial was around 0.96, and it declined to 0.6 by the end of the trial. The training loss should have a similar starting value at the beginning of the 2nd trial, since the model is supposedly rebuilt from scratch with a new pair of hyperparameters, but instead it looks as if training resumed from where the 1st trial ended (see the loss value at the beginning of the 2nd trial). This is counter-intuitive. I read the official documentation, which also states:
During the search, the model-building function is called with different hyperparameter values in different trials. In each trial, the tuner would generate a new set of hyperparameter values to build the model. The model is then fit and evaluated.
What, then, is the problem in my case? Why does it seem that each trial, instead of training the model from scratch, simply continues from where the previous trial left off?