C2M3 Assignment exercise 6

In Exercise 6, I cannot pass the test even though I used the following hyperparameters:

  • layers_to_train = 4
  • learning_rate = 5e-5
  • num_epochs = 5

The results:
— Training complete —
Final Validation Metrics

Loss: 1.6080
Accuracy: 0.5141
F1: 0.1358


Exercises 1 through 5 passed.

Any help would be appreciated.

Hi @Takako,

I have just re-run the entire assignment with your hyperparameters on my side, and unfortunately (or fortunately :winking_face_with_tongue:), I was not able to reproduce your Final Validation Metrics (qualitatively, of course). But I have a few thoughts that may help. Even though I successfully used a different set of hyperparameters when I took this class myself, I would not suggest tuning the hyperparameters in this situation, because your final numbers are far worse than what even the first epoch produces with the suggested set of hyperparameters.

So instead I would:

  1. double-check that the output of every exercise verification cell matches the Expected Output (not only that the unit tests pass)
  2. double-check all cells where the model that is finally trained is loaded, defined, or adjusted
  3. restart the kernel and re-run the entire notebook without interruptions, so that all consecutive executable cells have consecutive [In]/[Out] numbers.
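For point 3, one way to confirm that a saved notebook really ran top to bottom is to inspect the execution counts stored in the .ipynb file. This is only a minimal sketch: the function names and the file name in the comment are hypothetical, and it assumes the standard notebook JSON layout with a `cells` list and an `execution_count` per code cell.

```python
import json


def load_notebook(path):
    """Read an .ipynb file (which is plain JSON) into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def execution_counts(notebook):
    """Return the execution counts of executed code cells, in document order."""
    return [
        cell["execution_count"]
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
        and cell.get("execution_count") is not None
    ]


def ran_top_to_bottom(notebook):
    """True if executed code cells are numbered 1, 2, 3, ... without gaps."""
    counts = execution_counts(notebook)
    return counts == list(range(1, len(counts) + 1))


# Tiny in-memory notebooks standing in for a real file, e.g.
# load_notebook("assignment.ipynb") (hypothetical file name).
clean = {"cells": [
    {"cell_type": "markdown"},
    {"cell_type": "code", "execution_count": 1},
    {"cell_type": "code", "execution_count": 2},
]}
stale = {"cells": [
    {"cell_type": "code", "execution_count": 1},
    {"cell_type": "code", "execution_count": 5},  # re-run out of order
]}

print(ran_top_to_bottom(clean))  # True
print(ran_top_to_bottom(stale))  # False
```

Cells that were never executed are ignored by this check, so it does not replace scrolling through the notebook after the restart.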

And please share the actual cause of the problem once you identify it.


Hi @Takako,

Can you post a screenshot of the output of exercises 4 and 5? Please make sure not to post any code here.

Also, when I ran this exercise I used 3 layers and 3 epochs, but your issue is more likely related to the previous exercise, where you select which layers to unfreeze. The instructions mention using negative indexing to get the last N layers of the transformer, i.e. -(i+1).
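To illustrate just the indexing pattern (not the assignment code), here is a small sketch on a plain Python list. The `layers` list and its element names are hypothetical stand-ins for a model's transformer layers; only `layers_to_train = 4` comes from the question.

```python
# Stand-in for a stack of transformer layers (hypothetical names).
layers = [f"layer_{k}" for k in range(6)]

layers_to_train = 4  # value from the question above

# -(i + 1) maps i = 0, 1, 2, ... to -1, -2, -3, ..., so the last layer comes first.
selected = [layers[-(i + 1)] for i in range(layers_to_train)]
print(selected)  # ['layer_5', 'layer_4', 'layer_3', 'layer_2']

# A common slip: using -i instead. For i = 0 this is index 0, the FIRST layer.
wrong = [layers[-i] for i in range(layers_to_train)]
print(wrong)  # ['layer_0', 'layer_5', 'layer_4', 'layer_3']
```

With the off-by-one version, one of the early layers is selected in place of one of the last N, which is the kind of mistake worth ruling out here.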

If you have done that correctly, then check exercise 4 (compute class weights) to see whether you used the right variable names to get the unique class labels and the training labels. Remember that the training labels are assigned to train_labels_list, and you use the same variable to get the unique class labels with np.unique.
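As a generic illustration of why the same variable should feed both steps, here is a minimal NumPy sketch. The label values are made up, and the "balanced" weighting formula is an assumption for the example; the assignment may compute the weights differently.

```python
import numpy as np

# Made-up, imbalanced labels; in the notebook this role is played by train_labels_list.
train_labels_list = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]

# Unique class labels and their counts, both taken from the SAME list.
classes, counts = np.unique(train_labels_list, return_counts=True)
print(classes)  # [0 1 2]

# Assumed "balanced" heuristic: n_samples / (n_classes * count_per_class).
class_weights = len(train_labels_list) / (len(classes) * counts)
print(class_weights)
```

If the unique labels were taken from a different variable (for example a validation split), the classes and the weights could end up misaligned, and rare classes would not get the larger weights they need.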

Regards
DP