Improvement of assignment write-up

The assignment asks us to stop training when the MSE loss reaches 320 or below.
When I did this, the grader failed my submission due to insufficient structural similarity.

Later, when the `vae.losses` terms were also factored in, the submission passed.

If you could revise the write-up to mention that the overall reconstruction loss (not just the MSE) should be < 320, that'd be great.
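
For anyone else hitting this: in a Keras-style VAE with a custom training loop, the KL divergence is usually registered inside the model via `add_loss()` and shows up in `model.losses`. A minimal sketch of what the combined loss looks like under that setup (`vae`, `x`, and `train_step` are placeholder names, not the assignment's actual code):

```python
import tensorflow as tf

mse = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam()

def train_step(vae, x):
    """One training step; `vae` and `x` are placeholder names."""
    with tf.GradientTape() as tape:
        reconstructed = vae(x)
        reconstruction_loss = mse(x, reconstructed)
        # KL divergence terms registered inside the model via add_loss()
        # surface in vae.losses; the "< 320" target applies to the sum,
        # not to the MSE term alone.
        total_loss = reconstruction_loss + tf.reduce_sum(vae.losses)
    grads = tape.gradient(total_loss, vae.trainable_weights)
    optimizer.apply_gradients(zip(grads, vae.trainable_weights))
    return total_loss
```

The passing threshold applies to `total_loss`, not to the MSE term alone.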

After training until the loss was < 310, the structural similarity was 0.66. I'm not sure whether the tests have been revamped with respect to the assignment description, but below 310 could be a good point to stop training.

Cheers.


Following up on this comment.

I tried 200 epochs and the loss ended up around 180, but the reconstructed images were really far off from the training data. It makes sense that the submitted model failed the grader.

After a few attempts, I noticed that the model's output looks best when the training loss is around 310. I manually stopped the run, submitted the model, and the grader passed it this time.

My question is: why would a lower training loss end up producing a worse model? Is the proper training method to monitor the sample output images and stop when they look good?
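
For what it's worth, a pixel-wise loss doesn't always track perceptual quality, so eyeballing samples each epoch is a reasonable complement to watching the loss. A rough sketch of such a check, assuming a Keras-style model (`vae` and `sample_batch` are placeholder names):

```python
import matplotlib.pyplot as plt

def show_reconstructions(vae, sample_batch, epoch, n=8):
    """Plot originals vs. reconstructions for a quick visual check."""
    reconstructed = vae.predict(sample_batch[:n])
    fig, axes = plt.subplots(2, n, figsize=(2 * n, 4))
    for i in range(n):
        axes[0, i].imshow(sample_batch[i])
        axes[1, i].imshow(reconstructed[i])
        axes[0, i].axis("off")
        axes[1, i].axis("off")
    fig.suptitle(f"Epoch {epoch}: originals (top) vs. reconstructions (bottom)")
    plt.show()
```

Calling this at the end of each epoch makes it easy to stop the loop manually once the reconstructions look right.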

Thanks,
Yusa


Hello Balaji, can I ask how many epochs you used for this assignment? I am doing the same assignment.

I don’t remember. Adding @gent.spah


Balaji, I have a doubt. First, when I ran training for 5 epochs, it didn't reach the required loss, so I increased the epochs to 100. This time training ran fine until epoch 98, where the MSE was 487, but at epoch 99 the MSE turned to NaN and the training loop stopped. After a while I tried again with 50 epochs, running the notebook from the beginning; this time it ran properly and I was about to reach the desired MSE around epoch 40, but my power got cut off and I had to train the model again. On the retry I again set it to run for 40 epochs, but this time the MSE started at 10,000, whereas in earlier runs it had begun around 1,500 or 1,100.

Can you tell me the reason for this fluctuation? I am looking for a solution to this problem.
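
On the NaN at epoch 99: a loss that suddenly turns NaN late in training is often a sign of exploding gradients. Gradient clipping is one common guard; a minimal sketch, assuming TensorFlow/Keras (the learning-rate and clip values are just examples):

```python
import tensorflow as tf

# Capping the global gradient norm keeps one bad batch from blowing the
# weights (and hence the loss) up to NaN; 1e-3 and 1.0 are example values.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)
```

Lowering the learning rate has a similar stabilizing effect.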

My doubts are as follows:

  1. Why does the MSE differ every time I start a training loop, even though the code is the same and I didn't change anything? (See the seeding sketch after this list.)

  2. I also noticed that as the epochs go higher, the loss decreases more and more slowly. So does that mean keeping the epochs fewer and the learning rate higher is a better way to train a model?

  3. Since I understand this is a variational model, is that the reason for the fluctuation during training?
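
On doubt 1: unless the random seeds are pinned, each run starts from different initial weights and a different data shuffle order, so the starting MSE will differ even with identical code. A minimal sketch of pinning the seeds, assuming TensorFlow (the seed value is arbitrary):

```python
import os
import random

import numpy as np
import tensorflow as tf

SEED = 42  # arbitrary example value

# Pin every source of randomness so that runs start from the same
# initial weights and shuffle order and are directly comparable.
os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)
```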

@balaji.ambresh @paulinpaloalto

Thank you in advance
DP

Hello @Deepti_Prasad, the number of epochs should be similar to the ungraded lab, but perhaps even fewer would do the job. You could try progressively, based on your intuition.
