Hi,
The instructions for the “train the model” section say we should train the model until the mean loss reaches 320, but the loss usually starts at around 150 for my model. After 100 epochs, the loss reaches 140, yet the structural similarity is still below 0.6.
I’m not sure how to adjust my model so that the structural similarity can pass 0.6. Any hints would be really helpful. Thank you!
Michael
Hello Michael,
You do not need 100 epochs. Try starting with 20 epochs, then increase until you reach the desired output; I got the desired result at 40 epochs, but it can vary based on your model. If you are still having issues, please share your notebook via personal DM and I will have a look.
Keep Learning!!
Cheers
DP
Michael,
Here are some pointers for clearing the assignment.
- For the encoder layers, I hope you have read and followed the instructions: use the Functional API to stack the encoder layers and output mu, sigma, and the shape of the features before flattening. We expect you to use 3 convolutional layers (instead of 2 in the ungraded lab), but feel free to revise as you see fit. Another hint is to use 1024 units in the Dense layer before you get mu and sigma (we used 20 for it in the ungraded lab). A sketch of this is included after these pointers.
Note: If you did Week 4 before Week 3, please do not use LeakyReLU activations for this particular assignment. The grader for Week 3 does not support LeakyReLU yet. This will be updated, but for now you can use relu and sigmoid, just like in the ungraded lab.
- Also, for the decoder, the last layer needs 3 filters to match the VAE model's RGB output (see the decoder sketch below).
- For the training loop cell, there is another hint you need to use in that cell: compute the reconstruction loss using the mse_loss defined above (instead of the bce_loss used in the ungraded lab), then multiply by the flattened dimensions of the image (i.e. 64 x 64 x 3). This is also sketched below.
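To make the encoder pointer concrete, here is a minimal sketch, assuming 64 x 64 x 3 inputs; the layer names, filter counts, and latent size are illustrative, not the graded solution:

```python
import tensorflow as tf

def encoder_layers(inputs, latent_dim):
    # Three convolutional layers (the ungraded lab used two).
    x = tf.keras.layers.Conv2D(32, 3, strides=2, padding='same',
                               activation='relu', name='encode_conv1')(inputs)
    x = tf.keras.layers.Conv2D(64, 3, strides=2, padding='same',
                               activation='relu', name='encode_conv2')(x)
    x = tf.keras.layers.Conv2D(128, 3, strides=2, padding='same',
                               activation='relu', name='encode_conv3')(x)

    # Capture the feature shape before flattening; the decoder needs it
    # to reshape its Dense output back into a feature map.
    conv_shape = x.shape

    x = tf.keras.layers.Flatten(name='encode_flatten')(x)
    # 1024 units here, per the assignment hint (the lab used 20).
    x = tf.keras.layers.Dense(1024, activation='relu', name='encode_dense')(x)

    mu = tf.keras.layers.Dense(latent_dim, name='latent_mu')(x)
    sigma = tf.keras.layers.Dense(latent_dim, name='latent_sigma')(x)
    return mu, sigma, conv_shape

inputs = tf.keras.layers.Input(shape=(64, 64, 3))
mu, sigma, conv_shape = encoder_layers(inputs, latent_dim=512)  # latent_dim is hypothetical
```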
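For the decoder pointer, a matching sketch continuing from the encoder sketch above; the essential parts are the 3 filters and sigmoid in the final layer, the rest is illustrative:

```python
def decoder_layers(inputs, conv_shape):
    # Expand back to the pre-flatten feature volume saved by the encoder.
    units = conv_shape[1] * conv_shape[2] * conv_shape[3]
    x = tf.keras.layers.Dense(units, activation='relu', name='decode_dense1')(inputs)
    x = tf.keras.layers.Reshape((conv_shape[1], conv_shape[2], conv_shape[3]),
                                name='decode_reshape')(x)

    x = tf.keras.layers.Conv2DTranspose(128, 3, strides=2, padding='same',
                                        activation='relu', name='decode_conv2d_1')(x)
    x = tf.keras.layers.Conv2DTranspose(64, 3, strides=2, padding='same',
                                        activation='relu', name='decode_conv2d_2')(x)
    x = tf.keras.layers.Conv2DTranspose(32, 3, strides=2, padding='same',
                                        activation='relu', name='decode_conv2d_3')(x)

    # Last layer: 3 filters so the output has the image's RGB channels,
    # with sigmoid to keep pixel values in [0, 1].
    outputs = tf.keras.layers.Conv2DTranspose(3, 3, strides=1, padding='same',
                                              activation='sigmoid',
                                              name='decode_final')(x)
    return outputs
```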
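And for the training-loop pointer, a small self-contained sketch of the reconstruction loss, with dummy tensors standing in for the real batch and model output:

```python
import tensorflow as tf

mse_loss = tf.keras.losses.MeanSquaredError()

# Dummy batch and reconstruction, only to show the computation;
# in the assignment these come from the dataset and the model.
x_batch = tf.random.uniform((8, 64, 64, 3))
reconstructed = tf.random.uniform((8, 64, 64, 3))

# Use MSE (not the BCE from the ungraded lab) and scale by the
# flattened image dimensions, 64 x 64 x 3.
flattened_dims = 64 * 64 * 3
reconstruction_loss = mse_loss(x_batch, reconstructed) * flattened_dims

# The KL divergence term (collected via the model's losses in the
# lab's loop) is then added to this before applying gradients.
```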
Assuming you have followed all of the above and started training from 30 epochs (adjusting for your model), you will get the result if it is coded properly.
If you are still having issues, let me know.
Regards
DP
Hello Michael,
There are mistakes in your notebook. Please rectify them by going through the matching ungraded lab for this course:
- You have not named each encoder layer separately in the encoder layers cell. Instructor Laurence explained this part very clearly.
- You mentioned the decoder shape where you feed the dense network, but you need to name that layer decoder_dense1. After that, in the reshape layer you used decoder_reshape, but you also need to name each layer you pass through the network, e.g. decode_conv2d_1, and likewise name each following decoder convolution layer (refer to the ungraded lab again for this correction; you will find the solution there). A quick way to check your layer names is sketched after this list.
- Your training loop, where you have now set the epochs to 40, has a code error in the reconstruction loss computation; there is a typo. Refer to the ungraded lab and you will find the solution.
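If it helps, here is a tiny sketch of how to list every layer's name after building a model, so misnamed layers are easy to spot (the stand-in model here is illustrative; substitute your own encoder or decoder):

```python
import tensorflow as tf

# Tiny stand-in model; replace with the model from your notebook.
inputs = tf.keras.layers.Input(shape=(64, 64, 3))
x = tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu',
                           name='encode_conv1')(inputs)
model = tf.keras.Model(inputs, x)

# Print each layer's name to spot unnamed or misnamed layers quickly.
for layer in model.layers:
    print(layer.name)
```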
Make these changes and run the model again, and let me know if the issue is resolved.
P.S. The ungraded labs are very helpful for passing these assignments, as are the hints before the graded cells. Refer to this ungraded lab for the corrections:
C4_W3_Lab_1_VAE_MNIST.ipynb (23.7 KB)
Regards
DP
Hi Deepti,
Thank you for the reply.
I added layer names for both the encoder and decoder layers, but I couldn't spot the typo in the loss. The only difference I could find is that mse_loss is used in my assignment (as the instruction hint says), while the lab uses bce_loss.
I retrained the model anyway; the loss still starts at around 140, and the structural similarity is still below 0.6.
This assignment has really given me a headache. I have been struggling for a long time and referred to the lab before posting a question here, yet I still failed to pass the required score, which has never happened before. Please help. Thank you!
Michael
Send me your updated notebook via DM; please don't post the notebook here.
Hello Michael,
Mistakes in your updated notebook:
- Why did you add batch normalisation after the last convolution layer in the decoder layers exercise cell? Do you know why batch normalisation is done, and when it is done?
- You corrected the typo in the training loop, but you changed the units to the wrong value (the correct units are 64 * 64 * 3). A typo means a word error, not a units error. In the previous notebook you had the correct units, which you have now changed to 12288.
Please check your inbox.
Once you have cleared the assignment, just let me know.
Regards
DP
Michael
As you know (or one should know), batch normalisation is added after a convolution layer to normalise that layer's outputs by re-centering and re-scaling, which makes training faster. It works through a normalisation step that fixes the means and variances of each layer's inputs over a mini-batch.
So if one adds batch normalisation after the last layer of the decoder, it hampers the predictions, and that is why your model is not training properly. A short sketch of the correct placement follows.
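To make the placement concrete, here is a minimal illustrative sketch (filter sizes and shapes are hypothetical): batch normalisation belongs after intermediate convolution layers, never after the final output layer:

```python
import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(512,))
x = tf.keras.layers.Dense(8 * 8 * 128, activation='relu')(inputs)
x = tf.keras.layers.Reshape((8, 8, 128))(x)

x = tf.keras.layers.Conv2DTranspose(64, 3, strides=2, padding='same',
                                    activation='relu')(x)
x = tf.keras.layers.BatchNormalization()(x)  # fine: normalises an intermediate layer

# Final layer: leave the sigmoid output alone. Adding BatchNormalization
# after this would re-centre the pixel values out of [0, 1] and degrade
# the reconstructions, which is what was happening in your notebook.
outputs = tf.keras.layers.Conv2DTranspose(3, 3, strides=2, padding='same',
                                          activation='sigmoid')(x)
decoder = tf.keras.Model(inputs, outputs)
```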
Happy and Keep Learning!!!
Regards
DP
Michael
Please let me know once you have passed the assignment