While running the training loop, I am getting the error shown in the image below. A while ago I was unable to post anything, so I searched on Google, and what I found is that the error occurs because we need to use TensorFlow 2.5, but then the training becomes too slow. I am totally confused now. Please help! @paulinpaloalto
I am not familiar with any of the TF related specialization courses, so I don’t know the assignment you are doing here. But just looking at the code, note that the inputs to the loss function would typically be the outputs and the labels, right? It looks like you are passing the inputs and the outputs, so one would expect those to be of different shapes, right? The outputs will be (typically) 1 x m.
Also note that you are already computing the MSE loss (Mean Squared Error), so it’s the average of the loss across all the samples. What is the point of multiplying that by 64 * 64 * 3? That sounds like it might be the number of elements in one of the input images. Of course this is all with the previous disclaimer: I don’t know anything about the details of what you’re actually trying to accomplish here.
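For what it’s worth, one common rationale (not confirmed by the course materials, just a general observation about VAE losses) is that multiplying the per-element mean by the pixel count turns it into a per-image *sum* of squared errors, which keeps it on a comparable scale to the KLD term. A tiny numpy sketch of the arithmetic, using made-up 4-pixel "images" as stand-ins for 64 * 64 * 3:

```python
import numpy as np

# Hypothetical toy batch: 2 "images" of 4 pixels each (stand-ins for 64*64*3)
targets = np.array([[0.0, 0.5, 1.0, 0.25],
                    [0.1, 0.9, 0.4, 0.60]])
outputs = np.array([[0.1, 0.4, 0.8, 0.25],
                    [0.0, 1.0, 0.5, 0.50]])

n_pixels = targets.shape[1]  # 4 here; 64 * 64 * 3 in the assignment

mse = np.mean((targets - outputs) ** 2)                       # mean over ALL elements
sse_per_image = np.sum((targets - outputs) ** 2) / len(targets)  # sum per image, averaged over batch

# Scaling the per-element mean by the pixel count recovers the
# per-image sum of squared errors.
assert np.isclose(mse * n_pixels, sse_per_image)
```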
Or maybe it’s better to simplify the point by going back to first principles here. If you get a shape mismatch, then the first question is always “Ok, what are the shapes?”
What does that show? In the context of this problem, would you expect them to be the same?
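For example, you could print the shapes right before the loss call. The names here (`x_batch`, `reconstructed`) are placeholders for whatever your loop actually passes to the loss function, with numpy zeros standing in for the real tensors:

```python
import numpy as np

# Stand-ins for the real tensors; in the notebook these would be the
# batch you feed the model and what the model returns.
x_batch = np.zeros((64, 64, 64, 3))        # hypothetical input batch
reconstructed = np.zeros((64, 64, 64, 3))  # hypothetical model output

print("input shape: ", x_batch.shape)
print("output shape:", reconstructed.shape)

# If these differ, the loss computation will complain about a shape mismatch.
assert x_batch.shape == reconstructed.shape
```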
This assignment is basically creating an autoencoder model, where the encoder output is merged with the sampling layer, which gives the latent representation. This latent representation is then fed to the decoder network.
Yes, you are right, the shapes differ. This is what it shows now:
Actually Paul, in the ungraded lab, bce_loss was used in place of mse_loss and multiplied by 784.
But for the assignment section, the hint says to use mse_loss instead of bce_loss and to multiply by (64 x 64 x 3).
So basically my issue is with the inputs and outputs, right?
Thank you for replying.
I’m not sure still why you’re multiplying by the size of what appears to be the hard-coded size of a specific 3D matrix. What would be the general theory behind that?
instructions given before training loop to follow
You can now start the training loop. You are asked to select the number of epochs and to complete the subsection on updating the weights. The general steps are:
- feed a training batch to the VAE model
- compute the reconstruction loss (hint: use the mse_loss defined above instead of the bce_loss in the ungraded lab, then multiply by the flattened dimensions of the image, i.e. 64 x 64 x 3)
- add the KLD regularization loss to the total loss (you can access the losses property of the model)
- get the gradients
- use the optimizer to update the weights
When training your VAE, you might notice that there’s not a lot of variation in the faces. But don’t let that deter you! We’ll test based on how well it does in reconstructing the original faces, and not how well it does in creating new faces.
The training will also take a long time (more than 30 minutes) and that is to be expected. If you used the mean loss metric suggested above, train the model until that is down to around 320 before submitting.
Thanks for the background info. I guess I need to get educated on how auto encoders work: that is new to me. Why it makes sense to multiply the loss by a fixed value is a mystery, but that’s what it tells you to do.
But as to the shapes, they are different. Apparently the flattened input is 8 million and change and the flattened output is 24 million and change. So does that make sense? You are evidently expecting them to be the same size. So something must be wrong, but I don’t know enough about the problem to help any more directly than that. You need to go back and review the earlier steps where you create or format the input values and output values. Where do those come from? If they’re supposed to be the same size, then you need to figure out why they aren’t.
Sorry, but this is how debugging works: you start with the evidence you see and then you have to work your way backwards one step at a time.
No problem Paul, thanks for your inputs. Yes I am doing the same, going back and checking where I might be going wrong.
OK, now my model is training. The only change I made was changing the filters from 1 to 3 in the last layer of the decoder output (I had first used filters=1 because that was used in the ungraded lab). But then I noticed that vae_model and get_model mention 64 * 64 * 3, which is why I changed the filters to 3, and it worked. The reconstructed image quality is still not so good, though, so can anyone suggest a good number of epochs for a better reconstructed image? (P.S. note that in the ungraded lab for the same assignment, 100 epochs were used.) The MSE should be around 320.
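For anyone hitting the same shape mismatch: the fix described above amounts to giving the decoder's final transposed-convolution layer 3 output filters (one per RGB channel) instead of the 1 used for grayscale in the ungraded lab. A minimal sketch, where the input shape is illustrative and not the actual assignment architecture:

```python
import tensorflow as tf

# Illustrative tail end of a decoder. For RGB 64x64x3 outputs the final
# Conv2DTranspose needs filters=3 (one per color channel), not filters=1
# as in the grayscale ungraded lab.
decoder_tail = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 16)),   # hypothetical upstream feature map
    tf.keras.layers.Conv2DTranspose(filters=3, kernel_size=3, strides=2,
                                    padding='same', activation='sigmoid'),
])

print(decoder_tail.output_shape)  # (None, 64, 64, 3)
```

With filters=1 the output would be (None, 64, 64, 1), which is why the flattened output size disagreed with the flattened input size by a factor of 3.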
Hoping to get better reconstructed images.