In the Week 4 video titled “Disentanglement”, it is stated that the class information is embedded into the noise vector z, and so the one-hot encoding can be discarded. However, this is not explained very well.
I presume that this is after the training step.
We modify z based on the pre-trained classifier. Is this why the class information is embedded into the noise vector?
It would be nice if the mentors could clarify this a bit further.
A great question, @manikb. I would also love to know the answer to this. Additionally, if possible, could you also explain how we embed the class information into the noise vector z? At this point it seems extremely confusing in relation to Conditional GANs.
@manikb and @elemento, here’s a stab at a little more explanation:
Imagine that you have a regular random noise vector, and you want to add to that a vector with a few extra values, each controlling one specific feature. This is similar to the concept of adding a one-hot vector for controlling the class (e.g. Husky). The reason you’d want to add these values is so that you could easily adjust those features in your generated images.
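To make that concrete, here is a minimal sketch (in PyTorch) of what “adding a few extra values” to the noise vector looks like. The sizes and the feature names are hypothetical, just for illustration:

```python
import torch

# A batch of regular random noise vectors (64 dimensions, chosen arbitrarily)
z = torch.randn(16, 64)

# Three hypothetical disentangled control values, e.g. fur color, ear shape, age
controls = torch.tensor([[1.0, 0.0, 0.5]]).repeat(16, 1)

# The generator's input is simply the two concatenated together, so each
# control value occupies its own fixed slot in the input vector
gen_input = torch.cat([z, controls], dim=1)  # shape: (16, 67)
```

Adjusting one of those slots while holding z fixed is how you would hope to change just that one feature in the generated image.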
Exactly how to best encourage your model to use these disentangled input values is still an active area of research, and there are a variety of techniques. The video explains two general approaches:
Using a process similar to conditional generation, where the discriminator is given labeled data for these values (see the sketch after this list).
Adding a regularization term to the loss calculation. There is a range of techniques for calculating this regularization term.
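For the first approach, here is a rough sketch of a discriminator that takes the control values alongside the image, just as in conditional generation. The `Discriminator` class and its sizes are hypothetical, not the course’s exact architecture:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, im_dim=784, n_controls=3, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(im_dim + n_controls, hidden_dim),
            nn.LeakyReLU(0.2),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, image, controls):
        # The image and its labeled control values are judged as a pair,
        # so a realistic image with the wrong feature values still looks "fake"
        x = torch.cat([image.flatten(start_dim=1), controls], dim=1)
        return self.net(x)
```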
In general, since we want the model to generate appropriate images based on these disentangled inputs, accounting for them is part of the model’s training - similar to conditional generation. However, there are some nuances. For example, one technique for determining the regularization term uses a classifier-gradient approach with a pre-trained classifier - so a pre-trained classifier is used during the training of the main model.
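As an illustration of that last point, here is a hedged sketch of a regularization term built on a frozen pre-trained classifier. Everything here (`generator`, `pretrained_clf`, the choice of MSE loss) is a hypothetical stand-in for whatever a specific paper actually uses:

```python
import torch
import torch.nn.functional as F

def disentanglement_penalty(generator, pretrained_clf, z, controls):
    # Freeze the classifier's weights; gradients still flow *through* it
    # back into the generator's parameters
    for p in pretrained_clf.parameters():
        p.requires_grad_(False)
    fake = generator(torch.cat([z, controls], dim=1))
    logits = pretrained_clf(fake)  # one logit per controlled feature
    # Penalize disagreement between the control values the generator was
    # given and the features the classifier actually detects in the image
    return F.mse_loss(torch.sigmoid(logits), controls)
```

A penalty like this would be added to the generator’s usual adversarial loss, which is the sense in which the pre-trained classifier participates in training the main model.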