Controllable generation lab

In the last lab of W4, the final code to fine-tune the generator creates a fixed noise vector and then iteratively updates it, computing a score at each step.

As you can see from this code:

target_indices = feature_names.index("WearingNecktie") 
# bunch of other code
original_classifications = classifier(gen(noise)).detach()

Then there is a for loop that iteratively updates the noise vector by comparing the classifications of the newly generated images with the original classifications.

for i in range(grad_steps):
    opt.zero_grad()
    fake = gen(noise)
    fake_image_history += [fake]
    current_classifications = classifier(fake)
    fake_score = get_score(
        current_classifications,
        original_classifications,
        target_indices,
        other_indices,
        penalty_weight,
    )
    print(f"updated target score {current_classifications[0, target_indices]}")

    fake_score.backward()
    noise.data = calculate_updated_noise(noise, 1 / grad_steps)

I wonder: since the original classification of the images has nothing to do with the images having neckties, what is this code trying to do? Why do we always compare the classifications of the newly generated images with the original classifications?

Hi Richeek!
Hope you are doing well.
Sorry for the late response.

You may take a look at the get_score function:

The idea is to compare the classifications of the newly generated images with the original classifications to determine how much the target feature has been achieved.

The original classifications are obtained by passing the initial noise vector through the generator and then feeding the generated images to the classifier. Since the generator is not yet fine-tuned, the generated images may not have the desired target feature. By comparing the classifications of the generated images with the original classifications, the code can evaluate the progress made towards achieving the target feature.

During each iteration of the “for loop”, the generator generates new images based on the current noise vector. These generated images are then classified by the classifier, and the current classifications are compared with the original classifications. The get_score function is used to calculate a score that combines the target score (mean classification for the target feature) and a penalty based on the differences in classifications for other features (using L2 regularization).
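In case it helps, here is a minimal sketch of what such a get_score could look like. This is an illustration based on the description above, not necessarily the lab's exact implementation; other_indices is assumed to list the non-target features, and penalty_weight scales the L2 penalty:

```python
import torch

def get_score(current_classifications, original_classifications,
              target_indices, other_indices, penalty_weight):
    # Penalty: L2 distance between current and original classifications
    # on the non-target features, averaged over the batch.
    other_distances = (current_classifications[:, other_indices]
                       - original_classifications[:, other_indices])
    other_class_penalty = -torch.norm(other_distances, dim=1).mean() * penalty_weight
    # Reward: mean classification score for the target feature.
    target_score = current_classifications[:, target_indices].mean()
    return target_score + other_class_penalty
```

So the score goes up when the target feature strengthens, and down when any other feature drifts away from its original classification.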

The score is calculated to guide the optimization process. By maximizing the score, the generator is encouraged to generate images that have a high classification for the target feature while minimizing changes to other features (penalized by the L2 regularization). The goal is to find a noise vector that generates images with the desired target feature while preserving other features as much as possible.
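The noise update itself is just one gradient-ascent step on that score. Assuming calculate_updated_noise does a plain ascent step (the lab's version may differ), a sketch would be:

```python
import torch

def calculate_updated_noise(noise, weight):
    # Gradient *ascent*: step in the direction that increases the score.
    # noise.grad is filled in by fake_score.backward() before this is called.
    return noise + weight * noise.grad
```

In the loop above, 1 / grad_steps plays the role of the step size.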

So, the code compares the classification of newly generated images with the original classification as a way to evaluate and guide the fine-tuning process of the generator towards achieving the target feature.

Hope you get the point, if not feel free to post your queries.


Thanks for your reply, Nithin. So, in the ideal case, am I right to say that the maximum value of the score is 1 (assuming the trained classifier gives probabilities for each class and classifies the target class perfectly)?

I printed the target feature score and it increases over successive iterations of the for loop. If, say, the original fake image was already a close representation of the target feature, would the score reduce in successive iterations?

Also, did we start with a fake image just to initialise the process, or could we have started with any numbers?

Yes, in an ideal case where current_classifications[:, target_indices].mean() == 1 (a matrix of 1s) and other_class_penalty == 0, the total score will be 1. This means that all the generated images are perfectly classified as having the target feature.
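A quick numeric check of that ideal case (the batch size of 4, feature width of 10, and target index 3 here are made up for illustration):

```python
import torch

current = torch.zeros(4, 10)
current[:, 3] = 1.0            # every image scores 1.0 on the target feature
original = torch.zeros(4, 10)  # no other feature has changed

target_score = current[:, 3].mean()
other = [i for i in range(10) if i != 3]
penalty = torch.norm(current[:, other] - original[:, other], dim=1).mean()
total = target_score - penalty
print(total.item())  # 1.0: the maximum possible score in this ideal case
```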

If the original fake image was already a close representation of the target feature, it is possible that the score would reduce in successive iterations. This can happen if the fine-tuning process causes some changes in the generated images that affect other features, leading to a decrease in the score. The L2 regularization penalty in the score calculation aims to minimize changes in other features, so if there are significant changes in those features, it can contribute to a decrease in the score.
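To see the penalty at work, here is a hypothetical example (again with made-up dimensions) where the target feature is fully achieved but one other feature has drifted, pulling the total score below 1:

```python
import torch

current = torch.zeros(4, 10)
current[:, 3] = 1.0    # target feature fully present
current[:, 5] = 0.5    # hypothetical unintended drift in another feature
original = torch.zeros(4, 10)

other = [i for i in range(10) if i != 3]
penalty = torch.norm(current[:, other] - original[:, other], dim=1).mean()
total = current[:, 3].mean() - penalty
print(total.item())  # 0.5: the drift in feature 5 is penalized
```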

Here, too, we are starting with random numbers only (the noise just needs the shape the generator expects as its input). You can always start with random numbers of that shape. But as you mentioned, if we can start from a point that is already closer to the target than random noise, convergence might be faster.


Appreciate your response!
