What is actually going on inside the create_pairs function

nahidalam · August 16, 2021, 10:03pm

I am having a hard time understanding the details of how the create_pairs function does its job in the C1_W1_Lab_3_siamese-network notebook.

    for d in range(10):
        for i in range(n):
            z1, z2 = digit_indices[d][i], digit_indices[d][i + 1]
            pairs += [[x[z1], x[z2]]]
            inc = random.randrange(1, 10)
            dn = (d + inc) % 10
            z1, z2 = digit_indices[d][i], digit_indices[dn][i]
            pairs += [[x[z1], x[z2]]]
            labels += [1, 0]

For example, in the code above, for each i we create two pairs but only one label. Why is that?

Wendy · August 17, 2021, 12:33am

@nahidalam, this code can be a bit tricky to get your head around at first, but there are actually two pairs and two labels in the code snippet you shared. The line labels += [1, 0] appends two labels at once, one for each pair. The first pair gets a label of 1 to indicate that the two images in the pair have the same classification (e.g. both are shirts), and the second pair gets a label of 0 to indicate that the images have different classifications.
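
If it helps, here is a minimal sketch of a single loop iteration on its own. The string values are made-up placeholders standing in for images, not data from the notebook. It only shows that += with a two-element list extends labels by two entries, so pairs and labels stay the same length.

```python
pairs = []
labels = []

# One iteration of the inner loop, with placeholder strings instead of images.
pairs += [["shirt_a", "shirt_b"]]  # first pair: same classification
pairs += [["shirt_a", "purse_a"]]  # second pair: different classifications

# list += list extends the list in place, so this adds two labels:
# 1 for the first pair and 0 for the second.
labels += [1, 0]

print(len(pairs), len(labels))
print(list(zip(pairs, labels)))
```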

The code relies on the data passed to it being organized in a particular way. If you look at the code for create_pairs_on_set, you can see that it sets up digit_indices to group the image indices by classification (e.g. shirt, purse, etc., for the 10 possible classifications). Then create_pairs() makes a first pair by taking two images with the same classification, and a second pair by taking two images with different classifications.
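
Here is a rough, self-contained sketch of that flow on toy data, in case it is easier to follow by running it. It is not the notebook's code: build_digit_indices and create_pairs_sketch are hypothetical names, plain Python lists and strings stand in for the image arrays, and n is assumed to be one less than the smallest class size so that index i + 1 stays in range.

```python
import random

NUM_CLASSES = 10

def build_digit_indices(labels):
    # Hypothetical helper: digit_indices[c] holds the positions of every
    # sample whose label is c, so indices are grouped by classification.
    return [[i for i, lab in enumerate(labels) if lab == c]
            for c in range(NUM_CLASSES)]

def create_pairs_sketch(x, digit_indices):
    pairs, labels = [], []
    # Assumption: stop one short of the smallest class so i + 1 is valid.
    n = min(len(idx) for idx in digit_indices) - 1
    for d in range(NUM_CLASSES):
        for i in range(n):
            # Positive pair: two neighbouring samples from the same class d.
            z1, z2 = digit_indices[d][i], digit_indices[d][i + 1]
            pairs.append([x[z1], x[z2]])
            # Negative pair: inc is between 1 and 9, so dn is never equal to d.
            inc = random.randrange(1, NUM_CLASSES)
            dn = (d + inc) % NUM_CLASSES
            z1, z2 = digit_indices[d][i], digit_indices[dn][i]
            pairs.append([x[z1], x[z2]])
            # Two labels, in the same order as the two pairs just added.
            labels += [1, 0]
    return pairs, labels

# Toy data: 3 samples per class, each "image" is a string naming its class.
toy_labels = [c for c in range(NUM_CLASSES) for _ in range(3)]
toy_images = [f"img{i}_class{lab}" for i, lab in enumerate(toy_labels)]

pairs, y = create_pairs_sketch(toy_images, build_digit_indices(toy_labels))
print(len(pairs), len(y))
for pair, label in list(zip(pairs, y))[:4]:
    print(label, pair)
```

Because inc is drawn from 1 to 9, (d + inc) % 10 can never land back on d, which is why the second pair always mixes two different classifications.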