Siamese Networks Triplets Inference

In Siamese networks trained with triplet loss, three images (anchor, positive, and negative) are input during training to obtain an embedding for each image. These embeddings are then used to compute the triplet loss, which encourages the model to place the anchor closer to the positive than to the negative in the embedding space.
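To make the loss concrete, here is a minimal sketch of how it could be computed from three embeddings. It assumes squared Euclidean distance and a margin of 0.2; neither choice is stated in this thread, and the toy vectors are made up for illustration.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: max(d(A, P) - d(A, N) + margin, 0).

    The margin value and the squared distance are assumptions of this sketch.
    """
    pos_dist = squared_distance(anchor, positive)
    neg_dist = squared_distance(anchor, negative)
    return max(pos_dist - neg_dist + margin, 0.0)


# Toy 2-D embeddings (made-up values).
anchor = [0.0, 1.0]
positive = [0.1, 0.9]
negative = [1.0, 0.0]

# The negative is already much farther from the anchor than the positive,
# so this triplet contributes no loss.
easy = triplet_loss(anchor, positive, negative)

# Swapping the roles puts the "positive" far away, which is penalized.
hard = triplet_loss(anchor, negative, positive)
```

The loss only looks at the three embeddings, not at the images themselves, which is why nothing about it needs to survive into inference.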

However, during inference, the model is used to determine the similarity of a test image to a set of images in a database. In this case, only one image (the test image) is input to the model at a time to get its embedding. There are no anchor, positive, or negative images during inference.

How does the inference process work with a training architecture designed for three inputs and three embeddings, and how does it handle the comparison of the test image embedding against a database of embeddings?

Hello, @Omar_Aziz,

Though the architecture takes three inputs during training, each of those inputs goes through the same network. In other words, at inference time, we just pass the test image through that single network to get its face embedding.

As for the database, we can compute every person’s face embedding ahead of time and store it in a vector database. We then compute the similarity between the test image’s embedding and the stored embeddings.
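A rough sketch of that lookup, assuming the stored embeddings live in a plain dictionary and that similarity means Euclidean distance with a cutoff. The names `identify`, `database`, and the threshold of 0.7 are hypothetical; a real vector database would replace the loop.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def identify(test_embedding, database, threshold=0.7):
    """Return (name, distance) for the closest stored embedding.

    If even the closest entry is farther than `threshold`, the name is None.
    The threshold value is an assumption of this sketch.
    """
    best_name, best_dist = None, float("inf")
    for name, stored in database.items():
        dist = euclidean_distance(test_embedding, stored)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist > threshold:
        return None, best_dist
    return best_name, best_dist


# Hypothetical database of precomputed embeddings (toy 2-D values).
database = {"alice": [0.0, 1.0], "bob": [1.0, 0.0]}

match = identify([0.1, 0.9], database)      # close to alice's embedding
unknown = identify([-1.0, -1.0], database)  # far from everyone
```

Only the test image needs a forward pass at query time; the database entries were embedded once, by the same network, when each person was enrolled.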

Cheers,
Raymond


Or to state Raymond’s points with different wording, here’s how I’d say it:

The point is that we are training an inference function that takes a face image as input and produces an embedding vector characterizing that face. The “triplet loss function” is just the loss function we use to get the inference function to have the correct behavior. It expresses what we are trying to achieve with the training: to create an inference function that is effective at recognizing the distinguishing characteristics of different faces. So during training, the inference function gets executed three times for every triplet: once each for the anchor, the positive, and the negative. But once we have used the triplet loss function to train the inference function, we can just use it on its own in pure inference mode.
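Here is a toy illustration of that idea. The `embed` function below is a hypothetical stand-in for the real network (just a linear map plus L2 normalization), and the margin of 0.2 is an assumption; what matters is that training and inference call the same function with the same weights.

```python
import math


def embed(image, weights):
    """Toy stand-in for the network: linear map, then L2 normalization."""
    raw = [sum(w * x for w, x in zip(row, image)) for row in weights]
    norm = math.sqrt(sum(r * r for r in raw)) or 1.0  # avoid dividing by zero
    return [r / norm for r in raw]


def training_forward(anchor_img, positive_img, negative_img, weights, margin=0.2):
    """One triplet during training: the same function is called three times."""
    a = embed(anchor_img, weights)
    p = embed(positive_img, weights)
    n = embed(negative_img, weights)
    d_ap = sum((x - y) ** 2 for x, y in zip(a, p))
    d_an = sum((x - y) ** 2 for x, y in zip(a, n))
    return max(d_ap - d_an + margin, 0.0)


def inference_forward(test_img, weights):
    """Inference: the same function, called once, with no loss involved."""
    return embed(test_img, weights)


# Made-up weights (2 x 3) and made-up 3-value "images".
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
image = [3.0, 4.0, 5.0]
other = [4.0, -3.0, 0.0]

loss = training_forward(image, image, other, weights)
embedding = inference_forward(image, weights)
```

The loss function appears only in `training_forward`. Once the weights are trained, `inference_forward` is all that is needed.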

Once we have the trained inference function, we can then use it in different ways, as Prof. Ng discusses in the lectures and as we will see in this week’s assignment.
