Question about OneShot learning and Triplet Loss

I think the point is that once you have a trained system, you can use it to recognize or compare faces it has never seen before. Of course, if you are seeing a person for the first time, you don't have a picture of them in your database, but you can still imagine ways your system could be useful. For example, suppose a person shows up at an airport security kiosk and presents their passport. You could use the system to verify that the person in front of you matches the passport picture even though you don't have that person's picture in your database, right? You take a realtime picture of the person, scan the passport photo, and feed both of them through your trained network. That generates an "embedding" vector for each image. Now you can compute the L2 norm of the difference between those two vectors (their Euclidean distance) and apply a threshold below which you say "yes, this person is the real owner of this passport".
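The verification step above can be sketched in a few lines. This is just an illustration, not a real face recognition pipeline: the `verify` function and the 0.7 threshold are made up for this example, and random vectors stand in for the network's actual embedding outputs.

```python
import numpy as np

def verify(live_embedding, passport_embedding, threshold=0.7):
    """Hypothetical helper: declare 'same person' if the Euclidean
    distance between the two embedding vectors is below the threshold."""
    distance = np.linalg.norm(live_embedding - passport_embedding)
    return distance, distance < threshold

# Toy 128-dimensional embeddings standing in for the trained network's outputs
rng = np.random.default_rng(0)
passport = rng.normal(size=128)
live_same = passport + rng.normal(scale=0.01, size=128)  # near-duplicate: small distance
live_other = rng.normal(size=128)                        # different person: large distance

d_match, ok_match = verify(live_same, passport)
d_other, ok_other = verify(live_other, passport)
```

In a real system the threshold would be tuned on validation data to trade off false accepts against false rejects.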

But if you have deeper questions about this, there is a recent ongoing discussion of this general set of issues that’s worth a look.
