Doubt about dataset for building a face recognition system

In week 4 of the CNN course, Andrew's video on one-shot learning explained how you can take a single picture of a person and use a similarity function to tell the real person from an impostor. But later, in the triplet loss video, the training dataset he mentioned had about 10K pictures of 1K persons (which means 10 pictures of each person).

My doubt is: can't we train a CNN on a single image per employee? Say I have 10 employees and 1 picture of each. Can I apply a CNN with one-shot learning to this small dataset? Asking an employee for multiple pictures would not be a good idea.


You could certainly work with 1 picture per employee. However, working with more than one (for instance 3, 5, or 10) will be better.

Why? Because you will have 3, 5, or 10 vectors for that person. If the person grows their hair or a beard, or is facing up, down, or slightly to the side, or there is any other variation, then with only 1 vector to compare against you may well get a false negative. With more than 1 vector, chances are better that at least one of them will produce a true match.
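To illustrate the point, here is a minimal sketch in plain Python. Everything in it is an assumption made up for the example: the 3-number vectors stand in for real face embeddings (which a trained network would produce and which are much longer), and `best_match`, the `gallery` layout, and the threshold of 0.5 are hypothetical names and values, not something from the course.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(query, gallery, threshold):
    """Compare the query against every stored vector of every person.

    Returns (name, distance) for the closest stored vector if it is
    within the threshold, otherwise (None, distance).
    """
    best_name, best_dist = None, float("inf")
    for name, refs in gallery.items():
        for ref in refs:
            d = euclidean(query, ref)
            if d < best_dist:
                best_name, best_dist = name, d
    if best_dist <= threshold:
        return best_name, best_dist
    return None, best_dist

# Made-up toy vectors; a real system would get these from the network.
gallery_one = {"alice": [[0.0, 0.0, 1.0]]}                    # 1 picture
gallery_many = {"alice": [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8]]}  # 2 pictures
query = [0.7, 0.0, 0.7]  # a new photo, e.g. head turned to the side

print(best_match(query, gallery_one, threshold=0.5))   # no match: false negative
print(best_match(query, gallery_many, threshold=0.5))  # matches "alice"
```

With a single stored vector the new photo is too far away and gets rejected. With a second stored vector that happens to be closer to the new pose, the same photo is accepted.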

It all stems from the method used to match faces: comparing embeddings, which are basically vectors of numbers. When two embeddings are close (the distance between them is small or zero), we can conclude that they belong to the same person.
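Written as a formula, the comparison looks roughly like this. The symbols are my own notation for illustration: $f(x)$ is the embedding the network produces for image $x$, and $\tau$ is a distance threshold you choose yourself.

$$
d(x_1, x_2) = \lVert f(x_1) - f(x_2) \rVert_2^2
$$

$$
\text{same person if } d(x_1, x_2) \le \tau, \qquad \text{different person if } d(x_1, x_2) > \tau
$$

A smaller $\tau$ makes the system stricter (fewer false matches, more false negatives), and a larger $\tau$ does the opposite.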

Does it make sense?




Thanks. Another doubt: is one-shot learning used to test the model trained with the Siamese network?

Siamese networks are indeed one of the ways to implement one-shot learning.

Thank you very much. :black_heart: