I am a bit confused about how to apply the Siamese network. Prof. Ng says in one of the videos that several models are available that have already been trained on millions of people. If I would like to use one of them at my own company, do I just take that model, input a picture (only one?) of each employee, and get an “encoding vector” (the 128-dimensional vector in the videos) for each employee? So 1000 employees give 1000 “encoding vectors”? And a comparison has to be made against all of these 1000 vectors each time an employee wants to pass the recognition system? Is that correct?
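To make the question concrete, here is a minimal sketch of the lookup I have in mind. It is not taken from the course: the random unit vectors only stand in for the encodings a pretrained model would produce, the function name `identify` is hypothetical, and the threshold of 0.7 is an arbitrary placeholder that would need tuning on real data.

```python
import numpy as np


def identify(query, database, threshold=0.7):
    """Compare one query encoding against every stored encoding.

    Returns (index, distance) of the closest stored encoding, or
    (None, distance) if even the closest one is farther than `threshold`.
    The threshold value is an assumption, not a recommended setting.
    """
    # Euclidean distance from the query to each row of the database.
    distances = np.linalg.norm(database - query, axis=1)
    best = int(np.argmin(distances))
    best_distance = float(distances[best])
    if best_distance > threshold:
        return None, best_distance
    return best, best_distance


# Stand-in data: 1000 "employees", one 128-dimensional encoding each.
# In practice each row would come from running the pretrained model
# on a photo of the employee.
rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 128))
database /= np.linalg.norm(database, axis=1, keepdims=True)

# A new photo of employee 42: the stored encoding plus a little noise.
query = database[42] + 0.01 * rng.normal(size=128)
match, distance = identify(query, database)

# A face that is not in the database: an unrelated random unit vector.
stranger = rng.normal(size=128)
stranger /= np.linalg.norm(stranger)
stranger_match, stranger_distance = identify(stranger, database)

print(match, distance)
print(stranger_match, stranger_distance)
```

With 1000 employees this is a single vectorized pass over a 1000 x 128 array, so comparing against every stored vector on each attempt looks cheap at this scale.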
Also, when the original model is trained on more than 1e6 pictures, surely some sort of one-hot y-vector must be used to hold the ground truth about which persons are actually fed to the network? I got a bit confused when Prof. Ng just drew a line over the softmax.