Hello community!
Can we use any pretrained base deep learning model like ResNet, GoogleNet, VGG etc. to build a Siamese network?
Since these models are pretrained on other datasets and with other loss functions (not the triplet loss), is it a good choice to pick such a pretrained model as the backbone?
I am building a Siamese network in PyTorch using Inception_v3. It was trained on a dataset with 1000 classes, so it outputs a 1000-dimensional vector. I compute the distance between the encodings of the anchor image and the test image and check whether they are the same person. But it turns out that it's not working well. How do we decide the minimum distance threshold?
Here is my code:
import numpy as np
import torch
from PIL import Image
from torchvision import transforms

model = torch.hub.load('pytorch/vision:v0.10.0', 'inception_v3', pretrained=True)
model.eval()

def image_to_encoding(image_path, model):
    input_image = Image.open(image_path)
    preprocess = transforms.Compose([
        transforms.Resize(299),
        transforms.CenterCrop(299),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    input_tensor = preprocess(input_image)
    input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model
    # move the input and model to GPU for speed if available
    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')
        model.to('cuda')
    with torch.no_grad():
        output = model(input_batch)
    return output.cpu()  # move back to CPU so NumPy can read it later
def who_is_it(test_image_encoding, database):
    """
    Implements face recognition by finding whose database encoding is
    closest to the test image's encoding.

    Arguments:
    test_image_encoding -- encoding of the image to identify
    database -- dict mapping each person's name to their image encoding

    Returns:
    min_dist -- the minimum distance between the test encoding and the encodings in the database
    identity -- string, the predicted name of the person
    """
    encoding = test_image_encoding
    # Initialize "min_dist" to a large value, say 100
    min_dist = 100
    identity = None
    # Loop over the database dictionary's names and encodings.
    for (name, db_enc) in database.items():
        # Compute the L2 distance between the target "encoding" and the current db_enc.
        dist = np.linalg.norm(encoding.squeeze().numpy() - db_enc.squeeze().numpy())
        # If this distance is less than min_dist, set min_dist to dist and identity to name.
        if dist < min_dist:
            min_dist = dist
            identity = name
    if min_dist / 100 > 0.5:
        print("Not in the database.")
    else:
        print("it's " + str(identity) + ", the distance is " + str(min_dist / 100))
    return min_dist, identity
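Rather than a fixed constant like the `min_dist / 100 > 0.5` check above, the threshold is usually calibrated on a small labeled set of same-person and different-person pairs. A minimal sketch (the distance values below are made up for illustration):

```python
import numpy as np

def best_threshold(same_dists, diff_dists):
    """Pick the distance threshold that maximizes pair-classification accuracy.

    same_dists -- distances between encodings of the same person
    diff_dists -- distances between encodings of different people
    """
    same_dists = np.asarray(same_dists)
    diff_dists = np.asarray(diff_dists)
    # Every observed distance is a candidate threshold.
    candidates = np.sort(np.concatenate([same_dists, diff_dists]))
    best_t, best_acc = None, -1.0
    for t in candidates:
        # Predict "same person" whenever distance <= t.
        correct = (same_dists <= t).sum() + (diff_dists > t).sum()
        acc = correct / (len(same_dists) + len(diff_dists))
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Toy example with made-up distances:
same = [0.3, 0.5, 0.4, 0.6]
diff = [0.9, 1.1, 0.8, 1.2]
t, acc = best_threshold(same, diff)
print(t, acc)  # 0.6 1.0
```

With more data you would normally sweep this on a validation split (or use an ROC curve) and pick the threshold matching your desired false-accept rate.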