Face recognition with siamese network and triplet loss does not learn useful patterns

I am new to this field and I am trying to build a face recognition algorithm using a siamese network and triplet loss. The problem is that the loss value never decreases below the triplet-loss margin. I've tried 4 networks that I thought might solve the problem, including ResNet50, and had the same issue. I tried changing the learning rate to lots of values and tried regularization like dropout, and nothing changed: the loss does not decrease, and when I compute the L2 distance between any 2 images, the distance is almost 0. I used this dataset for the images: Face Recognition Dataset - Oneshot Learning | Kaggle.
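(For reference, the distance check is roughly this; `embedding_model`, `img1` and `img2` are illustrative names, not my exact code:)

```
import numpy as np

# embedding_model is the embedding network defined further down;
# img1/img2 are preprocessed 64x64 RGB images
emb1 = embedding_model.predict(img1[np.newaxis, ...])[0]
emb2 = embedding_model.predict(img2[np.newaxis, ...])[0]
print(np.linalg.norm(emb1 - emb2))  # comes out close to 0 for any pair
```

I read the images with one piece of code, then ran a second piece of code to transform the dataset into anchors, positives and negatives. This is the first code, which reads the images from the hard disk: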

```
import os
import cv2

# dataset_path points at the root folder of the dataset
imgs = {}
lst_imgs = []

# iterations is the number of different people in the dataset
iterations = len(os.listdir(dataset_path))
for main_path in range(iterations):
    # store the current person's folder path to access the individual images later
    current_path = os.path.join(dataset_path, str(main_path))

    # each person in the main path has 72 different/positive images
    for sub_path in range(len(os.listdir(current_path))):
        full_img_path = os.path.join(current_path, str(sub_path))

        # after building the full path of the image, read it with cv2
        img = cv2.imread(full_img_path + ".png")

        # some images in my previous dataset had the wrong dimensions, so this
        # check makes sure all of the dataset images have the same dims
        if img.shape == (112, 112, 3):
            rgb_image = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            resized_im = cv2.resize(rgb_image, (64, 64))  # 128/2 = 64
            lst_imgs.append(resized_im)

    imgs[main_path] = lst_imgs.copy()
    lst_imgs.clear()
```

After storing the images in a dictionary called imgs, where each key is the main path and the value is a list of 72 images of the same person,

I used this code to transform the data into anchors, positives and negatives:

```
import random
import itertools

anchors = []
positives = []
negatives = []

for main, sub in imgs.items():
    # for each key/person in the imgs dict, take their positive images and shuffle them
    choices = sub.copy()
    random.shuffle(choices)

    # build the negative pool once per person: every image of every *other*
    # person, chained into one flat list of len = 72 * len(imgs) - 72
    negative_images = imgs.copy()
    # exclude the current person, since their images would be positives
    del negative_images[main]
    neg_choices = list(itertools.chain(*negative_images.values()))

    for choice in choices:
        # the first image is the anchor for the other 72 positive images
        anchors.append(sub[0])

        # each of the 72 shuffled choices becomes a positive image
        positives.append(choice)

        # and a random image from the pool becomes the negative
        negatives.append(random.choice(neg_choices))
```

Then I converted the lists to numpy arrays:

```
import numpy as np

anchors = np.array(anchors)
positives = np.array(positives)
negatives = np.array(negatives)
```

This is the triplet loss that I used in all of the tests; I also tried changing it and using the one from tfa.
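(The exact function isn't pasted here, but for the concatenated-output setup below the standard formulation looks roughly like this; the 0.2 margin is an assumption, based on where the loss plateaus:)

```
import tensorflow as tf

def triplet_loss(y_true, y_pred, margin=0.2):
    # y_pred is the concatenation [anchor | positive | negative] along axis 1
    anchor, positive, negative = tf.split(y_pred, num_or_size_splits=3, axis=1)
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=1)
    # hinge: the anchor should be closer to the positive than to the
    # negative by at least `margin`; y_true is a dummy and is ignored
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + margin, 0.0))
```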
This is the model that I used:

```
import tensorflow as tf
import numpy as np
from tensorflow.keras.layers import Input, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def create_siamese_network(input_shape):
  inputs = tf.keras.Input(shape=input_shape)
  x = tf.keras.layers.Conv2D(64, (10, 10), padding='same', activation='relu')(inputs)
  x = tf.keras.layers.MaxPooling2D((2, 2))(x)
  x = Dropout(0.3)(x)

  x = tf.keras.layers.Conv2D(128, (7, 7), padding='same', activation='relu')(x)
  x = tf.keras.layers.MaxPooling2D((2, 2))(x)
  x = Dropout(0.3)(x)

  x = tf.keras.layers.Conv2D(128, (4, 4), padding='same', activation='relu')(x)
  x = tf.keras.layers.MaxPooling2D((2, 2))(x)
  x = Dropout(0.3)(x)

  x = tf.keras.layers.Conv2D(256, (4, 4), padding='same', activation='relu')(x)
  x = tf.keras.layers.Flatten()(x)
  x = tf.keras.layers.Dense(4096, activation='relu')(x)
  # embedding = tf.keras.layers.Dense(1024, activation='sigmoid')(x)
  # L2 normalize the embeddings
  embedding = tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1))(x)
  model = tf.keras.Model(inputs=inputs, outputs=embedding)
  return model


# the images are 64x64 RGB, so take all three dims from the data
input_shape = (anchors.shape[1], anchors.shape[2], anchors.shape[3])
siamese_network = create_siamese_network(input_shape)

anchor_input = Input(shape=input_shape)
positive_input = Input(shape=input_shape)
negative_input = Input(shape=input_shape)

embedding_anchor = siamese_network(anchor_input)
embedding_positive = siamese_network(positive_input)
embedding_negative = siamese_network(negative_input)

output = tf.keras.layers.concatenate(
    [embedding_anchor, embedding_positive, embedding_negative], axis=1)

new_model_1 = Model(inputs=[anchor_input, positive_input, negative_input], outputs=output)
new_model_1.compile(optimizer=Adam(learning_rate=0.000002), loss=triplet_loss)

# dummy labels; triplet_loss ignores y_true
labels = np.zeros((anchors.shape[0],))
new_model_1.fit([anchors, positives, negatives], y=labels, epochs=20, batch_size=30)
```
The training output:

```
Epoch 1/20
2/2 [==============================] - 86s 42s/step - loss: 0.9042
Epoch 2/20
2/2 [==============================] - 78s 40s/step - loss: 0.2126
Epoch 3/20
2/2 [==============================] - 78s 39s/step - loss: 0.2098
Epoch 4/20
2/2 [==============================] - 76s 38s/step - loss: 0.2057
Epoch 5/20
2/2 [==============================] - 80s 39s/step - loss: 0.2047
Epoch 6/20
2/2 [==============================] - 77s 38s/step - loss: 0.2042
Epoch 7/20
2/2 [==============================] - 79s 39s/step - loss: 0.2029
Epoch 8/20
2/2 [==============================] - 77s 38s/step - loss: 0.2028
Epoch 9/20
2/2 [==============================] - 81s 42s/step - loss: 0.2024
```

Have you tried posting your question on the Kaggle discussion forum for this dataset?


No, I have not, but I posted it on Stack Overflow, and adding the L2 norm as the last layer, after the linear layer, made it work. I just don't really understand why it was that crucial for it to work.
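(For reference, the change in question is the normalization step already shown in the model code above; roughly:)

```
x = tf.keras.layers.Dense(4096, activation='relu')(x)
# the fix: divide each embedding by its L2 norm so every embedding
# has unit length before the triplet distances are computed
embedding = tf.keras.layers.Lambda(lambda t: tf.math.l2_normalize(t, axis=1))(x)
```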