When Y_true == 1: L = Y_pred^2, while when Y_true == 0: L = max(margin - Y_pred, 0)^2.
For Y_true == 1, if the model is trying to minimize the loss L, should Y_pred be close to 0?
And for Y_true == 0, should Y_pred be close to 1?

This confuses me because I expected Y_pred to be close to Y_true. Is there anything I have misunderstood here? Thanks for any help.

Excellent question! You've put your finger on the thing that makes contrastive loss different from "typical" loss functions. Contrastive loss is a distance-based loss function. Remember how we made the final step of our model a lambda function that calculates the Euclidean distance between the two images? Instead of a Y_pred that is trying to match the value of Y_true, like you typically see, we have a Y_pred that gives us a prediction for the distance between the two images.
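To make the "Y_pred is a distance" idea concrete, here is a minimal plain-Python sketch of a Euclidean distance between two feature vectors. It is only an illustration: the function name `euclidean_distance` and the toy vectors are assumptions, and this is not the actual lambda layer from the model.

```python
import math

def euclidean_distance(vec_a, vec_b):
    """Euclidean distance between two equal-length feature vectors."""
    # Sum the squared differences per dimension, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))

# Hypothetical feature vectors standing in for the two image embeddings.
features_a = [0.0, 0.0]
features_b = [3.0, 4.0]
print(euclidean_distance(features_a, features_b))  # distance plays the role of Y_pred
```

The value is never negative and is 0 only when the two vectors are identical, which is why it does not behave like a label that should match Y_true.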

Once you think of Y_pred as the distance between the images, it's easier to see how the loss will be minimized as the predictions improve:

When Y_true is 1, the images should be similar, with a small distance D (that is, Y_pred) between them. The loss is D^2, so the larger the predicted distance, the larger our loss - which is what we want.

Similarly, when Y_true is 0, the images should be dissimilar, with a larger distance between them. If the distance is greater than or equal to the margin value, the pair is already far enough apart and the loss is 0. If the distance is less than the margin value, then the smaller the distance, the larger our loss, which is exactly what we get from max(margin - D, 0)^2.
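The two cases can be checked numerically with a small sketch. This is plain Python for a single pair, written directly from the formula in the question; the function name `contrastive_loss`, the default margin of 1.0, and the sample distances are assumptions for illustration, not the training code itself.

```python
def contrastive_loss(y_true, distance, margin=1.0):
    """Contrastive loss for one image pair, following the formula in the question."""
    # Similar pair (y_true == 1): any distance is penalized, growing as distance^2.
    similar_term = y_true * distance ** 2
    # Dissimilar pair (y_true == 0): only distances inside the margin are penalized.
    dissimilar_term = (1 - y_true) * max(margin - distance, 0.0) ** 2
    return similar_term + dissimilar_term

# Hypothetical (label, predicted distance) pairs.
for label, d in [(1, 0.1), (1, 0.9), (0, 0.1), (0, 0.9), (0, 1.5)]:
    print(f"Y_true={label}  D={d}  loss={contrastive_loss(label, d):.4f}")
```

For similar pairs the loss grows with the distance, and for dissimilar pairs it shrinks as the distance grows and reaches 0 once the distance passes the margin.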