Why can’t we choose the pair where s(A,N) > s(A,P) as the closest negative?

Amiao_Gao · February 8, 2022, 12:56am

I am a bit confused about how the triplets are chosen when calculating the cost function.
Why can’t we choose the pair where s(A,N) > s(A,P) as the closest negative?

arvyzukai · February 8, 2022, 6:38pm

Good question. I think we exclude these negatives because of the “Reading: Triplets” section, which defines a hard negative triplet as one where cos(A,P) < cos(A,N).

Later in the course, when we calculate the Modified Triplet Loss, we use this loss (closest_neg) equally weighted with mean_neg to get the gradients for updating our RNN weights.

So, to the point of why we update our RNN weights with respect to the closest negative example, but only among those below the cos(A,P) value: we want to update with respect to a “hard” example, which must be the closest one that is still below s(A,P). This way the update moves these vectors further apart in the next step.

It is easier to explain with numbers, for example (when alpha=0.25):

s0(A,P) = 0.11, s1(A,N1) = 0.12, s2(A,N2) = 0.08, s3(A,N3) = 0.4, s4(A,N4) = -0.9

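Here mean_neg = (0.12 + 0.08 + 0.4 - 0.9) / 4 = -0.075, and closest_neg = s2 = 0.08, because N1 and N3 score above s0 and are excluded. So L1 = -0.11 + (-0.075) + 0.25 = 0.065, L2 = -0.11 + 0.08 + 0.25 = 0.22, and L = L1 + L2 = 0.285.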
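
If it helps, here is a minimal sketch of the same arithmetic in plain Python for a single anchor. The function name `triplet_loss_terms` is made up for this post, and returning 0 for the closest-negative term when no negative qualifies is my own assumption; the course notebook works on a whole batch with a similarity matrix, so treat this only as an illustration of the numbers above.

```python
def triplet_loss_terms(s_ap, s_an, alpha=0.25):
    """Return (L1, L2, L1 + L2) for one anchor.

    s_ap  : similarity between anchor and positive
    s_an  : list of similarities between anchor and each negative
    alpha : margin
    """
    # L1 uses the mean over ALL negatives.
    mean_neg = sum(s_an) / len(s_an)
    l1 = max(0.0, mean_neg - s_ap + alpha)

    # L2 uses the closest negative, restricted to those scoring below s(A,P).
    candidates = [s for s in s_an if s < s_ap]
    if candidates:
        closest_neg = max(candidates)
        l2 = max(0.0, closest_neg - s_ap + alpha)
    else:
        # Assumption for this sketch: no qualifying negative means no L2 term.
        l2 = 0.0

    return l1, l2, l1 + l2


l1, l2, total = triplet_loss_terms(0.11, [0.12, 0.08, 0.4, -0.9])
print(round(l1, 3), round(l2, 3), round(total, 3))
```

With the values from the example, this should reproduce L1 = 0.065, L2 = 0.22 and L = 0.285 up to floating-point rounding.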

For illustration purposes let’s assume that after gradient descent the same examples would have these values:

s0(A,P) = 0.13, s1(A,N1) = 0.119, s2(A,N2) = 0.05, s3(A,N3) = 0.039, s4(A,N4) = -0.901

So after this step we would start pushing the positive and N1 further apart, not N2 as in the previous step, because N1 (0.119) is now below s0 (0.13) and is the closest negative under it.

In other words, we try to push the hardest negative example away from the positive while maintaining balance: the mean_neg loss pushes all negative examples away from the positive, and closest_neg pushes away only the closest “hard” negative.

I guess real-world scenarios would tell you whether or not to exclude these negatives, but I think this was the motivation: to go for the closest negative example, the “hard negative”, which must be under the s(A,P) value.
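
To make the switch in the selected negative concrete, here is a small sketch (again plain Python, with a hypothetical helper `closest_negative` that is not from the course code) that picks the highest-scoring negative still below s(A,P), before and after the illustrative update:

```python
def closest_negative(s_ap, s_an):
    """Index of the highest-scoring negative that is still below s(A,P), or None."""
    below = [(s, i) for i, s in enumerate(s_an) if s < s_ap]
    # max() on (score, index) pairs compares by score first.
    return max(below)[1] if below else None


# Scores for N1..N4 (indices 0..3) before and after the illustrative update.
before = closest_negative(0.11, [0.12, 0.08, 0.4, -0.9])
after = closest_negative(0.13, [0.119, 0.05, 0.039, -0.901])
print(before, after)  # 1 0 -> N2 before the step, N1 after it
```

Before the step N1 and N3 are skipped because they score above s(A,P); after it, every negative is below s(A,P), so N1 becomes the closest one.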
