Confusion in Computing the Cost I

Hey Guys,

In the lecture video and the reading item entitled “Computing the Cost I”, the similarity matrix shown is symmetric, but this is not an auto-correlation matrix, right?

For instance, assuming 1-based indexing, the value at (1, 2) will contain the similarity score of the sentences “Can you see me?” and “How old are you?”, while the value at (2, 1) will contain the similarity score of the sentences “What is your age” and “Are you seeing me?”. I agree that the sentences in each pair are similar to each other, but how can these similarity scores be the same for all the pairs (apart from mere coincidence)?

Cheers,
Elemento

Hi @Elemento

If I understand your question correctly, your data_generator has to take care of it. The two sentences in each row of the batch are similar to each other but different from the sentences in all other rows of the batch.
In other words, if you have a batch of:

[[v1_1, v1_2],
 [v2_1, v2_2],
 [v3_1, v3_2],
 [v4_1, v4_2]]

your data_generator has to make sure that:

  • v1_1 (anchor) is similar to v1_2 (positive)
  • and v1_1 is different from v2_2, v3_2 and v4_2 (negatives)

otherwise learning would not be possible (the exception being when negatives fail to be true negatives only on very rare occasions).
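To make the batch layout concrete, here is a minimal sketch of how the scores for such a batch could be arranged. It is not the course's implementation: the encodings are random stand-ins, and the names `l2_normalize` and `similarity_matrix` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so that dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def similarity_matrix(v1, v2):
    # Entry (i, j) is the cosine similarity between row i of v1 and row j of v2.
    return l2_normalize(v1) @ l2_normalize(v2).T

# Assumed stand-ins for the encoded batch: 4 pairs, 8-dimensional vectors.
rng = np.random.default_rng(0)
v1 = rng.normal(size=(4, 8))             # v1_1 ... v4_1 (anchors)
v2 = v1 + 0.1 * rng.normal(size=(4, 8))  # v1_2 ... v4_2, each close to its partner

scores = similarity_matrix(v1, v2)

# Diagonal entries pair each anchor with its own positive.
positives = np.diag(scores)
# Off-diagonal entries pair each anchor with the other rows' sentences (negatives).
negatives = scores[~np.eye(4, dtype=bool)].reshape(4, 3)
```

Row i of `negatives` holds the scores of anchor i against the three sentences it is assumed not to match, which is why the data generator must not put duplicates of the same question in different rows.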

Cheers.

Hey @arvyzukai,
I believe there is some confusion. Let me present it very simply. Shouldn’t the above matrix be a non-symmetric matrix?

Cheers,
Elemento

@Elemento
In reality, yes, it is non-symmetric. I guess it is presented as symmetric here for simplicity.
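A small check of this point, as a sketch with made-up 2-D vectors (the values and the helper name `cosine_scores` are assumptions for illustration, not taken from the lecture): the matrix of one set of encodings against a different set is generally not symmetric, while the matrix of a set against itself always is.

```python
import numpy as np

def cosine_scores(a, b):
    # Entry (i, j) is the cosine similarity between row i of a and row j of b.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# Hypothetical encodings: row i of v1 and row i of v2 form a similar pair.
v1 = np.array([[1.0, 0.0], [0.0, 1.0]])
v2 = np.array([[0.9, 0.1], [0.4, 1.0]])

cross = cosine_scores(v1, v2)  # v1 against v2, as in the lecture's matrix
auto = cosine_scores(v1, v1)   # a set against itself (auto-correlation style)

# Entry (0, 1) compares v1[0] with v2[1]; entry (1, 0) compares v1[1] with v2[0].
is_cross_symmetric = np.allclose(cross, cross.T)  # False for these vectors
is_auto_symmetric = np.allclose(auto, auto.T)     # True
```

Entries (i, j) and (j, i) of `cross` involve four different sentences, so nothing forces them to be equal.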

Hey @arvyzukai,
Thanks a lot. I think changing this into a non-symmetric matrix will avoid this confusion for the learners. What do you think about this?

Cheers,
Elemento

Hey @Elemento

To be fair, I’m not sure. It might help some learners but not others. For the lecture (video/reading), I personally think it is fine as is, but I see your point. I think the subsequent Lab is where the details are covered.

Cheers