C4_W1 UNQ_C9 computing the average

My function is only graded as correct if I calculate the score (the average) this way:
score = overlap / index_sample

But at that point the index is always the length minus 1. If there are 3 samples, index_sample will be 2 at that moment. Shouldn't the divisor be 3 instead of 2 to calculate the average?

Hi @Bru_Silva

In the first paragraph of “4.2.3 Overall score”:

… As mentioned earlier, we need to compare each sample with all other samples. For instance, if we generated 30 sentences, we will need to compare sentence 1 to sentences 2 to 30. Then, we compare sentence 2 to sentences 1 and 3 to 30, and so forth. At each step, we get the average score of all comparisons to get the overall score for a particular sample. …

What it says is that you do not compare a sample with itself, so each average is taken over the other samples only (29 = 30 - 1 in this example).
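
To make the divisor concrete, here is a minimal sketch of the idea, not the assignment's reference solution. The names jaccard_similarity and average_overlap follow the call quoted in this thread, but both bodies, the empty-set rule, and the explicit n_others variable are my own assumptions for illustration:

```python
def jaccard_similarity(candidate, reference):
    """Jaccard similarity of two token lists, treated as sets."""
    candidate_set, reference_set = set(candidate), set(reference)
    union = candidate_set | reference_set
    if not union:  # both empty: returning 0.0 here is an assumption
        return 0.0
    return len(candidate_set & reference_set) / len(union)


def average_overlap(similarity_fn, samples, *ignore_params):
    """Mean similarity of each sample to every OTHER sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compare")
    n_others = len(samples) - 1  # a sample is never compared with itself
    scores = {}
    for i, candidate in enumerate(samples):
        total = sum(
            similarity_fn(candidate, other)
            for j, other in enumerate(samples)
            if j != i  # skip the self-comparison
        )
        scores[i] = total / n_others
    return scores


scores = average_overlap(
    jaccard_similarity,
    [[1, 2, 3], [1, 2, 4], [1, 2, 4, 5]],
    [0.4, 0.2, 0.5],  # extra argument, ignored by this sketch
)
print(scores)
```

With 3 samples, each score is a sum of 2 similarities divided by 2, which gives roughly 0.45, 0.625 and 0.575 for samples 0, 1 and 2.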

Does that make sense?

Hi Arvyzukai!
Thank you very much for replying.
It does make sense. I looked at the function again. It seems my confusion started because I thought there were 4 sentences when I saw “average_overlap(jaccard_similarity, [[1, 2, 3], [1, 2, 4], [1, 2, 4, 5]], [0.4, 0.2, 0.5])”, but there are only 3 (the last argument is another parameter that the function ignores), so I could not understand why it was dividing by 2 instead of 3. I get it now. Thank you again.
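
For anyone who lands here with the same question: the reason index_sample ends up equal to the number of other samples is that a Python loop variable keeps its last value after the loop finishes. A small sketch, assuming the index comes from enumerate over the samples (the variable name last_index is hypothetical):

```python
samples = [[1, 2, 3], [1, 2, 4], [1, 2, 4, 5]]

# Walk through the samples; the loop body does nothing on purpose.
for index_sample, sample in enumerate(samples):
    pass

# After the loop, index_sample still holds the last index it was given.
last_index = index_sample
print(last_index, len(samples) - 1)  # prints "2 2"
```

So dividing by index_sample after the loop is the same as dividing by len(samples) - 1, the number of comparisons actually made for each sample.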