NLP with Sequence Models, C3 Week 3: Failing a unit test in the Question Duplicates assignment, Part 2 (triplet loss)

Got a wrong triplet loss for inputs:

    out: [[ 0.26726124  0.53452248  0.80178373  0.26726124  0.53452248  0.80178373]
          [ 0.5178918   0.57543534  0.63297887 -0.5178918  -0.57543534 -0.63297887]]
    margin: 0.25
    Expected: 0.7035077
    Got: 1.7499999920648641

    2 tests passed
    1 tests failed

I'm a little blanked out on this for now. I'm just posting my error as per the rules; if a mentor wants more, let me know.

1 Like

Hi @Paul_katz,

Have you tried out all the previous solutions, and nothing worked for you? Try printing your values out at each step and comparing them with this explanation. Get back here if nothing works.

Regards

3 Likes

OK, I need a little help. I get the theory; I'm just having trouble getting the code to fully cooperate.

1 Like

Hi @Paul_katz,

Up to which point are your calculations correct and where do they start to differ?

1 Like

That is where I began my bug hunt. The first discrepancy is at mask_exclude_positives: I get the correct numerical representation, but not the True/False/True/True values. I could make the latter happen, but not using the suggested TensorFlow function.

1 Like

The mask_exclude_positives is composed of two parts, as the code hint suggests:

    # create a composition of two masks:
    # the first mask to extract the diagonal elements,
    # the second mask to extract elements in the negative_zero_on_duplicate matrix
    # that are larger than the elements in the diagonal
    mask_exclude_positives = tf.cast((None)|(None),
                                     scores.dtype)

These two parts are true/false values as you can see in the image above.

To create the first mask (in place of the first None), you can make use of the instructions:

To create the mask, you need to check if the cell is on the diagonal by computing tf.eye(batch_size) == 1

To create the second mask (in place of the second None), the instructions are:

if the non-diagonal cell is greater than the diagonal, with (negative_zero_on_duplicate > tf.expand_dims(positive, 1))

These masks are compared with the | (“or”) operator between them (code already provided for you).

Finally, they are “cast” to the scores variable's data type (this is completed for you too).

In summary, you just need to replace both Nones with the code that is provided in the instructions.
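To make the composition concrete, here is a minimal runnable sketch using the assignment's variable names (scores, positive, negative_zero_on_duplicate, batch_size); the toy values are made up purely for illustration:

    import tensorflow as tf

    batch_size = 2
    # Toy similarity matrix; the diagonal holds the anchor-positive scores.
    scores = tf.constant([[0.90, 0.30],
                          [0.95, 0.80]])
    positive = tf.linalg.diag_part(scores)
    # Negative scores, with zeros on the duplicates (the diagonal).
    negative_zero_on_duplicate = scores * (1.0 - tf.eye(batch_size))

    # First mask: True on the diagonal elements.
    first_mask = tf.eye(batch_size) == 1
    # Second mask: True where a non-diagonal score exceeds its row's diagonal.
    second_mask = negative_zero_on_duplicate > tf.expand_dims(positive, 1)

    # Compose with | and cast to the scores dtype, as in the code hint.
    mask_exclude_positives = tf.cast(first_mask | second_mask, scores.dtype)
    print(mask_exclude_positives)  # [[1. 0.]
                                   #  [1. 1.]]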

Cheers

2 Likes

Feeling a little embarrassed now. I suppose I approached this out of sequence; that's an unintentional pun.

1 Like

I'm going to start with a fresh notebook again. Looking at your breakdown, it looks like my numbers are coming in inverted versus yours. Can I show my output from my print statements? Now I know the mask_exclude_positives code is right, but the output still doesn't look right to me; certainly the loss is off now and I'm not passing as many unit tests.

1 Like

Can I DM my code at this point?

1 Like

You have to be kidding me. I went down a rabbit hole over a comment error. Disregard; I'm solved. But someone should do something about the suggested axis value on mean_negative; I don't think that was helpful for getting better.

1 Like

Do you have this comment in mind?

    # use `tf.math.reduce_sum` on `negative_zero_on_duplicate` for `axis=1` and divide it by `(batch_size - 1)`

If so, what do you find confusing or not helpful about it?

Cheers

1 Like

I'm just happy I got through. Thinking it over the next day: axis=1 would not allow my assignment to pass, and I pored over some lines second-guessing all I thought I'd learned. I may have put my foot in my mouth; if so, I apologize. In this case “axis=0” means that the reduction operation is performed vertically, resulting in a single value for each column of the input tensor. Is that right? The assignment would not pass with 1 as the value, so now I'm wondering if something prior is off. If it's supposed to be 1, maybe there is a teachable moment yet to come.

1 Like

Your text is hard to read.

That is true. The axis= parameter specifies on which axis the reduction is performed. For a 2D tensor with axis=0, that means reducing “vertically”.
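For example, a quick sketch on a toy 2x2 tensor (values made up) shows the difference:

    import tensorflow as tf

    t = tf.constant([[1.0, 2.0],
                     [3.0, 4.0]])
    print(tf.math.reduce_sum(t, axis=0))  # [4. 6.] - one value per column
    print(tf.math.reduce_sum(t, axis=1))  # [3. 7.] - one value per row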

For the first step: do you get the exact values in their places as in the picture?
Note that v2 is not transposed, while v1 is.
In other words, if you use tf.linalg.matmul() with the parameter transpose_b=True, then the order matters: it's v2, v1 and not v1, v2.
In one case the results are as in the picture, while in the other case the result would be “flipped” (the non-diagonal elements would be switched).
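Here is a small sketch of that order-of-arguments point; the v1/v2 values below are toy stand-ins, not the assignment's data:

    import tensorflow as tf

    # Toy L2-normalized row vectors standing in for v1 and v2.
    v1 = tf.constant([[1.0, 0.0],
                      [0.0, 1.0]])
    v2 = tf.constant([[ 0.8, 0.6],
                      [-0.6, 0.8]])

    scores = tf.linalg.matmul(v2, v1, transpose_b=True)   # v2 @ v1^T
    flipped = tf.linalg.matmul(v1, v2, transpose_b=True)  # v1 @ v2^T

    print(scores)   # [[ 0.8  0.6]  [-0.6  0.8]]
    print(flipped)  # [[ 0.8 -0.6]  [ 0.6  0.8]] - off-diagonals switched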

1 Like

Well, I finally got it. OMG, I'm usually trepidatious about even coming to the forum for help, but now I'm sure the 3 days weren't wasted. So simple in hindsight. Thanks Arvyzukai. Wow, I'm blown away by how long it took me; all the flags were saying it was “flipped”.

2 Likes