As you can see, the answer is off by just 0.000001 because neg_dist is off by 0.000001. I can’t see how my code could be wrong, though. Perhaps this is a problem with the grader?
You probably lose a bit of precision by computing the norm and then squaring it: every extra operation is another chance to introduce rounding error. Just compute the squares in the first place and then do reduce_sum and you’re there. No norm required. And as a side benefit, your code will be more efficient …
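Something along these lines (a minimal sketch, not the actual assignment code; `anchor` and `negative` are just illustrative stand-ins for the encodings):

```python
import tensorflow as tf

# Illustrative stand-ins for the anchor and negative encodings
anchor = tf.random.normal([4, 128], dtype=tf.float32)
negative = tf.random.normal([4, 128], dtype=tf.float32)

diff = anchor - negative

# Recommended: square first, then sum. No square root anywhere.
neg_dist = tf.reduce_sum(tf.square(diff), axis=-1)

# Norm first, then square: mathematically identical, but the extra
# sqrt/square round trip can change the last few bits in float32.
neg_dist_via_norm = tf.square(tf.norm(diff, axis=-1))

print(tf.reduce_max(tf.abs(neg_dist - neg_dist_via_norm)).numpy())
```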
It is an interesting and legitimate question why they made the assertions here check for exact equality. That’s always a little dangerous in floating point. In a lot of cases they use allclose instead, but not here for some reason. Hmmmmm.
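For what it’s worth, the usual pattern looks something like this (made-up numbers, just to illustrate the shape of the check):

```python
import numpy as np

expected = 0.350526   # hypothetical expected loss
computed = 0.350527   # off in the sixth decimal, like the result above

print(expected == computed)                        # False: exact equality is brittle
print(np.allclose(expected, computed, atol=1e-5))  # True: tolerance absorbs rounding noise
```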
Your solution works, but I’m having trouble understanding the difference between tf.reduce_sum and tf.norm. I understand what each one does:
tf.norm squares each value in the selected axis, adds those squares, and then takes the square root of that sum
tf.reduce_sum simply adds each value in the selected axis
But what is the practical difference between the two? Why would you ever use tf.norm instead of the simpler and more efficient tf.reduce_sum? Also, when applied to a vector, the result of tf.norm represents the distance between the vector and 0. Does tf.reduce_sum represent anything geometric in the same way?
Reduce sum simply computes the sum across the specified dimension, but in the way we are using it, what we are computing is:
loss = \displaystyle \sum_{i = 1}^{n} v_i^2
There is no square root there. Because the loss here is defined as the sum of the squares, that’s all we need. Of course we can see from the above that:
loss = ||v||^2
You use the norm when what you want is the norm, and that’s not what we want here. Of course what you did ends up with the same result by adding one more step to take advantage of the second formula above: squaring the norm to cancel the square root and get back to the sum of the squares. That is a) a waste of computation, because the square root is a relatively expensive operation, and b) a source of extra rounding error, because every additional floating point operation is another opportunity for error to accumulate; nothing we do in floating point is exact. It was that second point, of course, that caused you to fail the test, even though I think we all agree they should have used allclose rather than testing for exact equality there.
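You can see the round-trip problem in isolation: in float32, a square root followed by a square generally does not give back exactly the number you started with (a small sketch, unrelated to the grader itself):

```python
import tensorflow as tf

x = tf.constant(2.0, dtype=tf.float32)

# sqrt(2) is not exactly representable, so squaring the rounded result
# typically lands a few ulps away from 2.0 rather than exactly on it.
roundtrip = tf.square(tf.sqrt(x))

print(x.numpy(), roundtrip.numpy(), (roundtrip - x).numpy())
```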
I’ll summarise for the sake of my own understanding:
Using tf.norm I was really doing: X2 = X1.^2 -> X3 = reduce_sum(X2) -> X4 = sqrt(X3) -> X5 = X4^2
Using your method I am now doing: X2 = X1.^2 -> X3 = reduce_sum(X2), which removes the last two unnecessary steps that made the calculation both less efficient and more prone to rounding error.
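Or, writing those same two chains out as code (again just a sketch, with X1 as an arbitrary float32 tensor):

```python
import tensorflow as tf

X1 = tf.random.normal([128], dtype=tf.float32)

# The tf.norm route, step by step
X2 = tf.square(X1)        # square each element
X3 = tf.reduce_sum(X2)    # sum of squares -- this is already the loss term
X4 = tf.sqrt(X3)          # the norm
X5 = tf.square(X4)        # square it again to get back to the sum of squares

# The direct route simply stops at X3
print(X3.numpy(), X5.numpy())
```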