C1W1 Lab 3 Euclidean Distance

Hi,

In the euclidean_distance function, why is the last line

return K.sqrt(K.maximum(sum_square, K.epsilon()))

instead of simply

return K.sqrt(sum_square)

? The latter seems to work just fine for me, and it matches the actual definition of Euclidean distance. Are there cases where sum_square could actually be negative for some technical reason (the way floats work, or something like that)?
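For context, here is roughly what the whole function looks like (quoting from memory, so details like the axis and keepdims arguments may differ from the notebook):

```python
from tensorflow.keras import backend as K

def euclidean_distance(vects):
    # the two subnetwork outputs, batched along axis 0
    x, y = vects
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    # the line I'm asking about:
    return K.sqrt(K.maximum(sum_square, K.epsilon()))
```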

Thanks 🙂

Hi @Malte_Dehling, welcome!

Good question. The problem is not that sum_square might be negative, but that it could be 0, which can cause trouble if, for example, something later tries to divide by it. To avoid this, if sum_square is 0 or very close to it, we use the small epsilon value instead.

As an experiment, you can try this out for yourself by forcing sum_square to be 0. You could do this, for example, by changing K.square(x - y) to K.square(x - x) in the line above and then using your simple return line without the epsilon. Then check the value of the loss when you run the model.fit cell.
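If you'd rather not retrain the whole model, here is a minimal standalone sketch (my own toy example, not the lab code) that shows the same failure directly at the gradient level:

```python
import tensorflow as tf
from tensorflow.keras import backend as K

x = tf.Variable([[3.0, 4.0]])
y = tf.constant([[3.0, 4.0]])  # same point, so sum_square is exactly 0

with tf.GradientTape(persistent=True) as tape:
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    dist = K.sqrt(sum_square)                               # no epsilon clamp
    dist_safe = K.sqrt(K.maximum(sum_square, K.epsilon()))  # the lab's version

print(tape.gradient(dist, x))       # [[nan nan]] -- backprop through sqrt(0)
print(tape.gradient(dist_safe, x))  # [[0. 0.]]   -- the clamp keeps it finite
```

Once a gradient like that first NaN reaches the optimizer, the weights become NaN and every later loss value does too.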


Hi @Wendy,

thanks for your help!

OK, so I did that experiment and ended up with a NaN loss. I don’t quite understand why, though.

The output of our model (y_pred) is the Euclidean distance between the outputs of the two subnetworks, which with this change is always 0. Now the loss I see should be contrastive_loss(y_true, y_pred). Since contrastive_loss doesn’t divide by y_pred, I don’t see where the NaN result comes from. Any hints? Clearly I’m missing something 😊
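For reference, the loss I'm looking at is roughly this (paraphrased from the notebook, so treat the margin value as an assumption):

```python
from tensorflow.keras import backend as K

def contrastive_loss(y_true, y_pred):
    margin = 1  # margin value assumed here
    square_pred = K.square(y_pred)                           # term for similar pairs
    margin_square = K.square(K.maximum(margin - y_pred, 0))  # term for dissimilar pairs
    return K.mean(y_true * square_pred + (1 - y_true) * margin_square)
```

As far as I can tell, nothing in there divides by y_pred or takes a square root.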

Thanks!

TBH, I was not thinking so specifically about which calculation was having the problem. The main point was that a zero can lead to undefined outputs in some math operations, which is why including an epsilon fudge-factor can help. I suggested the experiment just as a way to see that things don’t work right if you use just 0.

The problem doesn’t happen in the contrastive loss function itself, but in the gradient calculations that TensorFlow and Keras handle for you when you train the model; once a NaN shows up there, it corrupts the weights, and from then on the bad values flow into y_pred and on to contrastive_loss. One way to think about it conceptually is to consider the slope of your Euclidean distance function with respect to its inputs. For backprop, this slope is well-defined when the two points are different, but undefined when they are the same.
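To make that slope picture concrete, here is the derivative written out (just standard calculus, nothing specific to the lab). With $s = \sum_i (x_i - y_i)^2$ and $d = \sqrt{s}$:

$$\frac{\partial d}{\partial x_i} = \frac{1}{2\sqrt{s}} \cdot 2(x_i - y_i) = \frac{x_i - y_i}{\sqrt{s}}$$

When the two points are identical, this is a 0/0 expression: autodiff evaluates the $1/(2\sqrt{s})$ factor as $\infty$ and the $(x_i - y_i)$ factor as $0$, and $\infty \times 0$ gives NaN. With the K.maximum clamp, backprop routes the gradient through the constant epsilon branch instead, so you just get a clean 0.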

I hope this helps.
