C1W1 Lab 3 Euclidean Distance

Hi,

In the euclidean_distance function, why is the last line

return K.sqrt(K.maximum(sum_square, K.epsilon()))

instead of simply

return K.sqrt(sum_square)

? The latter seems to work just fine for me, and it matches the actual definition of Euclidean distance. Are there cases where sum_square could actually be negative for some technical reason (the way floats work, or something like that)?
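For context, here is roughly what the whole function looks like (quoting from memory, so details like the axis and keepdims arguments may differ from the notebook):

```python
from tensorflow.keras import backend as K

def euclidean_distance(vects):
    # the two subnetwork outputs, batched along axis 0
    x, y = vects
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    # the line I'm asking about:
    return K.sqrt(K.maximum(sum_square, K.epsilon()))
```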

Thanks 🙂

Hi @Malte_Dehling, welcome!

Good question. The problem is not that sum_square might be negative, but that it could be 0, which can cause trouble if, for example, something later tries to divide by it. To avoid this, if sum_square is 0 or very close to it, we use the small epsilon value instead.

As an experiment, you can try this out for yourself by forcing sum_square to be 0. You could do this, for example, by changing K.square(x - y) to K.square(x - x) in the line above and then using your simple return line without the epsilon. Then check the value of the loss when you run the model.fit cell.
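If you'd rather not retrain the whole model, here is a minimal standalone sketch (my own toy example, not the lab code) that shows the same failure directly at the gradient level:

```python
import tensorflow as tf
from tensorflow.keras import backend as K

x = tf.Variable([[3.0, 4.0]])
y = tf.constant([[3.0, 4.0]])  # same point, so sum_square is exactly 0

with tf.GradientTape(persistent=True) as tape:
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    dist = K.sqrt(sum_square)                               # no epsilon clamp
    dist_safe = K.sqrt(K.maximum(sum_square, K.epsilon()))  # the lab's version

print(tape.gradient(dist, x))       # [[nan nan]] -- backprop through sqrt(0)
print(tape.gradient(dist_safe, x))  # [[0. 0.]]   -- the clamp keeps it finite
```

Once a gradient like that first NaN reaches the optimizer, the weights become NaN and every later loss value does too.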


Hi @Wendy,

thanks for your help!

OK, so I did that experiment and ended up with a NaN loss. I don’t quite understand why, though.

The output of our model (y_pred) is the Euclidean distance between the outputs of the two subnetworks, which with this change is always 0. Now the loss I see should be contrastive_loss(y_true, y_pred). Since contrastive_loss doesn’t divide by y_pred, I don’t see where the NaN result comes from. Any hints? Clearly I’m missing something 😊
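For reference, the loss I'm looking at is roughly this (paraphrased from the notebook, so treat the margin value as an assumption):

```python
from tensorflow.keras import backend as K

def contrastive_loss(y_true, y_pred):
    margin = 1  # margin value assumed here
    square_pred = K.square(y_pred)                           # term for similar pairs
    margin_square = K.square(K.maximum(margin - y_pred, 0))  # term for dissimilar pairs
    return K.mean(y_true * square_pred + (1 - y_true) * margin_square)
```

As far as I can tell, nothing in there divides by y_pred or takes a square root.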

Thanks!

TBH, I was not thinking so specifically about which calculation was having the problem. The main point was that a zero can lead to undefined outputs in some math operations, which is why including an epsilon fudge-factor can help. I suggested the experiment just as a way to see that things don’t work right if you use just 0.

The problem doesn’t happen in the contrastive loss function itself, but in the gradient calculations that TensorFlow and Keras handle for you when you train the model; once a NaN shows up there, it corrupts the weights, and from then on the bad values flow into y_pred and on to contrastive_loss. One way to think about it conceptually is to consider the slope of your Euclidean distance function with respect to its inputs. For backprop, this slope is well-defined when the two points are different, but undefined when they are the same.
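To make that slope picture concrete, here is the derivative written out (just standard calculus, nothing specific to the lab). With $s = \sum_i (x_i - y_i)^2$ and $d = \sqrt{s}$:

$$\frac{\partial d}{\partial x_i} = \frac{1}{2\sqrt{s}} \cdot 2(x_i - y_i) = \frac{x_i - y_i}{\sqrt{s}}$$

When the two points are identical, this is a 0/0 expression: autodiff evaluates the $1/(2\sqrt{s})$ factor as $\infty$ and the $(x_i - y_i)$ factor as $0$, and $\infty \times 0$ gives NaN. With the K.maximum clamp, backprop routes the gradient through the constant epsilon branch instead, so you just get a clean 0.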

I hope this helps.
