I picked up the idea of **triplet loss** and **global orthogonal regularization** from this paper: http://cs230.stanford.edu/projects_fall_2019/reports/26251543.pdf. However, I keep running into a tensor shape error.

After I define `modelv1` as the base model (`modelv1` takes an input of shape `(None, 224, 224, 3)` and returns a tensor of shape `(None, 64)`), the complete model is defined as follows:

```
import tensorflow as tf
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

input_shape = (3, 224, 224, 3)
input_all = Input(shape=input_shape)
input_anchor = input_all[:, 0, :]
input_pos = input_all[:, 1, :]
input_neg = input_all[:, 2, :]
output_anchor = modelv1(input_anchor)
output_pos = modelv1(input_pos)
output_neg = modelv1(input_neg)
model = Model(inputs=input_all, outputs=[output_anchor, output_pos, output_neg])
```
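For reproducibility, here is a minimal stand-in for `modelv1` (a hypothetical flatten-plus-dense head, since the real base model isn't shown here) wired up the same way; it confirms the intended output shapes:

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Model

# Hypothetical stand-in for the real base model: any network mapping
# (None, 224, 224, 3) -> (None, 64) has the same interface.
base_in = Input(shape=(224, 224, 3))
base_out = Dense(64)(Flatten()(base_in))
modelv1 = Model(base_in, base_out)

# Same wiring as above: one stacked input, three slices, shared base model.
input_all = Input(shape=(3, 224, 224, 3))
output_anchor = modelv1(input_all[:, 0])
output_pos = modelv1(input_all[:, 1])
output_neg = modelv1(input_all[:, 2])
model = Model(inputs=input_all, outputs=[output_anchor, output_pos, output_neg])

print([tuple(o.shape) for o in model.outputs])  # [(None, 64), (None, 64), (None, 64)]
```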

The formula for triplet loss with **global orthogonal regularization**, as provided in the paper I mentioned above, is (in plain text; `f` is the embedding, `d` is the embedding dimension, `N` is the batch size):

```
L = sum_i max(||f(a_i) - f(p_i)||^2 - ||f(a_i) - f(n_i)||^2 + margin, 0)
    + alpha * (M1^2 + max(M2 - 1/d, 0))

M1 = (1/N) * sum_i <f(a_i), f(n_i)>
M2 = (1/N) * sum_i <f(a_i), f(n_i)>^2
```

I implemented this formula as follows:

```
import tensorflow as tf

def triplet_loss_with_margin(margin=0.4, d=64, alpha=1.1):
    def triplet_loss(y_true, y_pred):
        """
        Implementation of the triplet loss as defined by formula (3).

        Arguments:
        y_true -- true labels, required when you define a loss in Keras;
                  not used in this function.
        y_pred -- python list containing three objects:
            anchor   -- the encodings for the anchor images, of shape (None, 64)
            positive -- the encodings for the positive images, of shape (None, 64)
            negative -- the encodings for the negative images, of shape (None, 64)

        Returns:
        loss -- real number, value of the loss
        """
        anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
        # Step 1: compute the (encoding) distance between the anchor and the positive
        pos_dist = tf.math.reduce_sum(tf.math.square(tf.math.subtract(anchor, positive)), axis=-1)
        # Step 2: compute the (encoding) distance between the anchor and the negative
        neg_dist = tf.math.reduce_sum(tf.math.square(tf.math.subtract(anchor, negative)), axis=-1)
        # Step 3: subtract the two previous distances and add the margin
        basic_loss = tf.math.add(tf.math.subtract(pos_dist, neg_dist), margin)
        # Step 4: take the maximum of basic_loss and 0.0; sum over the training examples
        loss = tf.math.reduce_sum(tf.math.maximum(basic_loss, 0.0))
        # Add the global orthogonal regularization term
        dot_product = tf.matmul(anchor, tf.transpose(negative))
        multiply_2_vectors_value = tf.linalg.diag_part(dot_product)
        M1 = tf.math.reduce_sum(multiply_2_vectors_value, axis=-1)
        M2 = tf.math.square(multiply_2_vectors_value)
        M2 = tf.math.maximum(tf.math.subtract(M2, 1 / d), 0.0)
        M2 = tf.math.reduce_sum(M2, axis=-1)
        loss += alpha * (tf.math.square(M1) + M2)
        return loss
    return triplet_loss
```
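As a quick sanity check on the closure (a condensed restatement of the loss above, assuming eager-mode TF2 and random `(batch, 64)` encodings), calling it directly with a list of three batched tensors does return a scalar:

```python
import tensorflow as tf

# Condensed restatement of the loss above, for a standalone check.
def triplet_loss_with_margin(margin=0.4, d=64, alpha=1.1):
    def triplet_loss(y_true, y_pred):
        anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
        pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
        neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
        loss = tf.reduce_sum(tf.maximum(pos_dist - neg_dist + margin, 0.0))
        dot = tf.linalg.diag_part(tf.matmul(anchor, tf.transpose(negative)))
        M1 = tf.reduce_sum(dot, axis=-1)
        M2 = tf.reduce_sum(tf.maximum(tf.square(dot) - 1 / d, 0.0), axis=-1)
        return loss + alpha * (tf.square(M1) + M2)
    return triplet_loss

# Three random (batch, 64) "encodings", packed as a list, like the model outputs.
y_pred = [tf.random.uniform((8, 64)) for _ in range(3)]
loss = triplet_loss_with_margin()(None, y_pred)
print(loss.shape)  # () -- a scalar, as expected
```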

I assumed that, since `anchor` and `negative` both have shape `(None, 64)`, this approach should work. However, when I trained the model, I encountered the error below:

```
ValueError: in user code:

    /opt/conda/lib/python3.7/site-packages/keras/engine/training.py:853 train_function  *
        return step_function(self, iterator)
    /tmp/ipykernel_24/1319124991.py:34 triplet_loss  *
        dot_product=tf.matmul(anchor,tf.transpose(negative))
    /opt/conda/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py:206 wrapper  **
        return target(*args, **kwargs)
    /opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py:3655 matmul
        a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
    /opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/gen_math_ops.py:5714 mat_mul
        name=name)
    /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:750 _apply_op_helper
        attrs=attr_protos, op_def=op_def)
    /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py:601 _create_op_internal
        compute_device)
    /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:3569 _create_op_internal
        op_def=op_def)
    /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:2042 __init__
        control_input_ops, op_def)
    /opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:1883 _create_c_op
        raise ValueError(str(e))

    ValueError: Shape must be rank 2 but is rank 1 for '{{node triplet_loss/MatMul}} = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false](triplet_loss/strided_slice, triplet_loss/transpose)' with input shapes: [64], [64].
```
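The ranks in the message can be reproduced in isolation: `tf.matmul` requires both operands to be at least rank 2, so two `(64,)` tensors trigger exactly this kind of failure (a minimal sketch, independent of the model):

```python
import tensorflow as tf

# Rank-1 tensors, like the operands named in the error message:
anchor = tf.random.uniform((64,))
negative = tf.random.uniform((64,))

try:
    tf.matmul(anchor, tf.transpose(negative))
    raised = False
except Exception as err:  # ValueError or InvalidArgumentError, depending on TF version
    raised = True
    print(type(err).__name__, err)

# The same call succeeds once both operands are rank 2:
batched = tf.matmul(tf.random.uniform((5, 64)), tf.transpose(tf.random.uniform((5, 64))))
print(raised, batched.shape)  # True (5, 5)
```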

From what I understand, the error is raised at `dot_product=tf.matmul(anchor,tf.transpose(negative))` because `anchor` and `negative` only have shape `(64,)`. But shouldn't `anchor` and `negative` have shape `(batch_size, 64)`? I really cannot see what I did wrong. Could you please enlighten me? Thank you.

I tried to debug by implementing a standalone function to test:

```
def triplet_loss(y_pred):
    """
    Implementation of the triplet loss as defined by formula (3).

    Arguments:
    y_pred -- python list containing three objects:
        anchor   -- the encodings for the anchor images, of shape (None, 64)
        positive -- the encodings for the positive images, of shape (None, 64)
        negative -- the encodings for the negative images, of shape (None, 64)

    Returns:
    loss -- real number, value of the loss
    """
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    # Step 1: compute the (encoding) distance between the anchor and the positive
    pos_dist = tf.math.reduce_sum(tf.math.square(tf.math.subtract(anchor, positive)), axis=-1)
    # Step 2: compute the (encoding) distance between the anchor and the negative
    neg_dist = tf.math.reduce_sum(tf.math.square(tf.math.subtract(anchor, negative)), axis=-1)
    # Step 3: subtract the two previous distances and add the margin
    basic_loss = tf.math.add(tf.math.subtract(pos_dist, neg_dist), 0.4)
    # Step 4: take the maximum of basic_loss and 0.0; sum over the training examples
    loss = tf.math.reduce_sum(tf.math.maximum(basic_loss, 0.0))
    # Add the regularization term
    print("anchor shape: ", anchor.shape)
    print("neg shape: ", negative.shape)
    dot_product = tf.matmul(anchor, tf.transpose(negative))
    multiply_2_vectors_value = tf.linalg.diag_part(dot_product)
    M1 = tf.math.reduce_sum(multiply_2_vectors_value, axis=-1)
    M2 = tf.math.square(multiply_2_vectors_value)
    mask = tf.math.maximum(tf.math.subtract(M2, 1 / 64), 0.0)
    M2 = tf.math.reduce_sum(M2, axis=-1)
    loss += 1.1 * (tf.math.square(M1) + M2)
    return loss
```

And it works fine with a dummy tensor passed to it:

```
dummy = tf.random.uniform((1, 3, 224, 224, 3))
re_dum = model.predict(dummy)
test = triplet_loss(re_dum)
```

`re_dum` is a list of 3 elements, each a tensor of shape `(1, 64)`, and `test` is a number. So this little test shows that there is no problem with my implementation. But why does the error keep showing up?

Besides, when I replace

```
dot_product = tf.matmul(anchor, tf.transpose(negative))
```

with

```
dot_product = tf.matmul(tf.expand_dims(anchor, axis=0), tf.transpose(tf.expand_dims(negative, axis=0)))
```

The error disappeared, but it is very perplexing to me why this works.
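For what it's worth, here is a minimal sketch of what that workaround does: `tf.expand_dims` promotes a rank-1 `(64,)` tensor, which is apparently what the loss actually receives, to rank 2, so `tf.matmul` sees a `(1, 64)` by `(64, 1)` product instead of two rank-1 operands:

```python
import tensorflow as tf

anchor = tf.random.uniform((64,))    # rank 1: the shape the loss seems to receive
negative = tf.random.uniform((64,))

a2 = tf.expand_dims(anchor, axis=0)                   # shape (1, 64)
n2t = tf.transpose(tf.expand_dims(negative, axis=0))  # shape (64, 1)
dot_product = tf.matmul(a2, n2t)                      # shape (1, 1): rank 2, so matmul accepts it
print(a2.shape, n2t.shape, dot_product.shape)
```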