Transformer architecture using TensorFlow

I am Usha, taking the Deep Learning Specialization. I am currently in the last week of the 5th course, 'Sequence Models'. While doing the assignment on the Transformer network, I got the following error in the decoder layer, even though the code follows the given instructions.

It says "cannot compute Einsum as input #1(zero-based) was expected to be a int64 tensor but is a float tensor [Op:Einsum]"

Is this a datatype mismatch? Any hints on how to overcome this issue, please?

Below is the error output from running the 'DecoderLayer' test. I only understand that it is a data type mismatch, but I could not figure out the cause.

InvalidArgumentError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

InvalidArgumentError: cannot compute Einsum as input #1(zero-based) was expected to be a int64 tensor but is a float tensor [Op:Einsum]
Thank you
Usha

Please post a screen capture image, instead of a text copy-and-paste. Show the entire assert stack of messages.

Did your code use the tf.einsum() function?

Hi,
Here is the list of errors displayed when running the 'decoder layer' test code.
The code does not use the tf.einsum() function.
Thanks
Usha
…Error List…

InvalidArgumentError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

~/work/W4A1/public_tests.py in DecoderLayer_test(target, create_look_ahead_mask)
185 # Now let's try a example with padding mask
186 padding_mask = np.array([[[1, 1, 0]]])
--> 187 out, attn_w_b1, attn_w_b2 = decoderLayerq(q, encoderq_output, True, look_ahead_mask, padding_mask)
188 assert np.allclose(out[0, 0], [0.14950314, -1.6444231, 1.0268553, 0.4680646]), "Wrong values in out when we mask the last word. Are you passing the padding_mask to the inner functions?"
189

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in call(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
--> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

in call(self, x, enc_output, training, look_ahead_mask, padding_mask)
60 # Dropout will be applied during training
61 # Return attention scores as attn_weights_block2 (~1 line)
--> 62 mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, padding_mask, return_attention_scores=True) # (batch_size, target_seq_len, d_model)
63
64 # apply layer normalization (layernorm2) to the sum of the attention output and the output of the first block (~1 line)

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in call(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
--> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/multi_head_attention.py in call(self, query, value, key, attention_mask, return_attention_scores, training)
466
467 # key = [B, S, N, H]
--> 468 key = self._key_dense(key)
469
470 # value = [B, S, N, H]

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in call(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
--> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/einsum_dense.py in call(self, inputs)
199
200 def call(self, inputs):
--> 201 ret = special_math_ops.einsum(self.equation, inputs, self.kernel)
202 if self.bias is not None:
203 ret += self.bias

/opt/conda/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
199 """Call target, and fall back on dispatchers if there is a TypeError."""
200 try:
--> 201 return target(*args, **kwargs)
202 except (TypeError, ValueError):
203 # Note: convert_to_eager_tensor currently raises a ValueError, not a

/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/special_math_ops.py in einsum(equation, *inputs, **kwargs)
749 - number of inputs or their shapes are inconsistent with equation.
750 """
--> 751 return _einsum_v2(equation, *inputs, **kwargs)
752
753

/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/special_math_ops.py in _einsum_v2(equation, *inputs, **kwargs)
1178 if ellipsis_label:
1179 resolved_equation = resolved_equation.replace(ellipsis_label, '...')
--> 1180 return gen_linalg_ops.einsum(inputs, resolved_equation)
1181
1182 # Send fully specified shapes to opt_einsum, since it cannot handle unknown

/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/gen_linalg_ops.py in einsum(inputs, equation, name)
1074 return _result
1075 except _core._NotOkStatusException as e:
--> 1076 _ops.raise_from_not_ok_status(e, name)
1077 except _core._FallbackException:
1078 pass

/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
6860 message = e.message + (" name: " + name if name is not None else "")
6861 # pylint: disable=protected-access
--> 6862 six.raise_from(core._status_to_exception(e.code, message), None)
6863 # pylint: enable=protected-access
6864

/opt/conda/lib/python3.7/site-packages/six.py in raise_from(value, from_value)

InvalidArgumentError: cannot compute Einsum as input #1(zero-based) was expected to be a int64 tensor but is a float tensor [Op:Einsum]

Sorry, you asked me to post a screenshot. I will do it right away.

Hi,
Here are the screenshots of the error when running the decoder layer.

Capture5
Thanks
Usha

I have attached only the screenshot of the error list. Let me know if I need to send the code as well. If so, how do I post it privately to get support?
Usha

Hi Tom,

The decoder layer code is given below:

Capture5.PNG

Capture2.PNG


I don't have access to the course materials right now, but your call to mha2 may be wrong. You need to provide the query, key, and value arguments, but you only provided two of them.
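In the decoder's second attention block, both the key and the value should come from the encoder output, and the padding mask is a separate argument. If the mask ends up in the key slot, the layer tries to project an integer mask tensor, which is the kind of thing that produces an int64-vs-float Einsum error. As a rough illustration (a NumPy sketch of scaled dot-product cross-attention with my own helper name, not the assignment's code):

```python
import numpy as np

def cross_attention(query, key, value, padding_mask=None):
    """Minimal scaled dot-product cross-attention (NumPy sketch).

    query: (batch, tgt_len, d); key/value: (batch, src_len, d)
    padding_mask: (batch, 1, src_len) with 1 = attend, 0 = mask out.
    """
    d = query.shape[-1]
    # Attention scores: (batch, tgt_len, src_len)
    scores = query @ key.transpose(0, 2, 1) / np.sqrt(d)
    if padding_mask is not None:
        # Push masked positions toward -inf before the softmax
        scores = np.where(padding_mask == 1, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ value, weights

rng = np.random.default_rng(0)
q = rng.standard_normal((1, 2, 4))           # decoder queries
enc_output = rng.standard_normal((1, 3, 4))  # encoder output
padding_mask = np.array([[[1, 1, 0]]])       # mask the last encoder position

# Both K and V come from the encoder output; the mask rides separately.
out, attn = cross_attention(q, enc_output, enc_output, padding_mask)
print(out.shape, attn.shape)  # (1, 2, 4) (1, 2, 3)
print(attn[0, :, -1])         # ~0: the masked position gets no attention
```

The same idea applies to the Keras layer: pass the query, the encoder output for both value and key, and the mask by its own parameter, rather than letting the mask fill a positional tensor slot.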

Thank you, Tom. It worked. I passed the encoder output twice, for the K and V parameters.

Although I passed the test by following all the instructions, I still need to get more clarity on the Transformer network. It is still very vague to me.

Thanks once again
Usha
