Transformer architecture using TensorFlow

I am Usha, taking the Deep Learning Specialization. I am currently in the last week of the 5th course, 'Sequence Models'. While doing the assignment on the Transformer network, I got the following error in the decoder layer, even though the code follows the given instructions.

It says "cannot compute Einsum as input #1(zero-based) was expected to be a int64 tensor but is a float tensor [Op:Einsum]"

Is this a datatype mismatch? Any hints on how to overcome this issue, please?

Below is the error output from running the 'DecoderLayer' test. I only understand that it is a data type mismatch, but I could not figure out the cause.

InvalidArgumentError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

InvalidArgumentError: cannot compute Einsum as input #1(zero-based) was expected to be a int64 tensor but is a float tensor [Op:Einsum]
Thank you
Usha

Please post a screen capture image, instead of a text copy-and-paste. Show the entire assert stack of messages.

Did your code use the tf.einsum() function?

Hi,
Here is the list of errors displayed when running the 'decoder layer' test code.
The code does not use the tf.einsum() function.
Thanks
Usha
…Error List…

InvalidArgumentError Traceback (most recent call last)
in
1 # UNIT TEST
----> 2 DecoderLayer_test(DecoderLayer, create_look_ahead_mask)

~/work/W4A1/public_tests.py in DecoderLayer_test(target, create_look_ahead_mask)
185 # Now let's try a example with padding mask
186 padding_mask = np.array([[[1, 1, 0]]])
--> 187 out, attn_w_b1, attn_w_b2 = decoderLayerq(q, encoderq_output, True, look_ahead_mask, padding_mask)
188 assert np.allclose(out[0, 0], [0.14950314, -1.6444231, 1.0268553, 0.4680646]), "Wrong values in out when we mask the last word. Are you passing the padding_mask to the inner functions?"
189

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in call(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
--> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

in call(self, x, enc_output, training, look_ahead_mask, padding_mask)
60 # Dropout will be applied during training
61 # Return attention scores as attn_weights_block2 (~1 line)
--> 62 mult_attn_out2, attn_weights_block2 = self.mha2(Q1, enc_output, padding_mask, return_attention_scores=True) # (batch_size, target_seq_len, d_model)
63
64 # apply layer normalization (layernorm2) to the sum of the attention output and the output of the first block (~1 line)

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in call(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
--> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/multi_head_attention.py in call(self, query, value, key, attention_mask, return_attention_scores, training)
466
467 # key = [B, S, N, H]
--> 468 key = self._key_dense(key)
469
470 # value = [B, S, N, H]

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in call(self, *args, **kwargs)
1010 with autocast_variable.enable_auto_cast_variables(
1011 self._compute_dtype_object):
--> 1012 outputs = call_fn(inputs, *args, **kwargs)
1013
1014 if self._activity_regularizer:

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/einsum_dense.py in call(self, inputs)
199
200 def call(self, inputs):
--> 201 ret = special_math_ops.einsum(self.equation, inputs, self.kernel)
202 if self.bias is not None:
203 ret += self.bias

/opt/conda/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
199 """Call target, and fall back on dispatchers if there is a TypeError."""
200 try:
--> 201 return target(*args, **kwargs)
202 except (TypeError, ValueError):
203 # Note: convert_to_eager_tensor currently raises a ValueError, not a

/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/special_math_ops.py in einsum(equation, *inputs, **kwargs)
749 - number of inputs or their shapes are inconsistent with equation.
750 """
--> 751 return _einsum_v2(equation, *inputs, **kwargs)
752
753

/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/special_math_ops.py in _einsum_v2(equation, *inputs, **kwargs)
1178 if ellipsis_label:
1179 resolved_equation = resolved_equation.replace(ellipsis_label, '...')
--> 1180 return gen_linalg_ops.einsum(inputs, resolved_equation)
1181
1182 # Send fully specified shapes to opt_einsum, since it cannot handle unknown

/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/gen_linalg_ops.py in einsum(inputs, equation, name)
1074 return _result
1075 except _core._NotOkStatusException as e:
--> 1076 _ops.raise_from_not_ok_status(e, name)
1077 except _core._FallbackException:
1078 pass

/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
6860 message = e.message + (" name: " + name if name is not None else "")
6861 # pylint: disable=protected-access
--> 6862 six.raise_from(core._status_to_exception(e.code, message), None)
6863 # pylint: enable=protected-access
6864

/opt/conda/lib/python3.7/site-packages/six.py in raise_from(value, from_value)

InvalidArgumentError: cannot compute Einsum as input #1(zero-based) was expected to be a int64 tensor but is a float tensor [Op:Einsum]

Sorry, you asked me to post a screenshot. I will do it right away.

Hi,
Here are the screenshots of the error when running the decoder layer.

Capture5
Thanks
Usha

I have attached only the screenshot of the error list. Let me know if I need to send the code as well. If so, how do I post it privately to get support?
Usha

Hi Tom,

The decoder layer code is given below:

Capture5.PNG

Capture2.PNG


I don't have access to the course materials right now, but your call to mha2 may be wrong. You need to provide the query, key, and value arguments, but you only provided two of them.
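In the decoder's second attention block, both the key and the value should come from the encoder output, and the padding mask is a separate argument. If the mask ends up in the key slot, the layer tries to project an integer mask tensor, which is the kind of thing that produces an int64-vs-float Einsum error. As a rough illustration (a NumPy sketch of scaled dot-product cross-attention with my own helper name, not the assignment's code):

```python
import numpy as np

def cross_attention(query, key, value, padding_mask=None):
    """Minimal scaled dot-product cross-attention (NumPy sketch).

    query: (batch, tgt_len, d); key/value: (batch, src_len, d)
    padding_mask: (batch, 1, src_len) with 1 = attend, 0 = mask out.
    """
    d = query.shape[-1]
    # Attention scores: (batch, tgt_len, src_len)
    scores = query @ key.transpose(0, 2, 1) / np.sqrt(d)
    if padding_mask is not None:
        # Push masked positions toward -inf before the softmax
        scores = np.where(padding_mask == 1, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ value, weights

rng = np.random.default_rng(0)
q = rng.standard_normal((1, 2, 4))           # decoder queries
enc_output = rng.standard_normal((1, 3, 4))  # encoder output
padding_mask = np.array([[[1, 1, 0]]])       # mask the last encoder position

# Both K and V come from the encoder output; the mask rides separately.
out, attn = cross_attention(q, enc_output, enc_output, padding_mask)
print(out.shape, attn.shape)  # (1, 2, 4) (1, 2, 3)
print(attn[0, :, -1])         # ~0: the masked position gets no attention
```

The same idea applies to the Keras layer: pass the query, the encoder output for both value and key, and the mask by its own parameter, rather than letting the mask fill a positional tensor slot.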

Thank you, Tom. It worked. I passed the encoder output twice, for the K and V parameters.

Although I passed the test by following all the instructions, I still need to get more clarity on the Transformer network. It is still very vague to me.

Thanks once again
Usha
