Week 4 Assignment 1: Transformers Architecture with TensorFlow, Exercise 8 (Transformer)

I get an error in the last part of this exercise: “InvalidArgumentError: cannot compute Einsum as input #1(zero-based) was expected to be a int64 tensor but is a float tensor [Op:Einsum]”.

I searched for this error, and it seems to be related to dtype. However, all my tests passed, and I think I cast to float32 correctly wherever it was needed. I do not understand what causes this error. Please help! Thanks so much.

Below is my implementation:
{mentor edit: code removed}

Please don’t post your code on the forum.

In your self.decoder() arguments, you need to use enc_output instead of input_sentence.
In self.final_layer(), you need to use dec_output.
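
To make the data flow concrete without posting assignment code, here is a rough sketch in plain Python. The functions encoder, decoder, final_layer and transformer are hypothetical stand-ins, not the notebook's classes, and the arithmetic inside them is made up. The only point is the wiring: the decoder receives enc_output (floats), and the final layer receives dec_output. Passing the raw integer token ids to the decoder is the kind of mix-up that would be consistent with a dtype mismatch like the Einsum error above.

```python
# Hypothetical stand-ins for the assignment's encoder, decoder and final layer.
# The arithmetic is arbitrary; only the order of the calls matters here.

def is_float_matrix(x):
    # True only for a list of lists of floats (i.e. feature vectors).
    return all(
        isinstance(row, list) and all(isinstance(v, float) for v in row)
        for row in x
    )

def encoder(input_sentence):
    # Integer token ids go in, float feature vectors come out.
    return [[float(tok), float(tok) * 0.5] for tok in input_sentence]

def decoder(output_sentence, enc_output):
    # The decoder attends over float encoder features, not raw token ids.
    if not is_float_matrix(enc_output):
        raise TypeError("decoder expected float encoder features, got token ids")
    context = sum(sum(row) for row in enc_output) / len(enc_output)
    return [[float(tok) + context] for tok in output_sentence]

def final_layer(dec_output):
    # Stand-in for the last projection layer.
    return [[2.0 * v for v in row] for row in dec_output]

def transformer(input_sentence, output_sentence):
    enc_output = encoder(input_sentence)               # step 1
    dec_output = decoder(output_sentence, enc_output)  # step 2: enc_output, not input_sentence
    return final_layer(dec_output)                     # step 3: dec_output

result = transformer([1, 2, 3], [4, 5])
```

Calling decoder([4, 5], [1, 2, 3]) in this sketch raises a TypeError, which mimics passing input_sentence where enc_output belongs.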


I'm stuck on Exercise 4.
How do I perform multi-head attention using self.mha?
I can't wrap my head around that line of code.
Please help.

Since this is self-attention, you use ‘x’ for all three matrices. The fourth argument is the ‘mask’.
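
If it helps to see what "x for all three matrices" means, below is a minimal single-head sketch in plain Python. It is not the Keras layer and not the assignment code. The attention function, the toy x, and the mask values are illustrative assumptions. The argument order (query, value, key, mask) is chosen to mirror how the Keras MultiHeadAttention layer orders its inputs, and the mask is assumed to use 1 for "may attend" and 0 for "blocked".

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, value, key, mask):
    # Single-head scaled dot-product attention on lists of vectors.
    # mask[i][j] == 1 lets position i attend to position j; 0 blocks it.
    d_k = len(key[0])
    output = []
    for i, q in enumerate(query):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in key]
        scores = [s if mask[i][j] else -1e9 for j, s in enumerate(scores)]
        weights = softmax(scores)
        output.append(
            [sum(w * v[c] for w, v in zip(weights, value)) for c in range(len(value[0]))]
        )
    return output

# Toy input: three positions, two features each. The third position is padding.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask = [[1, 1, 0], [1, 1, 0], [1, 1, 0]]

# Self-attention: the same x is used as query, value and key.
self_attn_output = attention(x, x, x, mask)
```

Because the third position is masked out, every output row here is a weighted mix of the first two rows of x only.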


Thank you very much for your help.
I have another issue with the programming homework.
Whenever I run the code for the decoder model, I keep getting an error during the decoder test. I have cross-checked my code and didn’t find any errors.
This is the error I get: ‘Wrong values in outd’. That is the error for



I would appreciate a little help with this.
Kind regards

The message tells you there is an error in your code for computing the “outd” value.

That comes from the ‘x’ variable returned by Decoder().

I have tried redoing the code from the start and cross-checking all of it. At this point I am stuck.

Be sure you don’t hard-code the “training = …” argument to a specific value.
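
Here is a small plain-Python illustration of why a hard-coded training flag can produce "wrong values" in a test. The dropout function and the two decoder_layer variants are hypothetical stand-ins, not the assignment's layers. The assumption is that the unit test calls the layer with training=False and expects deterministic output.

```python
import random

def dropout(values, rate, training):
    # Inverted dropout: only active when training is True.
    if not training:
        return list(values)
    return [0.0 if random.random() < rate else v / (1.0 - rate) for v in values]

def decoder_layer_hardcoded(x, training):
    # Bug: the caller's flag is ignored, so dropout is always applied.
    return dropout(x, rate=0.5, training=True)

def decoder_layer(x, training):
    # The caller's flag is passed through unchanged.
    return dropout(x, rate=0.5, training=training)

x = [1.0, 2.0, 3.0, 4.0]
good = decoder_layer(x, training=False)           # identical to x
bad = decoder_layer_hardcoded(x, training=False)  # each value is zeroed or doubled
```

With rate=0.5, every element of bad is either 0.0 or twice the input, so it can never match the values a test computed with dropout switched off.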