C5_W4_Ex-8: class Transformer(tf.keras.Model)

AssertionError: Wrong values in outd

I got an error at Exercise 8, in class Transformer(tf.keras.Model).

Here is the error, raised at the last line of my code. Please help me find and fix it.

    # call self.decoder with the appropriate arguments to get the decoder output
    dec_output, attention_weights = self.decoder(tar, enc_output, training, look_ahead_mask, dec_padding_mask)

    # pass decoder output through a linear layer and softmax (~2 lines)
    final_output = self.final_layer(dec_output)  # (batch_size, tar_seq_len, target_vocab_size)

    AssertionError                        Traceback (most recent call last)
    ---> 65 Transformer_test(Transformer)

    in Transformer_test(target)
         47 assert np.allclose(translation[0, 0, 0:8],
         48     [[0.02664799, 0.02222014, 0.01641812, 0.02407483,
    ---> 49       0.04251551, 0.02240461, 0.01556584, 0.03741234]]), "Wrong values in outd"
         51 keys = list(weights.keys())

    AssertionError: Wrong values in outd

{mentor edit: the answer given is now obsolete}


I have the same problem. Something is confusing: the instructions seem to request two Dense layers at the end, but only one is instantiated in the constructor. Also, the comment in the code says "~2 lines of code", so I added an extra Dense layer with linear activation before the softmax:

    final_output = Dense(dec_output.shape[-1], activation='linear')(dec_output)
    final_output = self.final_layer(final_output)  # (batch_size, tar_seq_len, target_vocab_size)

Of course I tried both with and without the extra layer, and in both cases I got the same error posted in this thread.

I did submit anyway, as you suggested, and yes, I passed with 87%, but the error is still there.

I think you misread the instructions. You're not asked to add two Dense layers. The "~2" means "approximately 2" lines of code, but you really only need one.

In total, you only add three lines of code in this function, and all of them are calls to methods defined in the constructor:

  • self.encoder(…)
  • self.decoder(…)
  • self.final_layer(…)
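To show how those three calls fit together, here is a minimal framework-free sketch of the wiring inside call(). The encoder, decoder, and final layer below are toy stand-ins (simple matrix multiplies with made-up dimensions), not the assignment's actual implementations; only the order of the three calls and the shapes of the inputs and outputs reflect the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the components built in the Transformer constructor.
# The real encoder/decoder are stacks of attention layers; these are
# plain linear maps, just to illustrate how call() chains the pieces.
d_model, vocab_size = 4, 6
W_enc = rng.normal(size=(d_model, d_model))
W_dec = rng.normal(size=(d_model, d_model))
W_final = rng.normal(size=(d_model, vocab_size))

def encoder(inp):
    return inp @ W_enc  # (batch, inp_seq_len, d_model)

def decoder(tar, enc_output):
    # The target sequence is the FIRST argument; the decoder also
    # consumes the encoder output. Returns (output, attention_weights).
    out = tar @ W_dec + enc_output.mean(axis=1, keepdims=True)
    return out, {"attn": None}

def final_layer(dec_output):
    # One layer does both the linear projection and the softmax.
    logits = dec_output @ W_final
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def call(inp, tar):
    enc_output = encoder(inp)                                  # line 1
    dec_output, attention_weights = decoder(tar, enc_output)   # line 2
    final_output = final_layer(dec_output)                     # line 3
    return final_output, attention_weights

inp = rng.normal(size=(2, 5, d_model))  # (batch, inp_seq_len, d_model)
tar = rng.normal(size=(2, 3, d_model))  # (batch, tar_seq_len, d_model)
out, _ = call(inp, tar)
print(out.shape)  # (2, 3, 6): (batch_size, tar_seq_len, target_vocab_size)
```

Note the output's last dimension is the target vocabulary size and each position's probabilities sum to 1, which is what the unit test compares against.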

Thanks @TMosh. Yes, obviously I tried that before. Same result. I will check though.

I can’t think of any good reason to use two lines of code there, since you’re given the exact function you need in the constructor.

I suspect the comment about using two lines is just wrong.

In my case the issue was caused by passing the wrong first parameter to self.decoder. After correcting it, the issue was resolved.


Thanks for the explanation

Isn’t it required to pass the decoder output through a linear transformation layer before the Softmax layer, as per the architecture?

Please identify which architecture you’re discussing specifically.

A screen capture image would be helpful, along with a reference to where you found it in the course materials.

I am referring to the Transformer network architecture. Even in the assignment, the figure shows a linear layer followed by a softmax layer.

Is this what you’re discussing? (screen capture from the assignment notebook):

Oh, I see, I missed your reference to “Exercise 8” in the thread title.

So it’s this figure you’re referring to (at the red arrow):

Technically that’s not part of the Decoder. That’s the top layer in the Transformer object.

It’s referred to here in this cell, as part of the Transformer object’s call() method.