C5 Week 4 A1 Exercise 4

I am unable to understand why my output tensor does not have the expected shape.

{mentor edit: code removed}

tf.Tensor(
[[[[ 0.11490856  0.4693238  -0.6517217   0.13027893]
   [ 0.          0.48687497 -0.6237864   0.18140751]
   [ 0.10580825  0.4657544  -0.66466635  0.11691388]]]], shape=(1, 1, 3, 4), dtype=float32)
tf.Tensor(
[[[[ 0.11490856  0.4693238  -0.6517217   0.13027893]
   [ 0.          0.48687497 -0.6237864   0.18140751]
   [ 0.10580825  0.4657544  -0.66466635  0.11691388]]]], shape=(1, 1, 3, 4), dtype=float32)
tf.Tensor(
[[[[-0.19707678 -1.0017939  -0.45860898  1.6574798 ]
   [-1.218006    1.3352634  -0.6613954   0.54413795]
   [ 0.5050506   0.5882081  -1.7301245   0.63686574]]]], shape=(1, 1, 3, 4), dtype=float32)

AssertionError                            Traceback (most recent call last)
in <module>
     17
     18
---> 19 EncoderLayer_test(EncoderLayer)

in EncoderLayer_test(target)
      7
      8     assert tf.is_tensor(encoded), "Wrong type. Output must be a tensor"
----> 9     assert tuple(tf.shape(encoded).numpy()) == (1, q.shape[1], q.shape[2]), f"Wrong shape. We expected ((1, {q.shape[1]}, {q.shape[2]}))"
     10
     11     assert np.allclose(encoded.numpy(),

AssertionError: Wrong shape. We expected ((1, 3, 4))

Try this:
In "ffn_output = …", for dropout2(), do not use "training=training".
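For context, the difference this tip points at: tf.keras.layers.Dropout only randomly zeroes activations when it runs in training mode. A minimal standalone sketch:

import tensorflow as tf

dropout = tf.keras.layers.Dropout(0.1)
x = tf.ones((1, 3, 4))

# With an explicit flag, dropout is only active when training=True:
y_train = dropout(x, training=True)   # some entries zeroed, the rest scaled by 1/(1 - rate)

# Without the flag, Keras falls back to the ambient learning phase,
# which defaults to inference, so dropout is a no-op here:
y_infer = dropout(x)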

Still no luck, same error.

I think your second print() statement is using the wrong variable name.

{mentor edit: code removed}

# UNIT TEST

def EncoderLayer_test(target):
    q = np.array([[[1, 0, 1, 1], [0, 1, 1, 1], [1, 0, 0, 1]]]).astype(np.float32)
    encoder_layer1 = target(4, 2, 8)
    tf.random.set_seed(10)
    encoded = encoder_layer1(q, True, np.array([[1, 0, 1]]))

    assert tf.is_tensor(encoded), "Wrong type. Output must be a tensor"
    assert tuple(tf.shape(encoded).numpy()) == (1, q.shape[1], q.shape[2]), f"Wrong shape. We expected ((1, {q.shape[1]}, {q.shape[2]}))"

    assert np.allclose(encoded.numpy(),
                       [[-0.5214877 , -1.001476  , -0.12321664,  1.6461804 ],
                        [-1.3114998 ,  1.2167752 , -0.5830886 ,  0.6778133 ],
                        [ 0.25485858,  0.3776546 , -1.6564771 ,  1.023964  ]],), "Wrong values"

    print("\033[92mAll tests passed")

EncoderLayer_test(EncoderLayer)

tf.Tensor(
[[[[ 0.26296833  0.5438655  -0.4769561   0.43180233]
   [ 0.          0.55163157 -0.47251672  0.44105402]
   [ 0.26371577  0.53527516 -0.46818826  0.440089  ]]]], shape=(1, 1, 3, 4), dtype=float32)
tf.Tensor(
[[[[ 0.7840514  -0.9639455  -1.0145588   1.1944535 ]
   [-1.3642251   1.0410855  -0.5465303   0.86967015]
   [ 0.76012594 -0.20960003 -1.545446    0.9949202 ]]]], shape=(1, 1, 3, 4), dtype=float32)
tf.Tensor(
[[[[-1.3002528e+00  0.0000000e+00  8.6987352e-01  3.1139547e-01]
   [ 0.0000000e+00  4.3286264e-01 -0.0000000e+00 -7.4604503e-04]
   [-7.4690622e-01  3.2295811e-01 -0.0000000e+00 -3.5450643e-01]]]], shape=(1, 1, 3, 4), dtype=float32)
tf.Tensor(
[[[[-0.5214876  -1.0014758  -0.1232168   1.6461803 ]
   [-1.3115      1.2167752  -0.58308864  0.67781335]
   [ 0.25485852  0.37765458 -1.656477    1.0239639 ]]]], shape=(1, 1, 3, 4), dtype=float32)

AssertionError                            Traceback (most recent call last)
in <module>
     17
     18
---> 19 EncoderLayer_test(EncoderLayer)

in EncoderLayer_test(target)
      7
      8     assert tf.is_tensor(encoded), "Wrong type. Output must be a tensor"
----> 9     assert tuple(tf.shape(encoded).numpy()) == (1, q.shape[1], q.shape[2]), f"Wrong shape. We expected ((1, {q.shape[1]}, {q.shape[2]}))"
     10
     11     assert np.allclose(encoded.numpy(),

AssertionError: Wrong shape. We expected ((1, 3, 4))

I am using print() statements to debug the code.

The problem is that all your tensors are four-dimensional. They should be three-dimensional.
So where the shape should be (1, 3, 4), yours is (1, 1, 3, 4).
I do not know how that would happen.
Maybe there is another error in one of the functions that EncoderLayer() uses.
Or did you modify some other code in the function that you should not have?
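One way to localize where the extra dimension first appears is to check the rank of each intermediate result inside call(). A minimal sketch, meant to be pasted between the sub-layers (the variable names are illustrative):

# Each intermediate result should keep the 3-D
# (batch_size, seq_len, embedding_dim) shape; the first check that
# fails points at the offending sub-layer.
self_attn_output = self.mha(x, x, x, mask)
tf.debugging.assert_rank(self_attn_output, 3, message="mha output is not 3-D")

out1 = self.layernorm1(x + self_attn_output)
tf.debugging.assert_rank(out1, 3, message="layernorm1 output is not 3-D")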

Will you send me a fresh notebook for the Transformer subclass? I will copy my code there and see if it's working.
This would be very helpful.

Please check my EncoderLayer() function:

# UNQ_C4 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)

# GRADED FUNCTION EncoderLayer

class EncoderLayer(tf.keras.layers.Layer):
    """
    The encoder layer is composed of a multi-head self-attention mechanism,
    followed by a simple, position-wise fully connected feed-forward network.
    This architecture includes a residual connection around each of the two
    sub-layers, followed by layer normalization.
    """
    def __init__(self, embedding_dim, num_heads, fully_connected_dim, dropout_rate=0.1, layernorm_eps=1e-6):
        super(EncoderLayer, self).__init__()

        self.mha = MultiHeadAttention(num_heads=num_heads,
                                      key_dim=embedding_dim)

        self.ffn = FullyConnected(embedding_dim=embedding_dim,
                                  fully_connected_dim=fully_connected_dim)

        self.layernorm1 = LayerNormalization(epsilon=layernorm_eps)
        self.layernorm2 = LayerNormalization(epsilon=layernorm_eps)

        self.dropout1 = Dropout(dropout_rate)
        self.dropout2 = Dropout(dropout_rate)
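For orientation, a forward pass matching this __init__ would typically look like the sketch below. It follows the generic Transformer encoder-layer pattern described in the docstring, and is not necessarily the graded solution:

    def call(self, x, training, mask):
        # Self-attention sub-layer: query, value and key are all x.
        self_attn_output = self.mha(x, x, x, mask)    # (batch_size, seq_len, embedding_dim)
        self_attn_output = self.dropout1(self_attn_output, training=training)
        # Residual connection around the sub-layer, then layer normalization.
        out1 = self.layernorm1(x + self_attn_output)

        # Position-wise feed-forward sub-layer, with the same residual pattern.
        ffn_output = self.ffn(out1)                   # (batch_size, seq_len, embedding_dim)
        ffn_output = self.dropout2(ffn_output, training=training)
        return self.layernorm2(out1 + ffn_output)    # (batch_size, seq_len, embedding_dim)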

Hey TMosh, thank you for your help. It had been bugging me since last night. I have solved the issue now.

There was a trailing comma right after this line:
self_attn_output = self.mha(x, x, x, mask),  <----
which I overlooked. It should be just:
self_attn_output = self.mha(x, x, x, mask)
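For anyone else hitting this symptom: the trailing comma makes the right-hand side a one-element tuple, and when that tuple is later used in arithmetic, TensorFlow converts it to a tensor by stacking, which prepends the extra dimension. A minimal reproduction:

import tensorflow as tf

x = tf.ones((1, 3, 4))                # (batch_size, seq_len, embedding_dim)

self_attn_output = tf.identity(x),    # trailing comma: a 1-element tuple!
print(type(self_attn_output))         # <class 'tuple'>

# In the residual addition, TensorFlow stacks the tuple into a tensor,
# adding a leading dimension before broadcasting:
out1 = x + self_attn_output
print(out1.shape)                     # (1, 1, 3, 4) instead of (1, 3, 4)

This also explains why the unit test failed on the shape assertion: every downstream tensor inherits the extra leading dimension.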


Thanks for your report on the error.

Thanks for this, @gupta4661.
You have saved my life.


Can you please tell me why you've put three x's in self.mha(x, x, x, mask)?
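In tf.keras.layers.MultiHeadAttention, the call signature is mha(query, value, key, attention_mask). Passing the same tensor x for query, value, and key is exactly what makes the layer perform self-attention: every position attends over the same sequence it belongs to. A minimal sketch:

import tensorflow as tf

mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=4)
x = tf.random.normal((1, 3, 4))    # (batch_size, seq_len, embedding_dim)
mask = tf.constant([[1, 0, 1]])    # padding mask, as in the unit test

# query = value = key = x  ->  self-attention over the same sequence.
out = mha(x, x, x, attention_mask=mask)
print(out.shape)                   # (1, 3, 4)

In the decoder's cross-attention you would instead pass different tensors, e.g. mha(x, enc_output, enc_output, mask).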