C5_W4_A1 Exercise 4 Encoder Layer

I need help understanding the right syntax for the Encoder Layer. Am I not slicing the input sequence x correctly?
The following code does NOT work (moderators: please remove/redact as appropriate).

# START CODE HERE
# moderator edit: code removed     
# END CODE HERE
    return encoder_layer_out

You should use the functions created in the constructor.
So instead of calling MultiHeadAttention() directly, you use "self.mha(...)" from the constructor.
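
As a rough sketch of that pattern (illustrative only, not the assignment's EncoderLayer -- the layer name and dimensions below are made up): a Keras layer builds its sub-layers once in the constructor and then reuses them inside call().

import tensorflow as tf

class TinySelfAttention(tf.keras.layers.Layer):
    # Sketch: sub-layers are created ONCE here, in the constructor...
    def __init__(self, embedding_dim=4, num_heads=2):
        super().__init__()
        self.mha = tf.keras.layers.MultiHeadAttention(num_heads=num_heads,
                                                      key_dim=embedding_dim)
        self.layernorm = tf.keras.layers.LayerNormalization(epsilon=1e-6)

    def call(self, x, training=False, mask=None):
        # ...and reused here via self.mha(...), never by calling MultiHeadAttention(...) directly.
        # (training is unused in this toy example.)
        attn_output = self.mha(x, x, x, attention_mask=mask)
        return self.layernorm(x + attn_output)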


Thanks TMosh. However, I am very confused about the arguments to pass to self.mha(), or indeed to the subsequent functions (out, ffn_output, ...) in this Exercise 4.

The comments suggest (batch_size, input_seq_len, fully_connected_dim), but the 'Additional hints' section suggests (to me) something more complicated.

Again, any tips appreciated.

NON-WORKING CODE BELOW:


# START CODE HERE

# moderator edit: code removed

# END CODE HERE

return encoder_layer_out

For self-attention, the ‘x’ matrix is used for each of Q, V, and K. See this from the instructions:

The instructions provide similar guidance for all the other lines of code.
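
For what it's worth, here is a minimal sketch of what the self-attention line looks like with the standard tf.keras.layers.MultiHeadAttention call signature (query, value, key, attention_mask). The variable names follow the notebook's comments; this is an illustration, not your redacted code.

# Self-attention: the same input tensor x supplies query, value, and key,
# and the padding mask goes in as the attention mask.
attn_output = self.mha(query=x, value=x, key=x, attention_mask=mask)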

If you're going to keep posting your code, please use the "preformatted text" tag. This will prevent the code comments from being interpreted as Markdown (that's why your post is full of bold text).

Hello TMosh, I am following the instructions as best I can: applying the boolean mask, and leaving training at its default (not applied until the Dropout layer?). But the error now is that call() is missing a positional argument 'value'.
Dysfunctional code below (preformatted text this time :) ):

 # START CODE HERE
 # moderator edit: code removed
 # END CODE HERE
        

Please post a screen capture image showing the entire error message and the whole assert stack.

Read my previous reply again. You need to pass ‘x’ to the self.mha(…) function three times - once for each of the Q, V, and K matrices.
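
To illustrate why that matters (a sketch, assuming the layer is the stock tf.keras.layers.MultiHeadAttention): its call signature is roughly call(query, value, key=None, attention_mask=None, ...), so both query and value must be supplied. The "missing a positional argument: 'value'" error above is what the layer raises when it does not receive a value argument.

attn_output = self.mha(x, x, x, mask)  # positionally: query=x, value=x, key=x, attention_mask=mask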

Thanks, but I may still be missing a syntax or format issue:

{moderator edit: code removed}

A screenshot (bottom part only) of the error message:

That part of the assert log doesn’t help me, because I can’t see which line of your code caused the issue.

You only need self.mha(x, x, x, mask).

If you’re having a problem, it might be somewhere else.

Sorry to bother you, but I used exactly what you suggested:

# START CODE HERE
# moderator edit: code removed
# END CODE HERE

TypeError                                 Traceback (most recent call last)
<ipython-input-...> in <module>
      1 # UNIT TEST
----> 2 EncoderLayer_test(EncoderLayer)

~/work/W4A1/public_tests.py in EncoderLayer_test(target)
     84     encoder_layer1 = target(4, 2, 8)
     85     tf.random.set_seed(10)
---> 86     encoded = encoder_layer1(q, True, np.array([[1, 0, 1]]))
     87
     88     assert tf.is_tensor(encoded), "Wrong type. Output must be a tensor"

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
   1010         with autocast_variable.enable_auto_cast_variables(
   1011             self._compute_dtype_object):
-> 1012           outputs = call_fn(inputs, *args, **kwargs)
   1013
   1014         if self._activity_regularizer:

<ipython-input-...> in call(self, x, training, mask)
     39         # START CODE HERE
     40         # calculate self-attention using mha(~1 line). Dropout will be applied during training
---> 41         attn_output = self.mha(x, x, x, mask)  # Self attention (batch_size, input_seq_len, fully_connected_dim)
     42
     43         # apply layer normalization on sum of the input and the attention output to get the

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
   1010         with autocast_variable.enable_auto_cast_variables(
   1011             self._compute_dtype_object):
-> 1012           outputs = call_fn(inputs, *args, **kwargs)
   1013
   1014         if self._activity_regularizer:

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/multi_head_attention.py in call(self, query, value, key, attention_mask, return_attention_scores, training)
    463     #   H = size_per_head
    464     # query = [B, T, N, H]
-> 465     query = self._query_dense(query)
    466
    467     # key = [B, S, N, H]

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
   1006       with ops.name_scope_v2(name_scope):
   1007         if not self.built:
-> 1008           self._maybe_build(inputs)
   1009
   1010         with autocast_variable.enable_auto_cast_variables(

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in _maybe_build(self, inputs)
   2708           # operations.
   2709           with tf_utils.maybe_init_scope(self):
-> 2710             self.build(input_shapes)  # pylint:disable=not-callable
   2711           # We must set also ensure that the layer is marked as built, and the build
   2712           # shape is stored since user defined build functions may not be calling

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/einsum_dense.py in build(self, input_shape)
    152         constraint=self.kernel_constraint,
    153         dtype=self.dtype,
-> 154         trainable=True)
    155
    156     if bias_shape is not None:

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py in add_weight(self, name, shape, dtype, initializer, regularizer, trainable, constraint, use_resource, synchronization, aggregation, **kwargs)
    637         synchronization=synchronization,
    638         aggregation=aggregation,
-> 639         caching_device=caching_device)
    640     if regularizer is not None:
    641       # TODO(fchollet): in the future, this should be handled at the

/opt/conda/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py in _add_variable_with_custom_getter(self, name, shape, dtype, initializer, getter, overwrite, **kwargs_for_getter)
    808         dtype=dtype,
    809         initializer=initializer,
-> 810         **kwargs_for_getter)
    811
    812     # If we set an initializer and the variable processed it, tracking will not

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer_utils.py in make_variable(name, shape, dtype, initializer, trainable, caching_device, validate_shape, constraint, use_resource, collections, synchronization, aggregation, partitioner)
    127   # TODO(apassos,rohanj) figure out how to remove collections from here so we
    128   # can remove the V1.
-> 129   variable_shape = tensor_shape.TensorShape(shape)
    130   return tf_variables.VariableV1(
    131       initial_value=init_val,

/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/tensor_shape.py in __init__(self, dims)
    756     """
    757     if isinstance(dims, (tuple, list)):  # Most common case.
-> 758       self._dims = [Dimension(d) for d in dims]
    759     elif dims is None:
    760       self._dims = None

/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/tensor_shape.py in <listcomp>(.0)
    756     """
    757     if isinstance(dims, (tuple, list)):  # Most common case.
-> 758       self._dims = [Dimension(d) for d in dims]
    759     elif dims is None:
    760       self._dims = None

/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/tensor_shape.py in __init__(self, value)
    204           TypeError("Dimension value must be integer or None or have "
    205                     "an __index__ method, got value '{0!r}' with type '{1!r}'"
-> 206                     .format(value, type(value))), None)
    207     if self._value < 0:
    208       raise ValueError("Dimension %d must be >= 0" % self._value)

/opt/conda/lib/python3.7/site-packages/six.py in raise_from(value, from_value)

TypeError: Dimension value must be integer or None or have an __index__ method, got value '<__main__.EncoderLayer object at 0x7ff5a00c72d0>' with type '<class '__main__.EncoderLayer'>'

Requesting any tip or hint as to the problem here. I have tried what TMosh recommended: using self.mha(x, x, x, mask) to pass x to Q, V, and K for self-attention, along with the boolean mask.

  • So much of these problem sets is about the implementation, not the concepts themselves :frowning:

TMosh or another mentor, please help. I am twisting my head not over some conceptual issue but possibly over a syntax or tf.keras implementation question (sorry, I am a total novice there ;-( ).

The last recommended hint did not work…


# START CODE HERE

# moderator edit: code removed

# END CODE HERE

I have found the error in your code:
ffn_output = self.dropout_ffn(ffn_output, training=True)
That line should use "training=training".

When you force training=True, you will get incorrect results when the unit test sets training to False.
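
In code terms, the fix is just to forward the training flag that call() receives (a one-line sketch; dropout_ffn here is assumed to be the tf.keras.layers.Dropout instance created in the constructor):

# Forward the flag so dropout is only active when the layer is called with training=True
ffn_output = self.dropout_ffn(ffn_output, training=training)

That way the unit test, which also calls the layer with training set to False, sees the behaviour it expects.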
