I am on C5_W4_A1_Transformer_Subclass_v1 lab and on UNQ_C3.

I am trying the following line:

attention_weights = softmax(scaled_attention_logits) * v

and I get NameError: name 'softmax' is not defined


Here is the list of tensorflow related package import.

```
import tensorflow as tf
from tensorflow.keras.layers import Embedding, MultiHeadAttention, Dense, Input, Dropout, LayerNormalization
```

Since softmax is not imported directly, but is available as part of TensorFlow, you need to specify the full path, like

```
tf.keras.activations.softmax(....
```

Of course, you could import softmax just like the other Keras layers, but I’m not sure how the grader evaluates that. So, it is safer to write the full path for softmax here.
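As a quick sanity check of the full-path call (the tensor values below are made up purely for illustration, not assignment code):

```python
import tensorflow as tf

# Hypothetical logits, just to demonstrate calling softmax by its full path.
scaled_attention_logits = tf.constant([[1.0, 2.0, 3.0],
                                       [0.5, 0.5, 0.5]])

# Works without any extra import beyond `import tensorflow as tf`.
attention_weights = tf.keras.activations.softmax(scaled_attention_logits, axis=-1)

print(attention_weights.shape)                       # (2, 3)
print(tf.reduce_sum(attention_weights, axis=-1))     # each row sums to ~1.0
```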


ok

Now when I run the following:

```
print(tf.keras.activations.softmax(scaled_attention_logits, axis=-1).shape)
print(v.shape)
attention_weights = tf.matmul(tf.keras.activations.softmax(scaled_attention_logits, axis=-1), v)  # (…, seq_len_q, seq_len_k)
print(attention_weights.shape)
```

I get:

```
(3, 4)
(4, 2)
(3, 2)
```

but the test says the final shape is incorrect; how can this be if v.shape is (4,2) ?

I assume that you are still working on UNQ_C3. As you can see, there are two variables that you need to return: one is “output”, and the other is “attention_weights”. Have you returned both variables correctly?

- “attention_weights” is the outcome of softmax()
- “output” is the result of matrix multiplication of “attention_weights” and “v”.

You need two separate steps.
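The two steps above can be sketched generically (the shapes are chosen to match the (3, 4) and (4, 2) example earlier in this thread; the random tensors are placeholders, not the graded notebook code):

```python
import tensorflow as tf

# Hypothetical scaled logits (seq_len_q=3, seq_len_k=4) and values (seq_len_k=4, depth_v=2).
scaled_attention_logits = tf.random.normal((3, 4))
v = tf.random.normal((4, 2))

# Step 1: attention_weights is the outcome of softmax over the last axis.
attention_weights = tf.keras.activations.softmax(scaled_attention_logits, axis=-1)  # (3, 4)

# Step 2: output is the matrix multiplication of attention_weights and v.
output = tf.matmul(attention_weights, v)  # (3, 2)

print(attention_weights.shape, output.shape)
```

Keeping the two results in separate variables makes it clear which one each return slot expects.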

ok, thanks. I corrected my return variables.

I am trying to understand the following instruction as I am still getting test failures on the masked attention weights:

```
Multiply (1. - mask) by -1e9 before applying the softmax.
```

Where would I insert the (1. - mask) * -1e9 term ?

You should have a place to apply the mask just before applying softmax.

First of all, posting your code in this community is not recommended. Please remove them.

-0.000000001 = -1e-09, which is not -1e9. You can use -1.0e9 as the number in your equation.
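To see why the magnitude matters (the logits and mask below are made up for illustration, not assignment code): adding `(1. - mask) * -1e9` drives masked positions toward zero weight after the softmax, whereas `-1e-9` is a tiny nudge that leaves them essentially unchanged.

```python
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.0]])
mask = tf.constant([[1.0, 1.0, 0.0]])  # last position should be masked out

# Correct: a huge negative number effectively zeroes the masked position.
masked = logits + (1.0 - mask) * -1e9
print(tf.keras.activations.softmax(masked, axis=-1))         # third weight is ~0

# Wrong: -1e-9 barely changes the logit, so the masked position keeps real weight.
barely_masked = logits + (1.0 - mask) * -1e-9
print(tf.keras.activations.softmax(barely_masked, axis=-1))  # third weight is still ~0.09
```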

Sorry about the post; I deleted it. Thanks for the hint!