Softmax not defined

I am working on the C5_W4_A1_Transformer_Subclass_v1 lab, at UNQ_C3.

I am trying the following line:

attention_weights = softmax(scaled_attention_logits) * v

and I get NameError: name 'softmax' is not defined

Here is the list of TensorFlow-related imports:

import tensorflow as tf
from tensorflow.keras.layers import Embedding, MultiHeadAttention, Dense, Input, Dropout, LayerNormalization

Since softmax itself was not imported by name but is available as part of tensorflow, you need to call it by its full path, such as tf.keras.activations.softmax.


Of course, you can import softmax directly, just like the other Keras layers, but I'm not sure how the grader evaluates that. So it is safer to write the full path for softmax here.
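As a small illustration of the general rule, here is a sketch that uses the standard-library math module as a stand-in for tensorflow (an assumption made only so the snippet runs anywhere; it is not the lab's code). A bare name that was never imported raises NameError, while the fully qualified name or an explicit import both work.

```python
import math  # stands in for `import tensorflow as tf` in this sketch

# Calling a bare name that was never imported raises NameError.
try:
    sqrt(16.0)
    bare_name_worked = True
except NameError:
    bare_name_worked = False

# Option 1: qualify the name with the module it lives in.
qualified_result = math.sqrt(16.0)

# Option 2: import the name explicitly, then use it bare.
from math import sqrt
imported_result = sqrt(16.0)

print(bare_name_worked, qualified_result, imported_result)
```

The same reasoning applies to softmax: either spell out the module path or import the name first.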

Now when I run the following:

print(tf.keras.activations.softmax(scaled_attention_logits, axis=-1).shape)
attention_weights = tf.matmul(tf.keras.activations.softmax(scaled_attention_logits, axis=-1) , v) # (…, seq_len_q, seq_len_k)

I get:

(3, 4)
(4, 2)
(3, 2)

but the test says the final shape is incorrect; how can this be if v.shape is (4,2) ?

I assume that you are still working on UNQ_C3. As you can see, there are two variables that you need to return. One is “output”, and the other is “attention_weights”. Have you returned both variables correctly?

  1. “attention_weights” is the outcome of softmax()
  2. “output” is the result of matrix multiplication of “attention_weights” and “v”.

You need two separate steps.
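To make the shapes concrete, here is a rough NumPy sketch of the two steps (not the lab's TensorFlow code; the softmax helper and the random inputs are made up for illustration, using the same (3, 4) and (4, 2) shapes as in the question).

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max along the axis for numerical stability.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=axis, keepdims=True)

rng = np.random.default_rng(0)
scaled_attention_logits = rng.normal(size=(3, 4))  # (seq_len_q, seq_len_k)
v = rng.normal(size=(4, 2))                        # (seq_len_k, depth_v)

# Step 1: softmax over the last axis gives the attention weights.
attention_weights = softmax(scaled_attention_logits, axis=-1)

# Step 2: multiplying the weights by v gives the output.
output = attention_weights @ v

print(attention_weights.shape)  # (3, 4)
print(output.shape)             # (3, 2)
```

So a (3, 2) result is the shape of “output”, while “attention_weights” keeps the (3, 4) shape of the logits.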

ok, thanks. I corrected my return variables.

I am trying to understand the following instruction as I am still getting test failures on the masked attention weights:

Multiply (1. - mask) by -1e9 before applying the softmax.

Where would I insert the (1. - mask) * -1e9 term ?

You should have a place to apply the mask just before applying softmax.

First of all, posting your code in this community is not recommended. Please remove it.

Note that -0.000000001 is -1e-09, not -1e09. You can use -1.0e09 as the number in your equation.
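To see why the sign of the exponent matters, here is a minimal pure-Python sketch (not the lab's code; the helper names softmax and masked_softmax and the sample values are made up for illustration). It assumes the usual convention that a mask value of 1 means "keep" and 0 means "mask out", and that the term (1. - mask) * constant is added to the logits before the softmax.

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def masked_softmax(logits, mask, constant):
    # Add (1 - mask) * constant to each logit, then apply softmax.
    shifted = [x + (1.0 - m) * constant for x, m in zip(logits, mask)]
    return softmax(shifted)

logits = [2.0, 1.0, 0.5]
mask = [1.0, 1.0, 0.0]  # the last position should be ignored

with_large_negative = masked_softmax(logits, mask, -1e9)   # intended constant
with_tiny_negative = masked_softmax(logits, mask, -1e-9)   # the typo

print(with_large_negative)
print(with_tiny_negative)
```

With -1e9 the masked position ends up with a weight of zero, whereas with -1e-9 the logits are almost unchanged and the masked position still receives a noticeable share of the attention.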

sorry about the post; i deleted it; thanks for the hint