Question about attention weights

Regarding the e^{<t,t'>}'s in the figure below: are these values and their corresponding \alpha^{<t,t'>}'s scalars?

Yes, they are scalars. For an in-depth understanding, please revisit the Neural Machine Translation assignment. There we have

repeator = RepeatVector(Tx)
concatenator = Concatenate(axis=-1)
densor1 = Dense(10, activation = "tanh")
densor2 = Dense(1, activation = "relu")
activator = Activation(softmax, name='attention_weights') # We are using a custom softmax(axis = 1) loaded in this notebook
dotor = Dot(axes = 1)


    # Use repeator to repeat s_prev to be of shape (m, Tx, n_s) so that you can concatenate it with all hidden states "a" (≈ 1 line)
    s_prev = repeator(s_prev)
    # Use concatenator to concatenate a and s_prev on the last axis (≈ 1 line)
    # For grading purposes, please list 'a' first and 's_prev' second, in this order.
    concat = concatenator([a, s_prev])
    # Use densor1 to propagate concat through a small fully-connected neural network to compute the "intermediate energies" variable e. (≈ 1 line)
    e = densor1(concat)
    # Use densor2 to propagate e through a small fully-connected neural network to compute the "energies" variable energies. (≈ 1 line)
    energies = densor2(e)
    # Use "activator" on "energies" to compute the attention weights "alphas" (≈ 1 line)
    alphas = activator(energies)
    # Use dotor together with "alphas" and "a", in this order, to compute the context vector to be given to the next (post-attention) LSTM-cell (≈ 1 line)
    context = dotor([alphas, a])
    ### END CODE HERE ###

You see that we first use 10 units, then 1 unit.
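The shape bookkeeping in that snippet can be sketched in plain NumPy. This is only an illustration with made-up sizes and random weights, not the assignment's Keras layers, but each step mirrors one of the layer calls above:

```python
import numpy as np

# Hypothetical sizes, just for illustration (not the assignment's real values).
m, Tx, n_a, n_s = 2, 5, 4, 8

rng = np.random.default_rng(0)
a = rng.standard_normal((m, Tx, 2 * n_a))   # pre-attention Bi-LSTM hidden states
s_prev = rng.standard_normal((m, n_s))      # previous post-attention LSTM state

# repeator: (m, n_s) -> (m, Tx, n_s)
s_rep = np.repeat(s_prev[:, None, :], Tx, axis=1)

# concatenator on the last axis: (m, Tx, 2*n_a + n_s)
concat = np.concatenate([a, s_rep], axis=-1)

# densor1: Dense(10, tanh) applied independently at each time step
W1 = rng.standard_normal((2 * n_a + n_s, 10))
e = np.tanh(concat @ W1)                    # (m, Tx, 10)

# densor2: Dense(1, relu) collapses the 10 values to one scalar energy per t'
W2 = rng.standard_normal((10, 1))
energies = np.maximum(e @ W2, 0.0)          # (m, Tx, 1)

# custom softmax over the Tx axis (axis=1): alphas sum to 1 across t'
exp = np.exp(energies - energies.max(axis=1, keepdims=True))
alphas = exp / exp.sum(axis=1, keepdims=True)   # (m, Tx, 1)

# dotor with axes=1: weighted sum of the a's -> context (m, 1, 2*n_a)
context = np.einsum('mti,mtj->mij', alphas, a)
```

Each e^{<t,t'>} (an entry of `energies`) and each \alpha^{<t,t'>} (an entry of `alphas`) is a single scalar, one per input time step t'.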


For some time I couldn’t understand why we needed densor2, but now it makes sense. It combines all ten outputs from densor1 and passes the result through ReLU to get a single scalar output per time step.
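In miniature, that combining step is just a learned weighted sum of the 10 intermediate energies followed by ReLU. The numbers below are made up purely to show the reduction from 10 values to one scalar:

```python
import numpy as np

# Hypothetical densor1 outputs for one (t, t') pair: 10 "intermediate energies".
e = np.array([0.3, -0.1, 0.7, 0.0, 0.2, -0.5, 0.9, 0.1, -0.2, 0.4])

# densor2 is Dense(1, relu): a learned weighted sum of those 10 values,
# then ReLU. The weights and bias here are invented for illustration.
w = np.linspace(-0.5, 0.5, 10)
b = 0.1
energy = max(float(e @ w + b), 0.0)  # one scalar energy e^{<t,t'>}
```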


Same here — I had to write down the details a couple of times to notice what was going on.

As I understand it, the concatenation step yields (m, Tx, n_s + 2*n_a) → this propagates through densor1 to yield (m, Tx, 10) → which then propagates through densor2 to yield (m, Tx, 1), i.e. one scalar weight (energy) value associated with each input time step t'.

We then apply softmax over the Tx axis to get attention weights that sum to 1.
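That normalization can be checked directly. Here is a tiny sketch, with invented energies for a single example of shape (Tx, 1): the softmax runs over the Tx axis, so the resulting alphas sum to 1 across the input steps:

```python
import numpy as np

# Hypothetical energies for one example: one scalar per input time step t'.
energies = np.array([[0.0], [1.0], [2.0], [0.5]])   # shape (Tx, 1)

# Softmax over the Tx axis (axis=0 here, since the batch axis is dropped).
exp = np.exp(energies - energies.max(axis=0, keepdims=True))
alphas = exp / exp.sum(axis=0, keepdims=True)       # shape (Tx, 1), sums to 1
```

This axis choice is exactly why the notebook loads a custom softmax(axis=1): Keras's default Activation('softmax') would normalize over the last axis, which has size 1 here and would make every weight 1.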
