C5 W3 A1 NMT with attention v4


When creating the one_step_attention function, we made use of the following global variables:
repeator = RepeatVector(Tx)
concatenator = Concatenate(axis=-1)
densor1 = Dense(10, activation = "tanh")
densor2 = Dense(1, activation = "relu")
activator = Activation(softmax, name='attention_weights') # We are using a custom softmax(axis = 1) loaded in this notebook
dotor = Dot(axes = 1)
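
For reference, here is a minimal sketch of how these layers are typically wired together inside one_step_attention. The sizes (Tx, n_a, n_s) and the batch in the shape check are illustrative assumptions, not values from this thread, and the custom softmax is reimplemented here so the snippet is self-contained:

```python
import tensorflow as tf
from tensorflow.keras.layers import (RepeatVector, Concatenate, Dense,
                                     Activation, Dot)

Tx, n_a, n_s = 30, 32, 64  # illustrative sizes, not from this thread

def softmax(x, axis=1):
    # Stand-in for the notebook's custom softmax over the time axis
    return tf.keras.activations.softmax(x, axis=axis)

repeator = RepeatVector(Tx)
concatenator = Concatenate(axis=-1)
densor1 = Dense(10, activation="tanh")
densor2 = Dense(1, activation="relu")
activator = Activation(softmax, name="attention_weights")
dotor = Dot(axes=1)

def one_step_attention(a, s_prev):
    # a: (m, Tx, 2*n_a) encoder activations; s_prev: (m, n_s) decoder state
    s_prev = repeator(s_prev)           # (m, Tx, n_s): copy state across time
    concat = concatenator([a, s_prev])  # (m, Tx, 2*n_a + n_s)
    e = densor1(concat)                 # (m, Tx, 10): hidden "energy" layer
    energies = densor2(e)               # (m, Tx, 1): one score per time step
    alphas = activator(energies)        # (m, Tx, 1): softmax over time (axis 1)
    context = dotor([alphas, a])        # (m, 1, 2*n_a): weighted sum of a
    return context

# Quick shape check with random tensors
a = tf.random.normal((4, Tx, 2 * n_a))
s_prev = tf.random.normal((4, n_s))
print(one_step_attention(a, s_prev).shape)  # (4, 1, 64)
```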

For educational purposes, how did you come up with the densors and their specs?

I'm going to guess that design comes from the original attention paper (Bahdanau et al., 2014), plus a heavy dose of experimentation and experience on the part of the paper's authors.
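
For context, the alignment model in Bahdanau et al. (2014) scores encoder position $j$ against the previous decoder state $s_{i-1}$ as

$$e_{ij} = v_a^\top \tanh\!\left(W_a s_{i-1} + U_a h_j\right),$$

so densor1 plays the role of the tanh hidden layer (width 10 here) and densor2 the scalar projection $v_a$. The relu on densor2 appears to be a choice made for this assignment rather than something from the paper.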