C4W1 UNQ_C4: Completely stuck

I don't understand what I am supposed to do to implement this function:

NMTAttn(input_vocab_size=33300,
target_vocab_size=33300,
d_model=1024,
n_encoder_layers=2,
n_decoder_layers=2,
n_attention_heads=4,
attention_dropout=0.0,
mode='train')

There is no reference to input or target tokens in the function arguments, and these are not subsequently defined in the function. Are we supposed to use the variables assigned in the prior statement in Section 1.5?

input_batch, target_batch, mask_batch = next(train_batch_stream)

If so, that is bad programming. This function is not a class method, so it should not use any variables defined outside the function. It is unnecessarily time-consuming to scroll up and review every line of code in the assignment to find what should be passed to the function.

Then I see a statement like this:

Step 4: prepare queries, keys, values and mask for attention.

  None('PrepareAttentionInput', None, n_out=4),

At first, I thought we would be calling prepare_attention_input(encoder_activations, decoder_activations, inputs) here. But 'PrepareAttentionInput' is not what I would pass in as the argument for encoder_activations, so I am not sure what to do here.

I would appreciate it if someone could answer my questions and perhaps point me to where I can look up code for calls like None('PrepareAttentionInput', None, n_out=4).

Hey @bmabbe81,

You are not supposed to use the variables that you defined earlier; everything you need is inside the function itself. The key point is that implementing this function requires neither the “input tokens” nor the “target tokens”. The function only defines the model architecture, and for that we only need the attributes of the input and output (such as the vocabulary sizes), which are passed in as arguments to the function.
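To make the distinction between defining a model and feeding it data concrete, here is a minimal sketch in plain Python. It is not Trax code and not the assignment solution; `make_toy_model` and its all-zeros embedding table are hypothetical names invented purely for illustration. The builder only receives sizes, and tokens first appear when the finished model is called.

```python
# Illustrative sketch only: plain Python, NOT the Trax API and NOT the
# assignment solution. All names here are hypothetical.

def make_toy_model(vocab_size, d_model):
    """Build a toy 'model' from hyperparameters alone; no tokens are needed."""
    # Hypothetical embedding table: one row of d_model zeros per vocabulary id.
    embedding = [[0.0] * d_model for _ in range(vocab_size)]

    def model(token_ids):
        # Tokens are supplied only here, when the finished model is called.
        return [embedding[t] for t in token_ids]

    return model


# Defining the architecture: only sizes are required.
toy = make_toy_model(vocab_size=10, d_model=4)

# Running the model: this is the first point where actual tokens appear.
activations = toy([1, 5, 7])
print(len(activations), len(activations[0]))  # prints: 3 4
```

In the same way, the arguments of NMTAttn describe the shape of the model, and the batches from train_batch_stream only come into play later, when the model is trained or evaluated.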

That should be enough information to give this implementation one more try. If you still run into issues, let us know the step numbers so that we can discuss those steps in greater depth.

Cheers,
Elemento