Question about UNQ_C3 Mask

Hello,
I am struggling to implement the UNQ_C3 mask.

I am confused about what shape it should take. Should it be applied to both the encoder and the decoder?

I am very lost.

Hi Uzay,

Apologies for the belated reply.

In case you have not resolved this issue yet, look at the hint. If inputs are positive for real tokens and 0 when padding, which boolean operation could you use to set a mask to 1 for every element of the input with a real token and to 0 for every padding token?

The mask should be applied across the entire model, so that only real tokens affect the outcome.
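
To make the hint concrete, here is a minimal NumPy sketch of the general idea, not the assignment's solution. The function name `padding_mask`, the example token ids, and the per-token loss values are all made up for illustration, and it assumes padding is encoded as 0. The mask shape your assignment expects may differ (for example, extra axes for broadcasting), so check the notebook's docstring.

```python
import numpy as np

def padding_mask(inputs):
    """Return 1.0 where `inputs` holds a real token (id > 0), 0.0 where it holds padding (id == 0)."""
    # The comparison yields a boolean array with the same shape as `inputs`.
    return (inputs > 0).astype(np.float32)

# Hypothetical batch: 2 sequences padded with 0 to length 5.
inputs = np.array([[5, 12, 7, 0, 0],
                   [3, 9, 0, 0, 0]])
mask = padding_mask(inputs)

# Hypothetical per-token losses; the values at padded positions are junk.
per_token_loss = np.array([[0.5, 1.0, 1.5, 9.0, 9.0],
                           [2.0, 4.0, 9.0, 9.0, 9.0]])

# Multiplying by the mask zeroes out padded positions, and dividing by
# mask.sum() averages over real tokens only.
masked_mean_loss = (per_token_loss * mask).sum() / mask.sum()
```

Here the junk values at the padded positions do not contribute to `masked_mean_loss`, which is what "only real tokens affect the outcome" means in practice.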