I’m wondering if anyone could help me find some resources that might explain how to train the Transformer model built in the assignment.
As soon as I finished the assignment, I wanted to see if I could use this implementation to train a transformer on a dataset of indexed input/output sentences I already had prepared.
I tried different variations of
model = Transformer(…)
optimizer = …
model.compile(…)
model.fit(…)
and invariably got the following error:
"Models passed to `fit` can only have `training` and the first argument in `call` as positional arguments, found: ['tar', 'enc_padding_mask', 'look_ahead_mask', 'dec_padding_mask']"
In the TensorFlow documentation for tf.keras.Model, I see that the call method should use a list of inputs and a list of masks. Should I try changing the assignment code to match this structure, or am I wasting my time trying to train the model in this way?
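To make the question concrete, here is a minimal sketch of the kind of restructuring I have in mind, assuming the call signature is the only real obstacle. `TinySeq2Seq` is a hypothetical stand-in for the assignment's Transformer (same call arguments, but it ignores the masks), and `FitWrapper` and the random data are also made up for illustration. The wrapper packs `inp` and `tar` into the single first argument that `fit` accepts and passes everything else by keyword.

```python
import numpy as np
import tensorflow as tf


class TinySeq2Seq(tf.keras.Model):
    """Hypothetical stand-in for the assignment's Transformer (same call signature)."""

    def __init__(self, vocab_size, d_model=16):
        super().__init__()
        self.embed_inp = tf.keras.layers.Embedding(vocab_size, d_model)
        self.embed_tar = tf.keras.layers.Embedding(vocab_size, d_model)
        self.out = tf.keras.layers.Dense(vocab_size)

    def call(self, inp, tar, training, enc_padding_mask,
             look_ahead_mask, dec_padding_mask):
        # The masks are ignored in this stand-in; a real Transformer would use them.
        context = tf.reduce_mean(self.embed_inp(inp), axis=1, keepdims=True)
        return self.out(self.embed_tar(tar) + context)  # (batch, tar_len, vocab)


class FitWrapper(tf.keras.Model):
    """Hypothetical wrapper: only `inputs` and `training` appear in call()."""

    def __init__(self, transformer):
        super().__init__()
        self.transformer = transformer

    def call(self, inputs, training=None):
        inp, tar = inputs  # unpack the single positional argument
        # Assumption: real masks would be built here from inp and tar.
        return self.transformer(
            inp, tar,
            training=training,
            enc_padding_mask=None,
            look_ahead_mask=None,
            dec_padding_mask=None,
        )


vocab_size = 20
rng = np.random.default_rng(0)
inp = rng.integers(1, vocab_size, size=(8, 5))  # made-up indexed sentences
tar = rng.integers(1, vocab_size, size=(8, 5))
tar_in, tar_out = tar[:, :-1], tar[:, 1:]       # teacher forcing: shift by one

model = FitWrapper(TinySeq2Seq(vocab_size))
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
history = model.fit((inp, tar_in), tar_out, epochs=2, batch_size=4, verbose=0)
```

In a real version, the three `None` values would be replaced by masks built from `inp` and `tar` inside `FitWrapper.call`, and the loss would need to ignore padded positions.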
I’m a bit frustrated that this assignment uses the APIs in ways that are never introduced or explained elsewhere in the DeepLearning.AI courses, and that it doesn’t provide an example of training on some data.