Scope of model.compile(), model.fit(), model.__init__() and model.build() in custom and distributed training

Hi,
I am trying to figure out the scope of the different methods listed above.

My understanding is that custom training will replace:

  • model.fit() : calculating gradients, adjusting the weights, reducing the loss
  • model.compile() : defining the loss, metrics and optimizer
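
For reference, here is a minimal sketch of what such a custom loop might look like. It is not taken from the lab: the toy one-layer model, the random data, and the name train_step are all made up for illustration. The loss and optimizer objects stand in for what model.compile() would configure, and the loop stands in for model.fit().

```python
import tensorflow as tf

# Hypothetical toy model; note that no input shape is given here.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])

# These two objects replace what model.compile() would set up.
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Random stand-in data: 32 samples with 4 features each.
x = tf.random.normal((32, 4))
y = tf.random.normal((32, 1))

def train_step(x, y):
    # This function replaces one step of model.fit().
    with tf.GradientTape() as tape:
        pred = model(x, training=True)   # forward pass
        loss = loss_fn(y, pred)          # compute the loss
    # Compute gradients and apply them to the weights.
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

losses = [float(train_step(x, y)) for _ in range(5)]
```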

But I am still at a loss regarding the initialization of the model's weights.
Previously I understood that the initial weights of a model are already defined once the model constructor model.__init__() is called, which in turn calls model.build()
(At least this is what I understood from the course on custom layers and models)
To do so, the model constructor already needs to know the shape of the inputs.

But in the multi-GPU mirrored strategy lab, I do not see where the input shape of the model is defined.
When is the model initialized, and where do I specify the input shape?
And will each model copy (since every GPU has its own) have the same set of initial weights?
Can we choose how to initialize the weights (since there are different strategies)?

Thanks! Mirko

@mirko.bruhn, if the input shape isn’t known when the model is constructed, the weights won’t be created until the model is first called on some input (or until build() is called with an input shape). The “Specifying the input shape in advance” section in the Keras Sequential model documentation has a good general explanation with examples.
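
As a small illustration of that deferred creation, here is a sketch using a single Dense layer. The layer size, the dummy input, and the choice of "he_normal" are my own assumptions for the example, not something from the lab. It also shows that the initialization scheme can be picked per layer through the kernel_initializer argument.

```python
import tensorflow as tf

# "he_normal" is just one example; any Keras initializer name
# or initializer object could be passed here instead.
layer = tf.keras.layers.Dense(3, kernel_initializer="he_normal")

# No input shape is known yet, so the layer has no weights.
weights_before = len(layer.weights)

# The first call reveals the input shape (4 features here),
# which triggers build() and creates kernel and bias.
x = tf.ones((1, 4))
_ = layer(x)

weights_after = len(layer.weights)
kernel_shape = tuple(layer.kernel.shape)

print(weights_before, weights_after, kernel_shape)
```

So the constructor does not need the input shape; it only has to be known by the time the layer or model is first called.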
