Scope of model.compile(), model.fit() and model.__init__() / model.build() in custom and distributed training

mirko.bruhn · January 24, 2023, 10:33am

Hi,
I am trying to figure out the scope of the different methods listed above.

My understanding is that custom training will replace:

model.fit(): reducing the loss, adjusting the weights, calculating gradients
model.compile(): defining the loss, metrics and optimizer
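
As a rough illustration of that split, here is a minimal sketch of a custom training loop, assuming TensorFlow 2.x in eager mode. The toy data, layer size, learning rate and step count are made up for the example and are not taken from the course lab. The loss and optimizer that model.compile() would normally receive are created by hand, and the loop body does the work that model.fit() would otherwise do.

```python
import tensorflow as tf

tf.random.set_seed(0)  # make the toy run repeatable

# Toy regression data (illustrative only): the target is the sum of 3 features.
x = tf.random.normal((32, 3))
y = tf.reduce_sum(x, axis=1, keepdims=True)

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])

# What model.compile() would otherwise hold on to.
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# What model.fit() would otherwise do on every step.
losses = []
for step in range(20):
    with tf.GradientTape() as tape:
        pred = model(x, training=True)   # forward pass (builds the model on first call)
        loss = loss_fn(y, pred)          # compute the loss
    grads = tape.gradient(loss, model.trainable_variables)            # calculate gradients
    optimizer.apply_gradients(zip(grads, model.trainable_variables))  # adjust weights
    losses.append(float(loss))

print("first loss:", losses[0], "last loss:", losses[-1])
```

Note that no input shape is given anywhere: the weights are created the first time the model is called inside the loop.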

But I am still at a loss regarding the initialization of the model's weights.
Previously I understood that the initial weights of a model are already defined once the model constructor model.__init__() is called, which in turn calls model.build()
(At least this is what I understood from the course on custom layers and models.)
To do so, the model constructor already needs to know the shape of the inputs.

But in the multi-GPU mirrored strategy lab, I do not see where the input shape of the model is defined.
When is the model initialized, and where do I specify the input shape?
And will each model (since every GPU has its own copy) have the same set of initial weights?
Can we choose how to initialize the weights (since there are different strategies)?

Thanks! Mirko

Wendy · January 24, 2023, 10:17pm

@mirko.bruhn, if the shape isn’t known initially, the weights won’t be created until the model is first used, that is, called on some input (or built explicitly with model.build() and an input shape). The “Specifying the input shape in advance” section in the Keras Sequential model documentation has a good general explanation with examples.
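
To make the deferred creation concrete, here is a small sketch, assuming TensorFlow 2.x. The layer sizes, the 3-feature input and the "he_normal" initializer are arbitrary choices for illustration. It shows a layer with no weights until its first call, an explicit model.build() as the alternative, and the kernel_initializer argument as one way to choose the initialization scheme.

```python
import tensorflow as tf

# A layer created without an input shape has no weights yet.
# kernel_initializer selects how the kernel will be initialized once it is built.
layer = tf.keras.layers.Dense(4, kernel_initializer="he_normal")
built_before = layer.built
n_weights_before = len(layer.weights)
print("built before call:", built_before, "| weights:", n_weights_before)

# The first call reveals the input shape (3 features here), which triggers build().
_ = layer(tf.ones((2, 3)))
kernel_shape = tuple(layer.kernel.shape)
print("built after call:", layer.built, "| kernel shape:", kernel_shape)

# Alternatively, build a model explicitly by passing the input shape up front.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4),
    tf.keras.layers.Dense(1),
])
model.build(input_shape=(None, 3))  # None stands for the batch dimension
n_model_weights = len(model.weights)
print("model weights after build():", n_model_weights)
```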
