Hi, I'm a little confused here. In the Deep Learning Specialization we learned that the shape of W is (n[l], n[l-1]), but here it is W: (input_shape[-1], n[l]). Why is n[l] on the column side instead of the row side?
The self.w here is the layer's own weight rather than a single "main" weight matrix for the whole network, whereas the build(self, input_shape) method plays the same role as the weight setup in DLS.
In this course, as far as I remember, loss functions are used with GradientTape, which lets you build a loss from trainable and non-trainable variables. The equation you are pointing at is not the network's main weight matrix but the layer's own weight equation, and which weights get updated depends on which layers are set to be trainable. See the sketch below.
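For reference, here is a minimal sketch of the kind of custom layer being discussed, with the weight created in build from input_shape (the class name SimpleDense and the initializer choices are illustrative, not necessarily the course's exact code):

```python
import tensorflow as tf

class SimpleDense(tf.keras.layers.Layer):
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # W has shape (input_dim, units): rows index inputs, columns index units,
        # so input_shape[-1] comes first and n[l] (units) comes second.
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer="zeros",
            trainable=True,
        )

    def call(self, inputs):
        # Batch-major matmul: (batch, input_dim) @ (input_dim, units) -> (batch, units)
        return tf.matmul(inputs, self.w) + self.b
```

Because Keras feeds the layer a (batch, input_dim) tensor, the matmul works out only if W is (input_dim, units); that is why the orientation is flipped relative to DLS.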
There is no universal standard for the orientation of the weight matrix; either orientation can be used, depending on the design of the model.
The two conventions are mathematically equivalent, differing only by a transposition.
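A quick NumPy check makes the equivalence concrete (all shapes and variable names below are illustrative). DLS stacks examples as columns and computes z = W a with W of shape (n[l], n[l-1]); TF/Keras stacks examples as rows and computes x @ W with W of shape (input_dim, units). One is the transpose of the other:

```python
import numpy as np

rng = np.random.default_rng(0)
n_prev, n_l, batch = 4, 3, 5

W_dls = rng.normal(size=(n_l, n_prev))   # DLS: (n[l], n[l-1])
A = rng.normal(size=(n_prev, batch))     # DLS: examples stacked as columns

z_dls = W_dls @ A                        # shape (n_l, batch)

W_tf = W_dls.T                           # TF/Keras: (input_dim, units)
X = A.T                                  # TF/Keras: examples stacked as rows
z_tf = X @ W_tf                          # shape (batch, n_l)

# Same numbers, just a transposed layout.
assert np.allclose(z_dls, z_tf.T)
```

So the layer computes exactly the same linear map; only the bookkeeping convention differs.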