In part 4.3 Model representation it is written that W_l will be of dimension s_in × s_out.
Therefore, for layer1 the shape of W1 is (400, 25).
Since X has shape (5000, 400), does it mean that in every node there will be 400 randomly set weights, one for each of the 400 inputs? And will this procedure be repeated 5000 times for every node?

Have I understood correctly that every training example (5000 training examples in total) goes through this NN one by one (5000 times sequentially)? And that this gives us 157 batches in every epoch?

No. We first group them into mini-batches. If the mini-batch size is 100, the samples are grouped into 5000/100 = 50 mini-batches.

Then we pass the whole mini-batch through the NN instead of passing it sample by sample, because processing the whole mini-batch at once is computationally faster.
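The grouping step can be sketched in numpy. This is only an illustration, assuming random placeholder data with the shapes from this thread:

```python
import numpy as np

# Placeholder data standing in for the course's X: 5000 samples, 400 features
X = np.random.rand(5000, 400)
batch_size = 100

# Group the samples into 5000/100 = 50 mini-batches
batches = [X[i:i + batch_size] for i in range(0, len(X), batch_size)]
print(len(batches))      # 50 mini-batches
print(batches[0].shape)  # each one has shape (100, 400)
```

Each forward pass then consumes one of these (100, 400) arrays at once rather than 100 separate (400,) vectors.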

I try to understand the transformation of the data during the training process of NN. If my path of thinking is correct:

The input X of shape (5000, 400) is split into mini-batches of 32 samples each (βThe default size of a batch in Tensorflow is 32β); dividing the total number of training examples by the batch size gives the number of batches per epoch.

Therefore every batch of the input X now has the shape (32, 400).

We multiply this batch matrix by the matrix of weights with shape (400, 25), which gives an output of shape (32, 25).
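A quick shape check of that multiplication, using random placeholder arrays with the shapes discussed above:

```python
import numpy as np

batch = np.random.rand(32, 400)  # one mini-batch: (batch_size, n_features)
W1 = np.random.rand(400, 25)     # layer-1 weights: (n_inputs, n_units)
b1 = np.zeros(25)                # layer-1 biases, one per unit

# (32, 400) @ (400, 25) -> (32, 25): 25 pre-activation values per sample
Z1 = batch @ W1 + b1
print(Z1.shape)  # (32, 25)
```

The inner dimensions (400 and 400) must match; the result keeps the batch dimension 32 and the unit dimension 25.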

The weights stick to the model and not to the samples. Since we have one model, we have one and only one set of weights.

The weights are only initialized once. If you are training a model, then after the first mini-batch those weights get updated and their values change. After the second mini-batch they change again, and so on.
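That initialize-once, update-every-mini-batch cycle can be sketched with a toy single-layer model and plain gradient descent (the data, learning rate, and loss here are all illustrative stand-ins, not the course's model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data with the shapes from this thread
X = rng.random((5000, 400))
y = rng.random((5000, 25))

W = rng.normal(scale=0.01, size=(400, 25))  # initialized once, before training
lr, batch_size = 0.001, 100

losses = []
for i in range(0, len(X), batch_size):      # one epoch: 50 mini-batches
    Xb, yb = X[i:i + batch_size], y[i:i + batch_size]
    pred = Xb @ W
    losses.append(np.mean((pred - yb) ** 2))
    grad = Xb.T @ (pred - yb) / batch_size  # gradient of mean squared error
    W -= lr * grad                          # the same W is updated after every mini-batch
```

There is only one W the whole time; every mini-batch nudges its values, so `losses` trends downward across the epoch.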

@rmwkwok thank you one more time. You've helped me picture the whole training process much better. Can you recommend some sources to read about setting the weights?

This site comes with an interactive tool. Some of the content may be too advanced, since it is not covered in the MLS, but it should give you a feel for the challenge of neural network weight initialization. Donβt worry if you canβt understand everything; for some concepts you might need to finish the Deep Learning Specialization first.

In practice, we donβt set the weights ourselves by hand; we use initialization methods (some covered on that site) to do the job.
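As one concrete example, TensorFlow's Dense layers default to Glorot (Xavier) uniform initialization, which draws weights from a range scaled by the layer's fan-in and fan-out. A minimal numpy sketch of that formula (the function name here is my own):

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng=None):
    """Glorot/Xavier uniform: U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

W1 = glorot_uniform(400, 25)  # layer-1 weights with the shape from this thread
print(W1.shape)               # (400, 25)
```

Scaling the range this way keeps the variance of activations roughly stable from layer to layer, which is the main point the interactive site demonstrates.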