C2_W2_Assignment: weights in 4.3 Model representation

Hello!

In part 4.3 Model representation it is written that ‘W will be of dimension s_in × s_out’.
Therefore, for layer1 the shape of W1 is (400, 25).
Since X has the shape (5000, 400), does it mean that every node will get 400 randomly set weights, one for each of the 400 inputs? And will this procedure be repeated 5000 times for every node?

Hello @VeronikaS,

Please follow this flow:

  1. X has a shape of (5000, 400)
  2. each neuron in the layer that accepts X has a shape of (400, 1) for its weights
  3. If the layer has 25 neurons, the whole layer has a shape of (400, 25) for its weights, called W

I think your “node” means my “neuron”.

In point number 2, each neuron will randomly set 400 weights, as you said, and to have 25 neurons, we will randomly set 400 x 25 weights.

We do NOT repeat weight creation per sample (5000), but we do it per neuron (25).

(Note that we do not actually repeat weight creation. We create all 400 x 25 weights at once.)
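
To make the shapes concrete, here is a minimal NumPy sketch. The random data, the 0.01 scaling and the variable names are placeholders for illustration, not the assignment's actual code or initializer:

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_features, n_neurons = 5000, 400, 25

# Placeholder input: 5000 samples, 400 features each (not the real dataset).
X = rng.normal(size=(n_samples, n_features))

# All 400 x 25 weights are created in one call, not neuron by neuron.
W = rng.normal(size=(n_features, n_neurons)) * 0.01
b = np.zeros(n_neurons)

# One matrix product applies every neuron to every sample.
Z = X @ W + b

print(W.shape)  # (400, 25)
print(Z.shape)  # (5000, 25)

# Column j of W holds the 400 weights of neuron j.
w_first_neuron = W[:, 0]
print(np.allclose(X @ w_first_neuron + b[0], Z[:, 0]))  # True
```

The number of samples (5000) never appears in the shape of W, which is why the weights are not created per sample.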

Cheers,
Raymond

@rmwkwok thank you a lot for help!

Have I understood correctly that every training example (5000 in total) goes through this NN one by one (5000 times sequentially)? And does it give us 157 batches in every epoch?

No. We first group them into mini-batches. If the mini-batch size is 100, we group the 5000 samples into 5000/100 = 50 mini-batches.

Then we pass the whole mini-batch through the NN, instead of passing it sample by sample, because it is computationally faster to pass the whole mini-batch at once.
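
As a rough illustration, the sketch below compares passing one mini-batch at once with passing its samples one by one. It uses random placeholder data and a single weight matrix without bias or activation, so it is a simplification rather than the real network:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 400))         # placeholder data
W = rng.normal(size=(400, 25)) * 0.01    # placeholder weights

batch_size = 100
num_batches = X.shape[0] // batch_size   # 5000 / 100

first_batch = X[:batch_size]             # shape (100, 400)

# Whole mini-batch in one matrix product.
out_batch = first_batch @ W              # shape (100, 25)

# Same computation, one sample at a time (slower, same values).
out_loop = np.stack([x @ W for x in first_batch])

print(num_batches)                       # 50
print(out_batch.shape)                   # (100, 25)
print(np.allclose(out_batch, out_loop))  # True
```

Both routes give the same numbers; the batched product just lets the library do the work in one optimized call.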

I am trying to understand how the data is transformed during the training of the NN. If my line of thinking is correct:

  1. The input X of shape (5000, 400) is split into batches of shape (32, 400), because the training examples are grouped by batch size = 32 (‘The default size of a batch in Tensorflow is 32’).

Therefore every batch of the input X now has the shape (32, 400).

  2. We multiply this batch matrix by the weight matrix of shape (400, 25).

  3. We repeat step 2 for every batch?

@VeronikaS

Yes, you have captured the essential idea!
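
For reference, a small sketch of that flow with a batch size of 32. It assumes the last, smaller batch is kept (which is consistent with the 157 batches per epoch mentioned above) and uses zero-filled placeholder arrays, since only the shapes matter here:

```python
import math
import numpy as np

X = np.zeros((5000, 400))   # placeholder input, only the shape matters
W = np.zeros((400, 25))     # placeholder weights
batch_size = 32

num_batches = math.ceil(X.shape[0] / batch_size)

batch_shapes = []
for start in range(0, X.shape[0], batch_size):
    batch = X[start:start + batch_size]   # up to 32 rows
    out = batch @ W                       # (rows, 400) @ (400, 25)
    batch_shapes.append((batch.shape, out.shape))

print(num_batches)        # 157
print(batch_shapes[0])    # ((32, 400), (32, 25))
print(batch_shapes[-1])   # ((8, 400), (8, 25))
```

156 full batches cover 4992 samples, and the remaining 8 samples form the 157th batch.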

Cheers,
Raymond

@rmwkwok
as you’ve mentioned above:

Will the weights set for the first batch of data be the same for the following batches?

@VeronikaS

The weights stick to the model and not to the samples. Since we have one model, we have one and only one set of weights.

The weights are only initialized once. If you are training a model, then after the first mini-batch, those weights will get updated and their values get changed. Then after the second mini-batch, those weights will get changed again, and so on.
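
Here is a minimal sketch of that idea. It assumes a single linear layer, made-up targets `Y`, a squared-error loss and plain gradient descent, so it is much simpler than what TensorFlow does, but the pattern of "initialize once, update after every mini-batch" is the same:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 400))   # placeholder inputs
Y = rng.normal(size=(5000, 25))    # made-up targets, for illustration only

# The weights are created exactly once, before training starts.
W = rng.normal(size=(400, 25)) * 0.01
W_initial = W.copy()

batch_size, learning_rate = 100, 0.01
num_updates = 0

for start in range(0, X.shape[0], batch_size):
    xb = X[start:start + batch_size]
    yb = Y[start:start + batch_size]

    error = xb @ W - yb
    # Gradient of the halved mean squared error with respect to W.
    grad = xb.T @ error / len(xb)

    # The same W is updated in place; no new weights are created.
    W -= learning_rate * grad
    num_updates += 1

print(num_updates)                 # 50
print(W.shape)                     # (400, 25)
print(np.allclose(W, W_initial))   # False
```

After one pass over the data, W still has shape (400, 25), but its values have been changed 50 times, once per mini-batch.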

Raymond

@rmwkwok thank you once more. You have helped me picture the whole training process better. Can you recommend a source to read about setting the weights?

This site comes with an interactive tool. Some of its content is more advanced than what the MLS covers, but it should give you a feel for the challenges of neural network weight initialization. Don’t worry if you can’t understand everything; for some concepts, you might need to finish the Deep Learning Specialization first.

In practice, we don’t set the weights by hand; we use an initialization method (some are covered on that site) to do the job.
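
As one example of such a method, below is a sketch of Glorot (Xavier) uniform initialization written in NumPy. The helper name `glorot_uniform` is my own, and choosing this particular method for the example is an assumption; it is one common scheme, not necessarily the one the assignment relies on:

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Sample a (fan_in, fan_out) weight matrix from U(-limit, limit),
    where limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
W1 = glorot_uniform(400, 25, rng)   # layer 1: 400 inputs, 25 neurons

limit = np.sqrt(6.0 / (400 + 25))
print(W1.shape)                     # (400, 25)
print(np.abs(W1).max() <= limit)    # True
```

The range shrinks as the layer gets wider, which keeps the scale of the layer's outputs roughly stable at the start of training.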

Raymond