Parameters in a Neural Network

Hello everyone,
I am confused about the weights and biases in a neural network.


As in the above image, we use the linear equation w1,1 * x1 + w2,1 * x2 + b1 to get z1. I am confused about w1,2 and w2,2 for z2. Are their values the same as w1,1 and w2,1?

When I asked ChatGPT about this, it replied that they are not the same. The W matrix has the shape (n_neurons, n_features). Therefore, in W = [[w1,1 w1,2][w2,1 w2,2]], w1,1 and w1,2 are not the same: w1,1 is for z1 and has a different value than w1,2, which is for z2. But in the course video, the formula to update w1,2 is w1,2 = w1,1 + lr * x1 * w1 * a1 * (1 - a1) * (y - ŷ).


I am so confused about this. Could you please explain this to me?

From my understanding of your question, the weights form a matrix where each element is unique, so w_{1,1} and w_{1,2} are distinct. The update rule for each weight follows the same general formula. For example, for w_{1,1}, the gradient-descent update is:

w_{1,1} = w_{1,1} - \text{learning rate} \times \frac{\partial \text{Loss}}{\partial w_{1,1}}

The formulas you are looking at already incorporate the partial derivative with respect to each specific weight. So the weights really are different; the derivative has simply been worked out ahead of time and written as a plain equation.
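
If it helps, here is a minimal NumPy sketch of the idea (the layer size, input values, and the placeholder gradient are just assumptions for illustration, not the exact network from the course):

```python
import numpy as np

# Toy layer with 2 inputs and 2 neurons, using the (n_neurons, n_features)
# convention: row 0 of W holds the weights feeding z1, row 1 those feeding z2.
x = np.array([0.5, -1.0])        # inputs x1, x2
W = np.array([[0.10, 0.20],      # weights that produce z1
              [0.30, 0.40]])     # weights that produce z2 (independent values)
b = np.array([0.0, 0.0])         # biases b1, b2

z = W @ x + b                    # z1 and z2 come from different rows of W
a = 1.0 / (1.0 + np.exp(-z))     # sigmoid activation

# Gradient descent: the *same rule* applies to every weight,
# but each weight uses its own partial derivative dLoss/dw.
lr = 0.1
dLoss_dW = np.ones_like(W) * 0.05   # placeholder gradient, just for shape
W = W - lr * dLoss_dW               # every element is updated independently
```

Because each element of W is updated with its own gradient, the weights feeding z1 and the weights feeding z2 end up holding different values.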


The notation used in this course is not consistent with other DLAI courses.

There is a matrix of weights that connects each adjacent pair of layers.
Usually these would be noted as W1 (for the input-to-hidden matrix) and W2 (for the hidden-to-output matrix).

The size of each matrix can be either (outputs x inputs) or (inputs x outputs), depending on the local convention used in that assignment. The DLAI courses use either method, sometimes within the same assignment.

So "W1" in this example might be size (3 x 2), and W2 would be size (3 x 1).
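
For example, here is a small NumPy sketch of the two shape conventions (the 2-3-1 layer sizes are just an illustration, not necessarily the network from the lecture):

```python
import numpy as np

n_in, n_hidden, n_out = 2, 3, 1

# Convention A: (outputs x inputs) -- apply the layer as W @ x
W1_a = np.zeros((n_hidden, n_in))    # shape (3, 2)
W2_a = np.zeros((n_out, n_hidden))   # shape (1, 3)

# Convention B: (inputs x outputs) -- apply the layer as x @ W
W1_b = np.zeros((n_in, n_hidden))    # shape (2, 3)
W2_b = np.zeros((n_hidden, n_out))   # shape (3, 1)

x = np.zeros(n_in)
h_a = W1_a @ x        # hidden activations under convention A
h_b = x @ W1_b        # hidden activations under convention B
```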

If you venture into writing down multiple subscripts for each individual weight, you can rapidly become confused.
