Parameters in a Neural Network

Hello everyone,
I am confused about the weights and biases in a neural network.


As in the above image, we use the linear equation w1,1 * x1 + w2,1 * x2 + b1 to get z1. I am confused about w1,2 and w2,2 for z2. Are their values the same as w1,1 and w2,1?

When I asked ChatGPT about this, it replied that they are not the same. The W matrix has the shape (n_neurons, n_features). Therefore, in W = [[w1,1 w1,2][w2,1 w2,2]], w1,1 and w1,2 are not the same: w1,1 is for z1 and has a different value than w1,2, which is for z2. But in the course video, the formula to update w1,2 is w1,2 = w1,1 + lr * x1 * w1 * a1 * (1 - a1) * (y - ŷ).


I am so confused about this. Could you please explain this to me?

From my understanding of your question, the weights form a matrix where each element is unique, so w_{1,1} and w_{1,2} are distinct. The update rule for each weight follows the same general formula. For example, for w_{1,1}, the gradient-descent update is:

w_{1,1} = w_{1,1} - \text{learning rate} \times \frac{\partial \text{Loss}}{\partial w_{1,1}}

The formulas you are looking at already incorporate the partial derivative with respect to each specific weight. So the weights really are different; the derivative has simply been worked out ahead of time and written as a plain equation.
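
If it helps, here is a minimal NumPy sketch of the idea (the layer size, input values, and the placeholder gradient are just assumptions for illustration, not the exact network from the course):

```python
import numpy as np

# Toy layer with 2 inputs and 2 neurons, using the (n_neurons, n_features)
# convention: row 0 of W holds the weights feeding z1, row 1 those feeding z2.
x = np.array([0.5, -1.0])        # inputs x1, x2
W = np.array([[0.10, 0.20],      # weights that produce z1
              [0.30, 0.40]])     # weights that produce z2 (independent values)
b = np.array([0.0, 0.0])         # biases b1, b2

z = W @ x + b                    # z1 and z2 come from different rows of W
a = 1.0 / (1.0 + np.exp(-z))     # sigmoid activation

# Gradient descent: the *same rule* applies to every weight,
# but each weight uses its own partial derivative dLoss/dw.
lr = 0.1
dLoss_dW = np.ones_like(W) * 0.05   # placeholder gradient, just for shape
W = W - lr * dLoss_dW               # every element is updated independently
```

Because each element of W is updated with its own gradient, the weights feeding z1 and the weights feeding z2 end up holding different values.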


The notation used in this course is not consistent with other DLAI courses.

There is a matrix of weights that connects each adjacent pair of layers.
Usually these would be noted as W1 (for the input-to-hidden matrix) and W2 (for the hidden-to-output matrix).

The size of each matrix can be either (outputs x inputs) or (inputs x outputs), depending on the local convention used in that assignment. The DLAI courses use either method, sometimes within the same assignment.

So "W1" in this example might be size (3 x 2), and W2 would be size (3 x 1).
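
For example, here is a small NumPy sketch of the two shape conventions (the 2-3-1 layer sizes are just an illustration, not necessarily the network from the lecture):

```python
import numpy as np

n_in, n_hidden, n_out = 2, 3, 1

# Convention A: (outputs x inputs) -- apply the layer as W @ x
W1_a = np.zeros((n_hidden, n_in))    # shape (3, 2)
W2_a = np.zeros((n_out, n_hidden))   # shape (1, 3)

# Convention B: (inputs x outputs) -- apply the layer as x @ W
W1_b = np.zeros((n_in, n_hidden))    # shape (2, 3)
W2_b = np.zeros((n_hidden, n_out))   # shape (3, 1)

x = np.zeros(n_in)
h_a = W1_a @ x        # hidden activations under convention A
h_b = x @ W1_b        # hidden activations under convention B
```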

If you venture into writing down multiple subscripts for each individual weight, you can rapidly become confused.
