As in the image above, we use the linear equation w1,1 * x1 + w2,1 * x2 + b1 to get z1. I am confused about w1,2 and w2,2 for z2. Are their values the same as w1,1 and w2,1?

When I asked ChatGPT about this, it replied that they are not the same: the W matrix has shape (n_neurons, n_features), so in W = [[w1,1 w1,2][w2,1 w2,2]], w1,1 and w1,2 are different entries. w1,1 is for z1 and has a different value than w1,2, which is for z2. But in the course video, the formula to update w1,2 is w1,2 = w1,1 + lr * x1 * w1 * a1 * (1-a1) * (y - y^).

From my understanding of your question: the weights form a matrix where each element is unique, so w_{1,1} and w_{1,2} are distinct. The update rule for each weight follows the same general formula. For example, for w_{1,1}, the gradient-descent update is w_{1,1} = w_{1,1} - lr * ∂L/∂w_{1,1}.

So the formulas you are looking at incorporate the partial derivative with respect to each individual weight, and the weights are different. It's just that the derivative has already been worked out, so everything is given in plain equation form.
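As a concrete sketch (all numbers here are made up for illustration, using a 2-input, 2-hidden-unit sigmoid layer feeding one sigmoid output), this shows that every hidden weight w_{i,j} gets its own gradient, following the same shape as the course formula lr * x_i * v_j * a_j * (1 - a_j) * (y - y^):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values (not from the course)
x1, x2 = 0.5, -1.0
W = [[0.1, 0.4],            # W[i][j] is w_{i+1, j+1}: input i+1 -> hidden unit j+1
     [0.3, -0.2]]
b = [0.0, 0.0]
v = [0.7, -0.5]             # hidden-to-output weights
y = 1.0                     # target
lr = 0.1

# Forward pass: z_j uses column j of W, so w_{1,1} and w_{1,2} are separate numbers
z = [W[0][j] * x1 + W[1][j] * x2 + b[j] for j in range(2)]
a = [sigmoid(zj) for zj in z]
y_hat = sigmoid(v[0] * a[0] + v[1] * a[1])

# Backward pass: each weight is moved by its own partial derivative,
# W[i][j] += lr * x_i * v_j * a_j * (1 - a_j) * (y - y_hat)
x = [x1, x2]
for i in range(2):
    for j in range(2):
        W[i][j] += lr * x[i] * v[j] * a[j] * (1 - a[j]) * (y - y_hat)

print(W)  # w_{1,1} and w_{1,2} receive different updates because j differs
```

The double loop is the whole point: the update formula is one template, but it is evaluated once per weight, with that weight's own x_i, v_j, and a_j plugged in.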

The notation used in this course is not consistent with other DLAI courses.

There is a matrix of weights connecting each adjacent pair of layers.
Usually these would be noted as W1 (for the input-to-hidden matrix) and W2 (for the hidden-to-output matrix).

The size of each matrix can be either (outputs x inputs) or (inputs x outputs), depending on the local convention used in that assignment. The DLAI courses use both conventions - sometimes within the same assignment.

So "W1" in this example might be size (3 x 2), and W2 would be size (3 x 1).
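A quick way to see that the two conventions store the same network, just transposed (illustrative numbers, assuming 2 inputs and 3 hidden units as in the (3 x 2) example above):

```python
x = [1.0, 2.0]  # 2 inputs

# Convention A: W1 has shape (outputs x inputs) = (3 x 2); compute W1 @ x
W1_a = [[0.1, 0.2],
        [0.3, 0.4],
        [0.5, 0.6]]
h_a = [sum(W1_a[j][i] * x[i] for i in range(2)) for j in range(3)]

# Convention B: W1 has shape (inputs x outputs) = (2 x 3); compute x @ W1
# This is simply the transpose of W1_a
W1_b = [[0.1, 0.3, 0.5],
        [0.2, 0.4, 0.6]]
h_b = [sum(x[i] * W1_b[i][j] for i in range(2)) for j in range(3)]

print(h_a == h_b)  # True: same hidden activations, different storage convention
```

So when an assignment switches convention mid-stream, nothing about the network changes; only which index picks out a neuron versus a feature.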

If you venture into writing down multiple subscripts for each individual weight, you can rapidly become confused.