Can you please explain what the w_1^{[1]T}, w_2^{[1]T}, w_3^{[1]T} and w_4^{[1]T} matrices are? I am really confused.
It sounds like you are asking about what is shown in this screenshot from about 4:50 into the Week 3 lecture “Computing A Neural Network’s Output”:
What is happening there is that Prof Ng is starting with the individual equations for the output of each neuron, using the same format he used in the Logistic Regression case. So for example he shows:
z_1^{[1]} = w_1^{[1]T} \cdot x + b_1^{[1]}
Note that I’ve added one little “extra” there by making the dot product operation explicit between the w vector and the x vector.
So in that formulation, the vector w_1^{[1]} is the weight vector for the first neuron in the first layer (the subscript indexes the neuron and the superscript [1] indexes the layer). He formats w_1^{[1]} as a column vector, just as he did in the Logistic Regression case. It has dimension n_x x 1, where n_x is the number of input features (elements in each input x vector). x is also an n_x x 1 column vector, so in order for that dot product to work, we need to transpose the w vector: dotting 1 x n_x with n_x x 1 gives you a 1 x 1, i.e. a scalar, output.
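To make that concrete, here is a minimal numpy sketch of that single-neuron computation (the sizes and values below are just placeholders for illustration, not anything from the assignment):

import numpy as np

n_x = 3                        # number of input features (placeholder value)
x = np.random.randn(n_x, 1)    # one input sample as an n_x x 1 column vector
w1 = np.random.randn(n_x, 1)   # weight vector for neuron 1 of layer 1, also n_x x 1
b1 = 0.5                       # scalar bias for that neuron (arbitrary)

# w1.T is 1 x n_x, so w1.T @ x is 1 x 1, effectively a scalar
z1 = w1.T @ x + b1
print(z1.shape)                # (1, 1)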
Then what he does is put all the weight vectors for the output neurons of layer 1 together into a single matrix, so that we can compute all the outputs at once in a vectorized way. But he also wants to make it simpler, so that we don’t need any more transposes on the whole W^{[1]} weight matrix. So he uses the w vectors in their transposed form, so that they are now 1 x n_x row vectors. That means he can stack them up as the rows of the weight matrix W^{[1]}. That’s what he is showing in the lower left section of that diagram.
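As a rough sketch of that stacking step (again with placeholder sizes, and np.vstack as just one of several ways to do it):

import numpy as np

n_x, n1 = 3, 4                                             # placeholder sizes: 3 inputs, 4 neurons in layer 1
w_vectors = [np.random.randn(n_x, 1) for _ in range(n1)]   # one n_x x 1 column vector per neuron

# each w_i^{[1]T} is a 1 x n_x row; stacking the rows gives W^{[1]} of shape (n1, n_x)
W1 = np.vstack([w.T for w in w_vectors])
print(W1.shape)                                            # (4, 3)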
So you end up with W^{[1]} having the dimensions n^{[1]} x n_x, where n^{[1]} is the number of output neurons in layer 1. And because the w vectors from the upper right formulation are now the rows of W^{[1]}, the full vectorized forward propagation becomes:
Z^{[1]} = W^{[1]} \cdot X + b^{[1]}
Where X there is the full sample matrix with each column being one input vector. So if you have m samples, then Z^{[1]} is n^{[1]} x m. Then we apply the activation function “elementwise” to get A^{[1]} so it has the same dimensions as Z^{[1]}.
I didn’t mention the bias values there, but there is one scalar b value for each output neuron. In the final vectorized form, you also “stack” those into a column vector of dimension n^{[1]} x 1. So when you add that vector, it is “broadcast” and adds to each column of the output to compute the final Z^{[1]}.
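Putting those pieces together, a small sketch of the fully vectorized forward step for layer 1 might look like this (all the sizes here are made up, and tanh is just an example activation):

import numpy as np

n_x, n1, m = 3, 4, 5                 # placeholder sizes: 3 features, 4 neurons, 5 samples
X = np.random.randn(n_x, m)          # each column is one input sample
W1 = np.random.randn(n1, n_x)        # rows are the transposed per-neuron weight vectors
b1 = np.random.randn(n1, 1)          # one bias per neuron, stacked as a column vector

Z1 = W1 @ X + b1                     # b1 broadcasts across all m columns, giving (n1, m)
A1 = np.tanh(Z1)                     # elementwise activation, same shape as Z1
print(Z1.shape, A1.shape)            # (4, 5) (4, 5)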
Thanks a lot for the explanation @paulinpaloalto .
I have changed the code
W1 = np.random.rand(n_h, n_x) * 0.01
b1 = np.zeros((n_h, 1))
where n_h = n[1] and n_x = n[0]
So the initialization produces the following values for W1 when called from the test method:
[[0.00435995 0.00025926 0.00549662]
[0.00435322 0.00420368 0.00330335]
[0.00204649 0.00619271 0.00299655]
[0.00266827 0.00621134 0.00529142]
[0.0013458 0.00513578 0.0018444 ]]
with
n_x → 3
n_h → 5
n_y → 2
But the expected value is
W1 = [[-0.00416758 -0.00056267]
[-0.02136196 0.01640271]
[-0.01793436 -0.00841747]
[ 0.00502881 -0.01245288]]
which is a 4 x 2 array.
I am getting the below error now
~/work/release/W3A1/public_tests.py in initialize_parameters_test(target)
57 assert parameters["b2"].shape == expected_output["b2"].shape, f"Wrong shape for b2."
58
---> 59 assert np.allclose(parameters["W1"], expected_output["W1"]), "Wrong values for W1"
60 assert np.allclose(parameters["b1"], expected_output["b1"]), "Wrong values for b1"
61 assert np.allclose(parameters["W2"], expected_output["W2"]), "Wrong values for W2"
AssertionError: Wrong values for W1
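One observation on the output above: np.random.rand samples uniformly from [0, 1), so it can never produce the negative entries shown in the expected W1; those values look like they come from a Gaussian initializer. The sketch below is only a guess at what the test expects, with the seed and layer sizes as assumptions on my part rather than anything taken from the assignment:

import numpy as np

np.random.seed(2)                      # assumed seed, chosen only for illustration
n_x, n_h, n_y = 2, 4, 1                # assumed layer sizes matching a 4 x 2 W1

W1 = np.random.randn(n_h, n_x) * 0.01  # Gaussian values scaled to be small, can be negative
b1 = np.zeros((n_h, 1))
W2 = np.random.randn(n_y, n_h) * 0.01
b2 = np.zeros((n_y, 1))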
Hi Soumak,
I have already replied to this query on the other thread you raised.
Thank you so much for your explanation! It is clear to me now!