Is the reason we transpose a matrix that it orients the matrix correctly for the dot product, so that we get the intended result?

paulinpaloalto · December 19, 2022, 12:40am

How we define the data is all just a matter of choice. There is no intrinsic reason why the samples are the columns of X rather than the rows. It is just a choice that Prof Ng has made. If you took the original Stanford Machine Learning course, he did it differently there. Of course, lots of consequences follow from this choice.
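To make the "it's just a choice" point concrete, here is a minimal NumPy sketch. The sizes, the random data, and the variable names (`X_cols`, `X_rows`, and so on) are assumptions made purely for illustration; they are not taken from the course notebooks. It shows that the two layouts give the same numbers, and only the side on which the transpose appears changes.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, m = 3, 5                            # 3 features, 5 samples (arbitrary sizes)

w = rng.standard_normal((n_x, 1))        # weights as a column vector, shape (n_x, 1)
b = 0.5                                  # scalar bias

X_cols = rng.standard_normal((n_x, m))   # samples stored as columns, shape (n_x, m)
X_rows = X_cols.T                        # the same data with samples as rows, shape (m, n_x)

Z_cols = w.T @ X_cols + b                # samples-as-columns layout: result has shape (1, m)
Z_rows = X_rows @ w + b                  # samples-as-rows layout: result has shape (m, 1)

# Same values either way; one result is just the transpose of the other.
assert np.allclose(Z_cols, Z_rows.T)
```

With samples as columns, the transpose lands on w; with samples as rows, no transpose of w is needed, but the output comes out as a column instead of a row.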

People often ask why we have to transpose the weight vector w in Logistic Regression:

z = w^T \cdot x + b

Whereas when we get to full Neural Networks in Week 3, we no longer need to transpose W:

z = W \cdot x + b

The answer is that these are also choices that Prof Ng has made: he uses the convention that any standalone vector is a column vector. That applies to both w the weight vector and x the sample vector, so we need to transpose w in order for the dot product to work.

But when he defines the W matrices for neural networks, he chooses to stack the weights for each neuron as a row of W and then we don’t need the transpose.
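Here is a small sketch of both conventions side by side, assuming NumPy and made-up sizes. The names `neuron_weights` and `b_vec` are hypothetical and only serve the illustration. Each neuron's weight column vector is transposed once when it is stacked as a row of W, which is why no transpose shows up in the formula afterwards.

```python
import numpy as np

rng = np.random.default_rng(1)
n_x, n_h = 4, 3                          # 4 input features, 3 neurons (arbitrary sizes)

x = rng.standard_normal((n_x, 1))        # one sample as a column vector, shape (n_x, 1)

# Logistic regression: w is a column vector, so it must be transposed.
w = rng.standard_normal((n_x, 1))
b = 0.1
z = w.T @ x + b                          # (1, n_x) @ (n_x, 1) -> shape (1, 1)

# Neural network layer: each neuron's weights become one ROW of W.
neuron_weights = [rng.standard_normal((n_x, 1)) for _ in range(n_h)]
W = np.vstack([w_i.T for w_i in neuron_weights])   # shape (n_h, n_x)
b_vec = np.zeros((n_h, 1))               # one bias per neuron, zero here for simplicity
Z = W @ x + b_vec                        # (n_h, n_x) @ (n_x, 1) -> shape (n_h, 1)

# Row i of the result equals the per-neuron product w_i^T x.
for i, w_i in enumerate(neuron_weights):
    assert np.isclose(Z[i, 0], (w_i.T @ x).item())
```

Note that `w @ x` without the transpose would raise a shape error here, since (n_x, 1) and (n_x, 1) do not line up for matrix multiplication.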

