Transpose of the weight matrix

@dave_merit: The definition of the w weight vector in Logistic Regression and the W weight matrices in Neural Networks is completely arbitrary: Prof Ng gets to define them however he wants. What he chooses is the convention that all standalone vectors are column vectors, so w is a column vector. Given that he lays out the sample matrix X as $n_x \times m$, you then need a transpose on w to form the linear combination.

In the case of the W matrices, he chooses to orient them differently and the transpose is no longer required.
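
Here is a minimal numpy sketch (my own illustration, not taken from the course notebooks, and the sizes are made up) showing why the transpose appears for w but not for a layer matrix like W1:

```python
import numpy as np

n_x, m, n_h = 4, 10, 3          # illustrative sizes: features, samples, hidden units

# Logistic regression: w is a column vector (n_x, 1), X is (n_x, m)
w = np.random.randn(n_x, 1)
b = 0.0
X = np.random.randn(n_x, m)
Z = np.dot(w.T, X) + b          # (1, n_x) dot (n_x, m) -> (1, m); transpose needed

# Neural network layer: W1 is already laid out as (n_h, n_x), so no transpose
W1 = np.random.randn(n_h, n_x)
b1 = np.zeros((n_h, 1))
Z1 = np.dot(W1, X) + b1         # (n_h, n_x) dot (n_x, m) -> (n_h, m)

print(Z.shape, Z1.shape)        # (1, 10) (3, 10)
```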

To include LaTeX expressions, just bracket them with single dollar signs. This is covered on the FAQ Thread, q.v.

As to when to use the elementwise product versus the dot product, notice that Prof Ng is consistent in using * to indicate the elementwise product when he writes mathematical expressions. If he writes two operands adjacent with no explicit operator, he means true dot-product-style matrix multiplication. I think this latter choice is a bit unfortunate, but he’s the boss. For that reason I like to use the LaTeX \cdot operator, as in:

$Z = w^T \cdot X + b$

just to make things explicit.
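
For example, here is how the two operations translate into numpy (my own illustration of the notational point, not code from the assignments):

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[10., 20.], [30., 40.]])

elementwise = A * B         # Prof Ng's "*": elementwise (Hadamard) product
matmul = np.dot(A, B)       # adjacent operands / \cdot: true matrix multiplication
# matmul = A @ B            # equivalent operator form in numpy

print(elementwise)          # [[ 10.  40.]  [ 90. 160.]]
print(matmul)               # [[ 70. 100.]  [150. 220.]]
```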

As to the annoyance of switching from MATLAB to Python + numpy, I feel your pain. MATLAB is beautiful, and the polymorphism with which it handles vectors and matrices was designed in from the start. Python wasn’t originally designed for vectorized calculations, so everything to do with numpy feels like a bag on the side of a kludge compared to the elegance of MATLAB. But the world of ML/DL/AI has made this decision for us and we just have to deal with it. Sorry!
