Dot product vs element-wise multiplication of arrays

It sounds like you are saying that you don’t understand what a “dot product” is. Basic knowledge of linear algebra is a prerequisite for this course. If you don’t understand how normal matrix multiplication works (dot product style), you should go take one of the good online Linear Algebra courses to learn that. Here’s a thread that discusses this and gives some links.

Let’s forget Python for a moment and just talk about the underlying math. If I have two vectors v and w, each with n elements, then this is what the dot product of v and w means:

v \cdot w = \displaystyle \sum_{i = 1}^{n} v_i w_i

So you can see that it involves first the “elementwise product” of the corresponding elements of the two vectors, followed by the sum of those products. That’s what Tom and I mean when we say the dot product involves a sum: it’s both a product and a sum. That’s also why a single dot product operation is more efficient than doing the elementwise multiply and the sum as separate steps, especially if you have a CPU with vector instructions (as essentially any modern CPU does).
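If it helps to see that in NumPy terms, here’s a minimal sketch of the equivalence (the values in v and w are just made up for illustration):

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

elementwise = v * w           # elementwise product: [4., 10., 18.]
dot_product = np.dot(v, w)    # product then sum: 4 + 10 + 18 = 32.0

print(np.isclose(dot_product, np.sum(elementwise)))  # True
```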

Of course that is the simplest case: two vectors. When you do a full matrix multiply, each position in the output matrix is the dot product of one row of the first operand with one column of the second operand. So if A is n x k and B is k x m, then the product A \cdot B has dimensions n x m.
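Here’s a quick NumPy sketch of that dimension rule; the shapes are arbitrary example values:

```python
import numpy as np

A = np.random.randn(2, 3)   # n x k with n = 2, k = 3
B = np.random.randn(3, 4)   # k x m with k = 3, m = 4

# Each C[i, j] is the dot product of row i of A with column j of B
C = np.dot(A, B)
print(C.shape)              # (2, 4), i.e. n x m
```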

In the particular case of the linear activation calculation for logistic regression:

Z = w^T \cdot X + b

The dot product there is actually a matrix multiply: the first operand w^T is a 1 x n_x row vector (because w is an n_x x 1 column vector and we transpose it), and X has dimensions n_x x m. So the result has dimensions 1 x m.
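Here’s that shape bookkeeping as a small NumPy sketch; the sizes n_x = 4 and m = 5 are just made-up examples:

```python
import numpy as np

n_x, m = 4, 5                  # 4 features, 5 samples (illustrative)
w = np.random.randn(n_x, 1)    # weights: n_x x 1 column vector
X = np.random.randn(n_x, m)    # inputs: one column per sample
b = 0.1                        # scalar bias, broadcast across all m outputs

Z = np.dot(w.T, X) + b         # (1 x n_x) dot (n_x x m) -> 1 x m
print(Z.shape)                 # (1, 5)
```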