Dear sir, in the lectures sir said that **np.dot(w,x)+b** would give us the regression model, but since these arrays are not compatible for matrix multiplication (both are 1×3), how does that work? Should it not be **w*x**, as that would perform element-wise multiplication? Does it have something to do with array broadcasting? Also, in the previous course sir told us that we would get the hypothesis function by multiplying the transpose of the parameter vector with the variable vector (theta(transpose)*x), which was mathematically correct.

Please help me with this.

Hi, @Gopesh_Yadav!

The basic math behind neural networks (or, specifically, regression models in this case) is the product of the input (or the previous layer's output) and the layer weights, plus a bias term.

\hat{y} = \vec{w} \cdot \vec{x} + b

or, treating \mathbf{w} and \mathbf{x} as column vectors,

\hat{y} = \mathbf{w}^\top \mathbf{x} + b
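For instance, here is a minimal NumPy sketch of the single-sample case (the numbers are made up for illustration):

```python
import numpy as np

w = np.array([0.5, -1.2, 2.0])   # weights, shape (3,)
x = np.array([1.0, 2.0, 3.0])    # one sample's features, shape (3,)
b = 4.0                          # bias, a scalar

# Both w and x are 1-D arrays, so np.dot computes their inner product
# (a scalar) rather than attempting a matrix multiplication.
y_hat = np.dot(w, x) + b
print(y_hat)  # 0.5*1.0 + (-1.2)*2.0 + 2.0*3.0 + 4.0 = 8.1
```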

Hello @Gopesh_Yadav, thank you for the question!

Let’s focus on the maths and begin with some definitions:

\vec{w} = \begin{bmatrix} w_1 & w_2 \end{bmatrix} is a **row vector** of 2 weights

X = \begin{bmatrix} \vec{x}^{(1)} \\ \vec{x}^{(2)} \\ \vec{x}^{(3)} \end{bmatrix} is a **matrix** of 3 samples, where each sample is like, for example,

\vec{x}^{(1)} = \begin{bmatrix} x_1^{(1)} & x_2^{(1)} \end{bmatrix}, which is a **row vector** of 2 features.

When you **dot** 2 vectors, it is **not matrix multiplication**, so it is valid for us to write \vec{w} \cdot \vec{x}^{(1)} = w_1x_1^{(1)} + w_2x_2^{(1)}.

Multiplication between a matrix and a vector **is matrix multiplication**: here the vector is treated as a matrix, so, as you said, we need the shapes to match, which means we compute X \vec{w}^T. In this case, we get

X \vec{w}^T = \begin{bmatrix} \vec{x}^{(1)} \\ \vec{x}^{(2)} \\ \vec{x}^{(3)} \end{bmatrix} \vec{w}^T = \begin{bmatrix} \vec{x}^{(1)} \vec{w}^T \\ \vec{x}^{(2)} \vec{w}^T\\ \vec{x}^{(3)} \vec{w}^T\end{bmatrix} = \begin{bmatrix} w_1x_1^{(1)} + w_2x_2^{(1)} \\ w_1x_1^{(2)} + w_2x_2^{(2)} \\ w_1x_1^{(3)} + w_2x_2^{(3)} \end{bmatrix}

Here, for example, \vec{x}^{(1)} and \vec{w}^T are, respectively, identified as a row matrix and the transpose of another row matrix, so \vec{x}^{(1)} \vec{w}^T is a matrix multiplication of two matrices.
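As a quick check in NumPy (hypothetical numbers, matching the shapes above: 3 samples, 2 features):

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])      # 3 samples x 2 features, shape (3, 2)
w = np.array([[0.5, -1.0]])     # row vector of 2 weights, shape (1, 2)

# Proper matrix multiplication: shapes (3, 2) @ (2, 1) -> (3, 1)
print(X @ w.T)
# [[-1.5]
#  [-2.5]
#  [-3.5]]
```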

If you want to ask a further question about a specific moment in a lecture video, please include the name of the video and the timestamp.

Cheers!

Raymond

Thanks for replying, Raymond,

Actually, that is exactly what my doubt was. I do understand that, in mathematical terms, **w.x** would give us the dot product of the two row vectors, but in Python, np.dot(w,x) is used for matrix multiplication, isn't it? I read in one of the quizzes that for element-wise multiplication of two matrices (or vectors) we use the **'*' operator**?

Please help me with this.

I am referring to the vectorization video of the second week in course 1 of the specialization.

Thanks

Hello Gopesh,

Let me quote the doc for `np.dot`; it distinguishes a dot product (inner product) from matrix multiplication by examining the shapes of the input arrays:

- If both `a` and `b` are 1-D arrays, it is inner product of vectors (without complex conjugation).
- If both `a` and `b` are 2-D arrays, it is matrix multiplication, but using `matmul` or `a @ b` is preferred.
- If either `a` or `b` is 0-D (scalar), it is equivalent to `multiply` and using `numpy.multiply(a, b)` or `a * b` is preferred.
- If `a` is an N-D array and `b` is a 1-D array, it is a sum product over the last axis of `a` and `b`.
- If `a` is an N-D array and `b` is an M-D array (where `M >= 2`), …
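Here is a small sketch of those cases (the arrays are invented just to show the shapes):

```python
import numpy as np

# 1-D and 1-D -> inner product (a scalar)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.dot(a, b))       # 32, i.e. 1*4 + 2*5 + 3*6

# 2-D and 2-D -> matrix multiplication (a @ b preferred)
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])
print(np.dot(A, B))       # [[19 22]
                          #  [43 50]]

# scalar and array -> element-wise multiply (2 * a preferred)
print(np.dot(2, a))       # [2 4 6]
```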

Yes, and according to the doc for `np.multiply`, it says

> Multiply arguments element-wise.

and

> Equivalent to `x1 * x2` in terms of array broadcasting.
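So, a quick comparison of the two operations (made-up vectors):

```python
import numpy as np

w = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 5.0, 6.0])

print(w * x)         # [ 4. 10. 18.] -> element-wise, same as np.multiply(w, x)
print(np.dot(w, x))  # 32.0          -> inner product: 4.0 + 10.0 + 18.0
```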

Let me know if you still have doubts.

Raymond

Oh! Thanks for your explanation.

Cheers!

Gopesh, you’re welcome!