Hello everyone,

I recently started Neural Networks and Deep Learning and I have a question about this specific equation:

z = w.T * X + b

In the first exercise we flattened X into a matrix of shape (features, examples) and initialized w as a matrix of zeros with shape (X.shape[0], 1), i.e. (number of features, 1).

Now my question is: why do we apply w.T? Weren't w and X already compatible for multiplication, since both had the same number of rows?

I hope my message gets through, as English is not my first language.

Have a great day!

Hello @Bruno_Catano_Arellan,

You said,

- X: (features, examples)
- w: (Number of features, 1)

Right? In this case, we need w.T because we can only compute \mathbf{A}_{m \times n} \times \mathbf{B}_{n \times k}. Note how the inner dimension n must be shared by the two matrices: w.T has shape (1, features) and X has shape (features, examples), so the product is defined, whereas w itself, with shape (features, 1), is not compatible with X.
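This is easy to verify with a short NumPy sketch (the sizes below are hypothetical, just to make the shapes concrete):

```python
import numpy as np

n_x, m = 4, 3                  # hypothetical: 4 features, 3 examples
X = np.random.randn(n_x, m)    # shape (features, examples) = (4, 3)
w = np.zeros((n_x, 1))         # shape (features, 1) = (4, 1)
b = 0.0

# w @ X would fail: inner dimensions (1 and 4) do not match.
# w.T has shape (1, 4), which is compatible with X's (4, 3):
z = np.dot(w.T, X) + b

print(z.shape)  # (1, 3): one value of z per example
```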

Cheers,

Raymond


It is just a convention that Prof Ng chooses to use: all standalone vectors are column vectors, so he formats w as n_x x 1. That requires the transpose in order for the dot product to work with X, which is n_x x m, where m is the number of samples.

Note that when we get to Week 3 and full Neural Networks, the weights will become matrices and will be oriented such that the transpose will no longer be required.
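To illustrate that Week 3 convention, here is a minimal sketch (the layer sizes are hypothetical): the weight matrix W1 is stored as (units in this layer, units in the previous layer), so it multiplies X directly with no transpose.

```python
import numpy as np

n_x, n_h, m = 4, 5, 3            # hypothetical: 4 inputs, 5 hidden units, 3 examples
X = np.random.randn(n_x, m)      # (n_x, m)
W1 = np.random.randn(n_h, n_x)   # weights stored as (n_h, n_x)
b1 = np.zeros((n_h, 1))

# W1 is already oriented correctly, so no transpose is needed:
Z1 = np.dot(W1, X) + b1

print(Z1.shape)  # (5, 3): one column of hidden-layer inputs per example
```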

Also note that you used * in your formula, but in Prof Ng's convention that means "elementwise" multiplication. He writes the dot product with no explicit operator:

Z = w^T X + b

Although I think it’s clearer to use the LaTeX “cdot” operator:

Z = w^T \cdot X + b
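The distinction matters in NumPy too: `np.dot` (or `@`) is the dot product, while `*` is elementwise and broadcasts, giving a different result and a different shape. A small sketch with made-up numbers:

```python
import numpy as np

w = np.array([[1.0], [2.0]])   # (2, 1)
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])     # (2, 2)

# Dot product: (1, 2) times (2, 2) -> (1, 2)
print(np.dot(w.T, X))   # [[7. 10.]]

# Elementwise "*": broadcasts w.T (1, 2) across the rows of X (2, 2),
# producing a (2, 2) result -- a completely different operation.
print(w.T * X)          # [[1. 4.]
                        #  [3. 8.]]
```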
