Shouldn't we transpose w when taking the dot product with a_in, to respect matrix multiplication rules?

Hi, the code for the my_dense function is:

    import numpy as np

    def my_dense(a_in, W, b):
        units = W.shape[1]              # number of units = number of columns of W
        a_out = np.zeros(units)
        for j in range(units):
            w = W[:, j]                 # weights for unit j
            z = np.dot(w, a_in) + b[j]
            a_out[j] = g(z)             # g is the activation (sigmoid) defined earlier in the lab
        return a_out

Matrix multiplication requires the number of columns in the first matrix to equal the number of rows in the second. Shouldn't the w in np.dot(w, a_in) be transposed for this product to be valid? Or is this handled in the code somehow?

Please help!

Hi @sishita ,
Regarding this, which course is this from?
In any case, transposing may not be required if the matrix is arranged in a way that it fits the rules.

@lukmanaj yes, that is what I was thinking. It depends on how you stack the weights (horizontally or vertically?), and not every teacher or book does it the same way.

In the courses here we don't have to, because the shapes already line up.


This is from optional lab #2 of week 1 in course #2, “Advanced Learning Algorithms”. The shape of W is (m, n) and a_in is a NumPy array of size m. Extracting column j from W yields a column vector with m rows, which, per my understanding, should be transposed before taking the dot product with a_in of size m. Any explanation would be much appreciated!

@lukmanaj I don’t have access to this lab-- maybe you do?

In that case, I have adjusted the details using the pencil icon to reflect the course and the week number.

Also, to explain the code properly: it is arranged so that you only ever multiply vectors.

  • W is a weight matrix with shape (input_units, output_units)
  • W[:, j] selects the j-th column of W, which is a vector with shape (input_units,)
  • a_in is an input vector with shape (input_units,)
So when we do np.dot(w, a_in), we actually compute the sum of the element-wise multiplications of the two vectors. The result is a scalar (a single value), which is the intended behavior for each unit in the output. This value is then assigned with a_out[j] = g(z). In the end, we compute this for every unit and get the complete a_out, which was initialized as an array of zeros.

In this case we do not need to transpose, because we never perform a matrix multiplication, only vector dot products.

The code may not be optimized, but it does the job. A quick numeric sketch is below.

Hope this clears up your concern.
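
To make the shapes concrete, here is a minimal sketch of one loop iteration. The numbers, the (3, 2) weight shape, and the sigmoid standing in for g are illustrative assumptions, not the lab's actual values:

```python
import numpy as np

def g(z):
    # Illustrative activation: a sigmoid, standing in for the g defined in the lab
    return 1 / (1 + np.exp(-z))

# Toy example: 3 inputs, 2 units (arbitrary values)
W = np.array([[1.0, -2.0],
              [0.5,  3.0],
              [2.0,  0.1]])        # shape (3, 2) = (input_units, output_units)
b = np.array([0.1, -0.3])          # shape (2,)
a_in = np.array([0.2, 0.4, 0.6])   # shape (3,)

w = W[:, 0]                        # 1-D array of shape (3,), not a (3, 1) column matrix
z = np.dot(w, a_in) + b[0]         # dot of two 1-D arrays -> a single scalar
z_check = np.sum(w * a_in) + b[0]  # same value, written as element-wise product then sum

print(w.shape, z, z_check)         # (3,) and two identical scalars
print(g(z))                        # activation of unit 0
```

Because both w and a_in are 1-D, there is no row/column orientation to worry about, which is why no transpose is needed.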

Nope, I don't have access either. But I saved my notes from when I took the course.

Thank you so much! I didn't realize that, for vectors, np.dot is just the sum of element-wise multiplications. I was mistaking it for matrix multiplication. Really appreciate it!

You’re welcome! I’m glad the explanation helped clarify things.

To add a bit more context:

  • np.dot can indeed perform different operations depending on the dimensions of its inputs (see the short sketch after this list):
    • For 1D arrays (vectors): It computes the dot product (i.e., the sum of element-wise multiplications), resulting in a scalar.
    • For 2D arrays (matrices): It performs standard matrix multiplication.
    • For higher-dimensional arrays: It generalizes to tensor dot products.
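
Here is the short sketch, using arbitrary toy arrays (nothing from the lab), showing the first two cases side by side:

```python
import numpy as np

# 1-D x 1-D: np.dot returns a scalar (sum of element-wise products)
v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([4.0, 5.0, 6.0])
print(np.dot(v1, v2))         # 1*4 + 2*5 + 3*6 = 32.0

# 2-D x 2-D: np.dot performs standard matrix multiplication
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])     # shape (2, 2)
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])     # shape (2, 2)
print(np.dot(A, B))            # shape (2, 2): [[19. 22.] [43. 50.]]
```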

So, you were partially correct in thinking it could be matrix multiplication; it just depends on the dimensions of the input arrays! Thanks for the nuanced question, and feel free to ask if you have more questions. :blush:

Ah I see! For the 2nd & 3rd bullets where it is doing matrix multiplication, is there a difference between using np.dot vs np.matmul?

For the 2nd and 3rd bullets, we may need to transpose if the inner dimensions don't match (the number of columns of the first matrix must equal the number of rows of the second). If they already match, there is no need to transpose.
np.matmul is specifically for matrix multiplication, while np.dot is more general purpose and behaves according to the dimensionality of the arrays passed to it, as explained above. For ordinary 2-D matrix multiplication there is not much difference between them that I know of.

Makes sense, thank you for the explanation! :slight_smile:
