General implementation of forward propagation - shape of W

kc_43 · August 9, 2023, 10:08pm

Hi Machine Learning Specialization community,
It’s not clear to me why W is intentionally stacked as a (2,3) shape and not (3,2). I thought it was due to matrix multiplication, where the number of columns of the first matrix must equal the number of rows of the second matrix. However, w is pulled out of W as a 1-D vector for np.dot in the dense function, so it seems like the (2,3) shape for W is not necessary(?) Not sure if I’m thinking about this correctly. Any thoughts or insights would be appreciated!

TMosh · August 9, 2023, 10:21pm

Which assignment or lecture are you looking at? Please be specific.

kc_43 · August 9, 2023, 10:29pm

Hi TMosh,

Thank you for the quick response. The specific lecture I am referring to is the “General implementation of forward propagation” lecture (Path to lecture: Advanced Learning Algorithms > Week 1 > General implementation of forward propagation). Please let me know if I can further clarify.

TMosh · August 9, 2023, 11:22pm

What is the time mark within that lecture video?

TMosh · August 9, 2023, 11:30pm

Assuming it is around time mark 5:40.

There are a few things going on here.

First, in this example, Andrew has assigned the size of W as:
rows: the number of inputs to the layer
columns: the number of outputs from the layer

So, in this example, for W1 there are two input units, and three hidden layer units, so the size is (2 x 3).
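As a small sketch of that layout (the numeric values below are made up for illustration, not taken from the lecture), W.shape returns (rows, columns), so W.shape[0] is the number of inputs and W.shape[1] is the number of units:

```python
import numpy as np

# Illustrative values only: two inputs (rows), three hidden units (columns).
W = np.array([[1.0, -3.0, 5.0],
              [2.0,  4.0, -6.0]])

n_inputs = W.shape[0]    # number of rows, 2
n_units = W.shape[1]     # number of columns, 3
w_first_unit = W[:, 0]   # column 0 holds the weights of the first unit
```

With this convention, each column of W is the weight vector of one unit, which is why the loop in the lecture runs over W.shape[1].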

Note that this convention for the format of a weight matrix is not universal or consistent. You can just as easily reverse the two. When it comes to matrices, transpositions are your friend.
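To illustrate that the two conventions are interchangeable, here is a minimal sketch (values and the names W_rows, z_cols, z_rows are assumptions for this example). Reading columns from a (2,3) matrix and reading rows from its (3,2) transpose give the same per-unit results:

```python
import numpy as np

a = np.array([-2.0, 4.0])              # illustrative input with two features
W = np.array([[1.0, -3.0, 5.0],
              [2.0,  4.0, -6.0]])      # (inputs, units) = (2, 3)
W_rows = W.T                           # (units, inputs) = (3, 2)

# Column convention: the weights of unit j are W[:, j].
z_cols = np.array([np.dot(W[:, j], a) for j in range(W.shape[1])])

# Row convention: the weights of unit j are W_rows[j, :].
z_rows = np.array([np.dot(W_rows[j, :], a) for j in range(W_rows.shape[0])])
```

Both loops produce the same three values; only the indexing and the shape index used for the loop bound change.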

Now, given that the implementation of “dense()” uses a for-loop over the hidden layer units, and then uses np.dot() to compute a vector product (of w and a), you don’t strictly need W to be a matrix. You could instead have three separate ‘w’ vectors.

However, this gets confusing and isn’t very efficient or expandable to other sizes of NN.
It’s a more general solution if you have one W matrix for each layer.
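For reference, a loop-based dense layer along these lines might look like the sketch below. It follows the general pattern described above, but the sigmoid helper, the argument names, and the sample values are assumptions for illustration rather than the exact lecture code:

```python
import numpy as np

def sigmoid(z):
    # Logistic activation, used here only as an example choice for g.
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_in, W, b, g=sigmoid):
    # W has shape (inputs, units); one column per unit.
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        w = W[:, j]                    # 1-D weight vector of unit j
        z = np.dot(w, a_in) + b[j]
        a_out[j] = g(z)
    return a_out

# Illustrative values only.
a_in = np.array([-2.0, 4.0])
W = np.array([[1.0, -3.0, 5.0],
              [2.0,  4.0, -6.0]])
b = np.array([-1.0, 1.0, 2.0])
a_out = dense(a_in, W, b)
```

Because the same function works for any (inputs, units) shape, one W per layer keeps the code general, even though each unit only ever uses its own column.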

Now given that W is a matrix, you can use np.dot() to compute the product of W and a, and avoid the inefficient for-loop entirely.
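A minimal sketch of that vectorized form, assuming the (inputs, units) layout and a 1-D input vector (the function name dense_vectorized and the sample values are made up for this example). With W of shape (2,3), the product is written as a times W, or equivalently W transposed times a:

```python
import numpy as np

def sigmoid(z):
    # Logistic activation, used here only as an example choice for g.
    return 1.0 / (1.0 + np.exp(-z))

def dense_vectorized(a_in, W, b, g=sigmoid):
    # a_in: (inputs,), W: (inputs, units), b: (units,) -> result: (units,)
    return g(np.dot(a_in, W) + b)

# Illustrative values only.
a_in = np.array([-2.0, 4.0])
W = np.array([[1.0, -3.0, 5.0],
              [2.0,  4.0, -6.0]])
b = np.array([-1.0, 1.0, 2.0])

a_out = dense_vectorized(a_in, W, b)

# Same computation written with the transposed layout.
a_out_t = sigmoid(np.dot(W.T, a_in) + b)
```

Here the shapes do have to line up: the length of a_in must match the number of rows of W, which is where the matrix-multiplication rule from the original question applies.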

kc_43 · August 10, 2023, 2:52am

Hi TMosh,

Thank you for the clarification. Appreciate the prompt feedback!

Christina_Fan · February 17, 2024, 1:23am

Hi Tom,

Further to your explanation in the above time mark 5:40, could you please help me to understand the following:

Why is W stacked and read by column rather than by row, when the previous slide showed w1_1, w1_2 and w1_3 as row vectors? Can W be stacked and read by row, e.g. as shown in pink in my second screenshot, so that the for loop would use w = W[i, :]?

I understand W is a 2 by 3 matrix. In W.shape[1], what does .shape[1] mean here, and why is W.shape[1] equal to 3?

Many thanks
Christina

TMosh · February 17, 2024, 1:43am

That matrix is formatted so that the features are in the rows, and the examples are in the columns.

It’s backward from nearly every other assignment in the course.

Christina_Fan · February 17, 2024, 2:20am

Thank you for the quick reply as always.

Sorry, I don’t fully understand your response. Would you please address my questions one by one so that I can better understand?

My question is related to the video “General implementation of forward propagation” in NumPy (I’m not referring to any assignment here).

Thank you
Christina

TMosh · February 17, 2024, 2:42am

I think I did answer your question, because the assignment has the same inconsistency as the lecture.
