What's the intuition of defining Weight matrix with features as column vector in the numpy implementation?

I am currently going through the Week 1 material of the 2nd course in the Machine Learning Specialization, and this is the first time I have come across a W matrix where each unit's parameters (w1_1, w1_2, w1_3, ...) are stored as a column vector.

For example, in the CoffeeRoastingNumPy Optional Lab, W was defined as:

```python
W = np.array([[-8.93,  0.29, 12.9 ],
              [-0.1,  -7.32, 10.81]])
units = W.shape[1]
```

What’s the intuition for this ?

Instead, why not follow the same structure as in Course 1? Like:

```python
W1 = np.array([[-8.93, -0.1 ],
               [ 0.29, -7.32],
               [12.9,  10.81]])
units = W1.shape[0]
```

In this example, the training set X is:

```python
X = np.array([[200, 13.9],
              [200, 17  ]])
```

FYI: I tried out a row-vector implementation for W, and it gave the same results. So I just want to understand whether there is a good reason for using one approach over the other.

Follow-up [Minor]:
I need help with an error, as I am new to the Python, NumPy, and TensorFlow libraries.
When I run the same code module from the Optional Lab on my personal laptop, I get a "DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated" (screenshot attached). Nothing I found online made much sense to me, so I would appreciate some help here.

Thanks a lot !! Cheers

Hi @ismareth

For the matrix shape: the lab stores W with shape (num_inputs, num_units), so each column holds the weights of one unit. With the training examples stacked as rows of X (shape (m, num_inputs)), the whole batch can be computed in one matrix product, Z = X @ W + b, which is efficient and matches what many ML frameworks do internally. Both layouts work; the transposed layout just requires a transpose somewhere in the product.
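A minimal sketch comparing the two layouts, using the W and X values from the lab (the bias values here are assumed for illustration):

```python
import numpy as np

# Columns-as-units layout from the lab: shape (num_inputs, num_units).
W = np.array([[-8.93,  0.29, 12.9 ],
              [-0.1,  -7.32, 10.81]])
b = np.array([-9.82, -9.28, 0.96])   # assumed bias values, one per unit

# A batch of examples, one example per row: shape (m, num_inputs).
X = np.array([[200.0, 13.9],
              [200.0, 17.0]])

# Vectorized forward step for the whole batch in one product.
Z = X @ W + b                        # shape (m, num_units)

# The transposed layout (units as rows) produces the same numbers;
# it just needs a transpose inside the product.
W_rows = W.T                         # shape (num_units, num_inputs)
Z_alt = X @ W_rows.T + b

print(np.allclose(Z, Z_alt))  # True
```

Either orientation works; the choice only moves where the transpose appears in the forward-pass expression.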

As for the deprecation warning: you’re likely assigning a 1-element NumPy array to a single array slot (`p[i,0] = ...`). Recent NumPy versions deprecate that implicit array-to-scalar conversion, so extract the scalar explicitly: replace `p[i,0] = my_sequential(...)` with `p[i,0] = my_sequential(...).item()` to fix it.
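A minimal sketch of the warning and the fix, with a hypothetical 1-element array standing in for the model's output:

```python
import numpy as np

p = np.zeros((2, 1))
pred = np.array([0.97])  # 1-element array, as a model call might return

# On recent NumPy, this implicit array-to-scalar conversion triggers the
# DeprecationWarning (and will eventually become an error):
# p[0, 0] = pred

# Extract the scalar explicitly instead:
p[0, 0] = pred.item()
print(p[0, 0])  # 0.97
```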

Hope it helps! Feel free to ask if you need further assistance.

There is no universal standard in the ML industry for the orientation of either the X or W matrices. You’ll find both possible orientations used in equal measure.


Thank you for your response.
Your solution to the deprecation warning worked for me as well.

However, I didn’t quite understand your explanation of how features-as-columns “simplifies broadcasting and batch operations”. I hope it becomes clearer in next week’s lectures.

You’re welcome! Think of it this way: storing one unit’s weights per column of W lets you process a whole batch of examples (one per row of X) with a single Z = X @ W + b, and the bias vector b broadcasts across all rows automatically. It’s mainly about efficient computation across batches.
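To make the broadcasting point concrete, here is a small sketch with hypothetical sizes (2 inputs, 3 units, batch of 4) showing that the single matrix product matches an explicit double loop over examples and units:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 2))   # one example per row
W = rng.standard_normal((2, 3))   # one unit per column
b = rng.standard_normal(3)        # one bias per unit

Z = X @ W + b                     # b broadcasts across all 4 rows

# Same result computed one example and one unit at a time:
Z_loop = np.empty((4, 3))
for i in range(4):
    for j in range(3):
        Z_loop[i, j] = X[i] @ W[:, j] + b[j]

print(np.allclose(Z, Z_loop))  # True
```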