Question on Week 3, 2.3 Exercise 4

Hello. I need help with the forward propagation function. The code is not running correctly in nn_model: the cost printed out is the same on every iteration, and the error message points to an issue with the following line:

Z = W @ X + b

I understand that Z must be an array of shape 1 x m (m = 30 in the first case and m = 1460 in the second). In the first case, W has shape 1 x 1, since n_y = 1, and X has shape 1 x 30. After the matrix multiplication and the addition of the vector b, Z should be an array with m columns. However, I am getting an error message about mismatched sizes when running nn_model. Can someone please help me out?

Error Message: ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1)
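
For reference, here is how I understand the shapes are supposed to line up (a minimal sketch with random data; the error seems to say that X has 2 rows while W only has 1 column):

import numpy as np

# Intended shapes: W is (n_y, n_x), X is (n_x, m), b is (n_y, 1).
# b broadcasts across the m columns of the product.
n_y, n_x, m = 1, 1, 30
W = np.random.randn(n_y, n_x)
X = np.random.randn(n_x, m)
b = np.zeros((n_y, 1))

Z = W @ X + b
print(Z.shape)  # (1, 30): one prediction per example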


Two suggestions:

  1. I recommend you avoid the @ operator and instead use np.dot() or np.matmul() directly. That makes it clearer which operation you're using (see the sketch after this list).
  2. You often have to transpose one of the arguments in order to get the sizes to match.
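
For example, with made-up shapes (for 2-D arrays the two forms compute the same product):

import numpy as np

W = np.random.randn(1, 2)  # (n_y, n_x)
X = np.random.randn(2, 5)  # (n_x, m)
b = np.zeros((1, 1))

Z1 = W @ X + b         # operator form
Z2 = np.dot(W, X) + b  # same result for 2-D arrays, but the intent is explicit
assert np.allclose(Z1, Z2)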

I will use np.dot(), thank you.
Why should I use a transpose? I was thinking of transposing W, but it already has shape
n_y x n_x, i.e. 1 x 1 and 1 x 2 in the two cases.


There are lots of different test cases; those sizes are only one of the sets you have to work with.

True, but in these cases transposing does not work, whether I transpose W or X.


Can you show some evidence that they don’t work?


ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
----> 1 w3_unittest.test_forward_propagation(forward_propagation)

~/work/w3_unittest.py in test_forward_propagation(target_forward_propagation)
    286
    287     for test_case in test_cases:
--> 288         result = target_forward_propagation(test_case["input"]["X"], test_case["input"]["parameters"])
    289
    290         try:

<ipython-input> in forward_propagation(X, Y)
     18     # Implement Forward Propagation to calculate Z.
     19     ### START CODE HERE ### (~ 2 lines of code)
---> 20     Z = np.dot(W.transpose(), X) + b
     21     Y_hat = Z
     22     ### END CODE HERE ###

<__array_function__ internals> in dot(*args, **kwargs)

ValueError: shapes (1,1) and (2,5) not aligned: 1 (dim 1) != 2 (dim 0)

This is the error message I receive when using the transpose.

An example off the top of my head, just looking at the second part of this assignment: the shape of W there is 1 x 2, that is, n_y = 1 and n_x = 2, and the shape of X is 2 x 1460. So W x X is valid, but X x W is not.
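
To demonstrate, with random data of those shapes:

import numpy as np

W = np.random.randn(1, 2)     # n_y = 1, n_x = 2
X = np.random.randn(2, 1460)  # n_x = 2, m = 1460

Z = np.dot(W, X)  # valid: inner dimensions match (2 == 2), so Z is (1, 1460)
# np.dot(X, W)    # ValueError: shapes (2,1460) and (1,2) not aligned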


A significant issue here is that W has size (1,1), effectively a scalar, but X has size (2,5).
W and X must always have one common dimension: the number of features.
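
A small shape check (just a hypothetical debugging helper, not part of your assignment) can catch this before the product is attempted:

# Hypothetical helper: W's column count must equal X's row count,
# since both are the number of features n_x.
def check_shapes(W, X):
    assert W.shape[1] == X.shape[0], (
        f"feature mismatch: W has {W.shape[1]} columns "
        f"but X has {X.shape[0]} rows"
    )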

Where do your X and W come from? I'm not familiar with this assignment (I'm not a mentor for this course).


This assignment is the Single Perceptron Neural Networks for Linear Regression one in the Linear Algebra for Machine Learning course.

W is an array of weights. The assignment refers to W as a matrix that is multiplied by the matrix X, which represents the input nodes. X has size 1 x m in the first example. I am stuck on this multiplication in the step where the forward_propagation method is defined.

This is the task: Implement Forward Propagation. Compute Z by multiplying the arrays W and X and adding the vector b. Set the prediction array A equal to Z.

I think A is unnecessary here, as it is never mentioned in the outline of the method. I am supposed to fill in the code for the line Z = ?
I have written Z = np.dot(W, X) + b.
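
For context, here is roughly how the whole function looks with that line filled in (a sketch; I am assuming the dictionary keys are "W" and "b"):

import numpy as np

def forward_propagation(X, parameters):
    W = parameters["W"]  # assumed key; shape (n_y, n_x)
    b = parameters["b"]  # assumed key; shape (n_y, 1)
    # Z has shape (n_y, m): one column of predictions per example
    Z = np.dot(W, X) + b
    # for linear regression the prediction is Z itself (identity activation)
    Y_hat = Z
    return Y_hat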


In general, that's the correct way to compute the predictions in a linear regression. In detail, you may need to transpose either W or X, depending on how the data is stored (i.e. with the examples as rows or as columns, and whether W is a matrix or a vector).
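
To illustrate with made-up data, assuming W is stored as a 1 x n_x row vector:

import numpy as np

n_x, m = 2, 1460
W = np.random.randn(1, n_x)       # weights as a row vector, (1, n_x)

X_cols = np.random.randn(n_x, m)  # examples stored as columns
Z = np.dot(W, X_cols)             # no transpose needed, Z is (1, m)

X_rows = np.random.randn(m, n_x)  # examples stored as rows
Z = np.dot(W, X_rows.T)           # transpose first, Z is again (1, m)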

Then, if it's a layer in a neural network, you typically apply an activation function in each layer, such as sigmoid. In that case, you would have A = sigmoid(Z).
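
A minimal sketch of that case:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))  # element-wise, preserves the shape of z

Z = np.array([[0.0, 2.0, -1.0]])
A = sigmoid(Z)  # activations, same shape as Z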


And if W is a matrix, then you’d use np.matmul() instead of np.dot().
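
For 2-D arrays the two give the same result; if I remember right, they only diverge for stacks of matrices (3-D and up), where np.matmul() broadcasts over the stack while np.dot() performs a different contraction:

import numpy as np

M1 = np.random.randn(4, 2, 3)  # a stack of four 2x3 matrices
M2 = np.random.randn(4, 3, 5)  # a stack of four 3x5 matrices

print(np.matmul(M1, M2).shape)  # (4, 2, 5): one matrix product per stack entry
print(np.dot(M1, M2).shape)     # (4, 2, 4, 5): contraction over different axes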
