Question on Week 3, 2.3 Exercise 4

Hello. I need help with the forward propagation function. The code is not running correctly in nn_model: the cost printed out is the same on every iteration, and the error message points to an issue with the following line:

Z = W @ X + b

I understand that Z must be an array of shape 1 x m (m = 30 in the first case and m = 1460 in the second). In the first case, W has shape 1 x 1, since n_y = 1, and X has shape 1 x 30. After the matrix multiplication and the addition of the vector b, Z should be an array with m columns. However, I am getting an error message about mismatched sizes when running nn_model. Can someone please help me out?

Error Message: ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1)
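
For reference, here is how I understand the shapes are supposed to line up (a minimal sketch with random data; the error seems to say that X has 2 rows while W only has 1 column):

import numpy as np

# Intended shapes: W is (n_y, n_x), X is (n_x, m), b is (n_y, 1).
# b broadcasts across the m columns of the product.
n_y, n_x, m = 1, 1, 30
W = np.random.randn(n_y, n_x)
X = np.random.randn(n_x, m)
b = np.zeros((n_y, 1))

Z = W @ X + b
print(Z.shape)  # (1, 30): one prediction per example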


Two suggestions:

  1. I recommend you avoid the @ operator and instead use np.dot() or np.matmul() directly. That makes it clearer which operation you're using (see the sketch after this list).
  2. You often have to transpose one of the arguments in order to get the sizes to match.
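
For example, with made-up shapes (for 2-D arrays the two forms compute the same product):

import numpy as np

W = np.random.randn(1, 2)  # (n_y, n_x)
X = np.random.randn(2, 5)  # (n_x, m)
b = np.zeros((1, 1))

Z1 = W @ X + b         # operator form
Z2 = np.dot(W, X) + b  # same result for 2-D arrays, but the intent is explicit
assert np.allclose(Z1, Z2)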

I will use np.dot(), thank you.
Why should I use a transpose? I was thinking of transposing W, but it already has shape
n_y x n_x, i.e. 1 x 1 and 1 x 2 in the two cases.


There are lots of different test cases; those sizes are only one of the sets you have to work with.

True, but in these cases transposing does not work, whether I transpose W or X.


Can you show some evidence that they don’t work?


ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
----> 1 w3_unittest.test_forward_propagation(forward_propagation)

~/work/w3_unittest.py in test_forward_propagation(target_forward_propagation)
    286
    287     for test_case in test_cases:
--> 288         result = target_forward_propagation(test_case["input"]["X"], test_case["input"]["parameters"])
    289
    290         try:

<ipython-input> in forward_propagation(X, Y)
     18     # Implement Forward Propagation to calculate Z.
     19     ### START CODE HERE ### (~ 2 lines of code)
---> 20     Z = np.dot(W.transpose(), X) + b
     21     Y_hat = Z
     22     ### END CODE HERE ###

<__array_function__ internals> in dot(*args, **kwargs)

ValueError: shapes (1,1) and (2,5) not aligned: 1 (dim 1) != 2 (dim 0)

This is the error message I receive when using the transpose.

An example off the top of my head, just looking at the second part of this assignment: the shape of W there is 1 x 2, that is, n_y = 1 and n_x = 2, and the shape of X is 2 x 1460. So W x X is valid, but X x W is not.
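
To demonstrate, with random data of those shapes:

import numpy as np

W = np.random.randn(1, 2)     # n_y = 1, n_x = 2
X = np.random.randn(2, 1460)  # n_x = 2, m = 1460

Z = np.dot(W, X)  # valid: inner dimensions match (2 == 2), so Z is (1, 1460)
# np.dot(X, W)    # ValueError: shapes (2,1460) and (1,2) not aligned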


A significant issue here is that W has size (1,1), effectively a scalar, but X has size (2,5).
W and X must always have one common dimension: the number of features.
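
A small shape check (just a hypothetical debugging helper, not part of your assignment) can catch this before the product is attempted:

# Hypothetical helper: W's column count must equal X's row count,
# since both are the number of features n_x.
def check_shapes(W, X):
    assert W.shape[1] == X.shape[0], (
        f"feature mismatch: W has {W.shape[1]} columns "
        f"but X has {X.shape[0]} rows"
    )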

Where do your X and W come from? I'm not familiar with this assignment (I'm not a mentor for this course).


This assignment is the Single Perceptron Neural Networks for Linear Regression one in the Linear Algebra for Machine Learning course.

W is an array of weights. The assignment refers to W as a matrix that is multiplied by the matrix X, which represents the input nodes. X has size 1 x m in the first example. I am stuck on this multiplication in the step where the forward_propagation method is defined.

This is the task: Implement Forward Propagation. Compute Z by multiplying the arrays W and X and adding the vector b. Set the prediction array A equal to Z.

I think A is unnecessary here, as it is never mentioned in the outline of the method. I am supposed to fill in the code for the line Z = ?
I have written Z = np.dot(W, X) + b.
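
For context, here is roughly how the whole function looks with that line filled in (a sketch; I am assuming the dictionary keys are "W" and "b"):

import numpy as np

def forward_propagation(X, parameters):
    W = parameters["W"]  # assumed key; shape (n_y, n_x)
    b = parameters["b"]  # assumed key; shape (n_y, 1)
    # Z has shape (n_y, m): one column of predictions per example
    Z = np.dot(W, X) + b
    # for linear regression the prediction is Z itself (identity activation)
    Y_hat = Z
    return Y_hat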


In general, that's the correct way to compute the predictions in a linear regression. In detail, you may need to transpose either W or X, depending on how the data is stored (i.e. with the examples as rows or as columns, and whether W is a matrix or a vector).
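
To illustrate with made-up data, assuming W is stored as a 1 x n_x row vector:

import numpy as np

n_x, m = 2, 1460
W = np.random.randn(1, n_x)       # weights as a row vector, (1, n_x)

X_cols = np.random.randn(n_x, m)  # examples stored as columns
Z = np.dot(W, X_cols)             # no transpose needed, Z is (1, m)

X_rows = np.random.randn(m, n_x)  # examples stored as rows
Z = np.dot(W, X_rows.T)           # transpose first, Z is again (1, m)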

Then, if it's a layer in a neural network, you typically apply an activation function in each layer, such as sigmoid. In that case, you would have A = sigmoid(Z).
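
A minimal sketch of that case:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))  # element-wise, preserves the shape of z

Z = np.array([[0.0, 2.0, -1.0]])
A = sigmoid(Z)  # activations, same shape as Z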


And if W is a matrix, then you’d use np.matmul() instead of np.dot().
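
For 2-D arrays the two give the same result; if I remember right, they only diverge for stacks of matrices (3-D and up), where np.matmul() broadcasts over the stack while np.dot() performs a different contraction:

import numpy as np

M1 = np.random.randn(4, 2, 3)  # a stack of four 2x3 matrices
M2 = np.random.randn(4, 3, 5)  # a stack of four 3x5 matrices

print(np.matmul(M1, M2).shape)  # (4, 2, 5): one matrix product per stack entry
print(np.dot(M1, M2).shape)     # (4, 2, 4, 5): contraction over different axes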
