Question on Exercise 5 Week 5 Programming Assignment

I am trying to figure out exercise 5 on the programming assignment for week 2 but can’t figure out what I am doing wrong. The professor wrote that db = 1/m * np.sum(dz) and dw = 1/m * X*dz_transpose, using np.dot to multiply matrices. For some reason my answers keep coming out way off. The original mathematical equations are \frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^T and \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (a^{(i)} - y^{(i)}).

Please show us the results and errors that you are getting when you run the test for propagate.

Here is what I got. For cost I used the formula J = -\frac{1}{m} \sum_{i=1}^{m} (y^{(i)} \log(a^{(i)}) + (1 - y^{(i)}) \log(1 - a^{(i)})). For some reason my cost function came out as “nan”, so it is possible I did something wrong there.
dw = [[ 4.8 ]
[13.71]]
db = 0.36666666666666653
cost = nan

AssertionError Traceback (most recent call last)
in
17 print ("cost = " + str(cost))
18
---> 19 propagate_test(propagate)

~/work/release/W2A2/public_tests.py in propagate_test(target)
49 assert type(grads['dw']) == np.ndarray, f"Wrong type for grads['dw']. {type(grads['dw'])} != np.ndarray"
50 assert grads['dw'].shape == w.shape, f"Wrong shape for grads['dw']. {grads['dw'].shape} != {w.shape}"
---> 51 assert np.allclose(grads['dw'], expected_dw), f"Wrong values for grads['dw']. {grads['dw']} != {expected_dw}"
52 assert np.allclose(grads['db'], expected_db), f"Wrong values for grads['db']. {grads['db']} != {expected_db}"
53 assert np.allclose(cost, expected_cost), f"Wrong values for cost. {cost} != {expected_cost}"

AssertionError: Wrong values for grads['dw']. [[ 5.55 ]
[14.985]
[ 5.985]] != [[-0.03909333]
[ 0.12501464]
[-0.99960809]]

Expected output

dw = [[ 0.25071532]
[-0.06604096]]
db = -0.1250040450043965
cost = 0.15900537707692405

When I change all matrix multiplication to np.dot when computing cost I get the error as shown below.

ValueError Traceback (most recent call last)
in
6 X = np.array([[1., -2., -1.], [3., 0.5, -3.2]])
7 Y = np.array([[1, 1, 0]])
----> 8 grads, cost = propagate(w, b, X, Y)
9
10 assert type(grads["dw"]) == np.ndarray

in propagate(w, b, X, Y)
34 A = np.dot(np.transpose(w), X) + b
35
---> 36 cost = np.dot((-1 / m), np.sum(np.dot(Y, np.log(A)) + np.dot((1 - Y), np.log(1 - A))))
37 # YOUR CODE ENDS HERE
38

<__array_function__ internals> in dot(*args, **kwargs)

ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)

You don’t need dot products to compute the cost. You can use them, but if you do, then you need to be cognizant of the rules for dot products: the inner dimensions need to agree, right? So if you have two (1,3) vectors, then you need to transpose one of them. And the order of the operation matters: please have a look at this thread to understand why.
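A minimal sketch of that shape rule, using made-up (1,3) vectors rather than the assignment's data (the names u and v are only for illustration):

```python
import numpy as np

# Two row vectors of shape (1, 3); the values are illustrative only.
u = np.array([[1.0, 2.0, 3.0]])
v = np.array([[4.0, 5.0, 6.0]])

# (1,3) dot (1,3): the inner dimensions (3 and 1) disagree, so NumPy raises ValueError.
try:
    np.dot(u, v)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Transposing the second operand gives (1,3) dot (3,1) -> (1,1): one sum of products.
inner = np.dot(u, v.T)

# Swapping the order gives (3,1) dot (1,3) -> (3,3): an outer product, not a sum.
outer = np.dot(u.T, v)

print(mismatch_raised, inner.shape, outer.shape)
```

Which operand gets transposed decides whether you end up with a single number or a full matrix, which is why the order of the operation matters.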

But if you use np.dot, then the advantage of that is that it does both the multiplication and the addition in one shot. So you don’t need the np.sum in that case. And you never need np.dot to multiply by a constant like -\frac {1}{m}.

The other way to solve the cost is to use elementwise multiply ("*") between the pairs of vectors and then use np.sum to add up all the products. That might be more straightforward and that way does not require any transposes.
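As a rough illustration of the two approaches, the sketch below computes the same cost both ways. The activations in A are made-up values between 0 and 1, not the assignment's data, and the variable names are only for illustration:

```python
import numpy as np

# Hypothetical activations and labels, both of shape (1, m) with m = 3.
A = np.array([[0.9, 0.6, 0.1]])
Y = np.array([[1, 1, 0]])
m = Y.shape[1]

# Option 1: elementwise multiply, then np.sum adds up the products.
cost_elementwise = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

# Option 2: np.dot multiplies and adds in one step, so no np.sum is needed,
# but the second operand must be transposed: (1,3) dot (3,1) -> (1,1).
cost_dot = -(np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T)) / m
cost_dot = float(np.squeeze(cost_dot))  # collapse the (1,1) array to a scalar

print(np.isclose(cost_elementwise, cost_dot))
```

Note that the constant 1/m is applied with ordinary scalar division in both versions.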

I am still very stuck with this question.

I am trying to do the math myself as I am having a hard time visualizing it.
If w = [[1], [2]] a (2x1) matrix
X = [[1, -2, -1], [3, 0.5, -3.2]] a (2x3) matrix
To compute A via the formula for A, I would take the transpose of w, so w.T becomes [1, 2], which is a (1x2) matrix. Multiplying w.T (1x2) by X (2x3) gives a matrix of shape (1x3), and after adding b the result is [8.5, 0.5, -5.9].

I am getting stuck when I take the log of A. Taking the log of -5.9 results in the complex number -0.301+1.364*i. I am not sure what I am doing wrong here.

But A is the output of sigmoid, right? So all the values should be between 0 and 1 and have logs between -\infty and 0.
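A small sketch of why this matters, using the values computed by hand above. The sigmoid defined here is a stand-in for the one written earlier in the assignment, so treat it as an assumption:

```python
import numpy as np

def sigmoid(z):
    # Stand-in for the sigmoid function from the assignment (assumed form).
    return 1 / (1 + np.exp(-z))

# Pre-activation values computed by hand earlier in the thread.
Z = np.array([[8.5, 0.5, -5.9]])

# Taking the log of Z directly: the real log of a negative number is undefined,
# so np.log returns nan for that entry (the warning is silenced here).
with np.errstate(invalid="ignore"):
    log_of_Z = np.log(Z)

# Applying sigmoid first keeps every value strictly between 0 and 1 here,
# so the logs are finite and negative.
A = sigmoid(Z)
log_of_A = np.log(A)

print(np.isnan(log_of_Z).any(), A, log_of_A)
```

A nan anywhere inside the sum propagates to the final result, which would be consistent with a cost that prints as nan.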

Oh, do you mean apply the sigmoid formula to the numbers in the matrix?
So [1/(1+e^-(8.5)), …], resulting in a matrix of [0.9998, 0.622, 0.0027].

Look at the formula for A in the instructions. We wrote the sigmoid function earlier. Just call that function with Z as the input:

Z = w^T \cdot X + b
A = sigmoid(Z)
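
Putting the pieces from this thread together, here is a hedged sketch of the forward pass followed by the cost and gradient formulas quoted in the first post. It is not the official solution: the sigmoid below is a stand-in for the one from the assignment, and b = 1.5 is an assumption inferred from the hand-computed values [8.5, 0.5, -5.9] above.

```python
import numpy as np

def sigmoid(z):
    # Stand-in for the sigmoid function from the assignment (assumed form).
    return 1 / (1 + np.exp(-z))

# Inputs quoted in the thread; b is an assumption inferred from the hand calculation.
w = np.array([[1.0], [2.0]])                        # shape (2, 1)
b = 1.5
X = np.array([[1.0, -2.0, -1.0], [3.0, 0.5, -3.2]])  # shape (2, 3)
Y = np.array([[1, 1, 0]])                           # shape (1, 3)
m = X.shape[1]

# Forward pass: linear step, then sigmoid so that A lies in (0, 1).
Z = np.dot(w.T, X) + b
A = sigmoid(Z)

# Cost via elementwise products and np.sum.
cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

# Gradients: dw = (1/m) X (A - Y)^T and db = (1/m) sum(a - y).
dw = np.dot(X, (A - Y).T) / m
db = np.sum(A - Y) / m

print("dw =", dw)
print("db =", db)
print("cost =", cost)
```

With these inputs the printed values should line up with the Expected output quoted earlier in the thread.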
