I am trying to figure out exercise 5 on the programming assignment for week 2 but can't figure out what I am doing wrong. The professor wrote that db = 1/m * np.sum(dz) and dw = 1/m * X * dz_transpose, using np.dot to multiply matrices. For some reason my answers keep coming out way off. The original mathematical equations are \frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^T and \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (a^{(i)} - y^{(i)}).
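For reference, here is a minimal NumPy sketch of those two gradient formulas (not the notebook's template, just an illustration that assumes X has shape (n_features, m), Y and A have shape (1, m), and A is already the sigmoid output):

```python
import numpy as np

def gradients(X, A, Y):
    """Sketch of dw = 1/m * X (A - Y)^T and db = 1/m * sum(A - Y)."""
    m = X.shape[1]                    # number of training examples
    dz = A - Y                        # shape (1, m)
    dw = (1 / m) * np.dot(X, dz.T)    # shape (n_features, 1), same as w
    db = (1 / m) * np.sum(dz)         # scalar
    return dw, db
```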
Please show us the results and errors that you are getting when you run the test for propagate.
Here is what I got. For the cost I used the formula J = -\frac{1}{m} \sum \left( y^{(i)} \log(a^{(i)}) + (1 - y^{(i)}) \log(1 - a^{(i)}) \right), and for some reason my cost came out as "nan", so it is possible I did something wrong there.
dw = [[ 4.8 ]
[13.71]]
db = 0.36666666666666653
cost = nan
AssertionError Traceback (most recent call last)
in
17 print ("cost = " + str(cost))
18
---> 19 propagate_test(propagate)
~/work/release/W2A2/public_tests.py in propagate_test(target)
49 assert type(grads['dw']) == np.ndarray, f"Wrong type for grads['dw']. {type(grads['dw'])} != np.ndarray"
50 assert grads['dw'].shape == w.shape, f"Wrong shape for grads['dw']. {grads['dw'].shape} != {w.shape}"
---> 51 assert np.allclose(grads['dw'], expected_dw), f"Wrong values for grads['dw']. {grads['dw']} != {expected_dw}"
52 assert np.allclose(grads['db'], expected_db), f"Wrong values for grads['db']. {grads['db']} != {expected_db}"
53 assert np.allclose(cost, expected_cost), f"Wrong values for cost. {cost} != {expected_cost}"
AssertionError: Wrong values for grads['dw']. [[ 5.55 ]
[14.985]
[ 5.985]] != [[-0.03909333]
[ 0.12501464]
[-0.99960809]]
Expected output
dw = [[ 0.25071532]
[-0.06604096]]
db = -0.1250040450043965
cost = 0.15900537707692405
When I change all of the matrix multiplications to np.dot when computing the cost, I get the error shown below.
ValueError Traceback (most recent call last)
in
6 X = np.array([[1., -2., -1.], [3., 0.5, -3.2]])
7 Y = np.array([[1, 1, 0]])
----> 8 grads, cost = propagate(w, b, X, Y)
9
10 assert type(grads['dw']) == np.ndarray
in propagate(w, b, X, Y)
34 A = np.dot(np.transpose(w), X) + b
35
---> 36 cost = np.dot((-1 / m), np.sum(np.dot(Y, np.log(A)) + np.dot((1 - Y), np.log(1 - A))))
37 # YOUR CODE ENDS HERE
38
<__array_function__ internals> in dot(*args, **kwargs)
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
You don't need dot products to compute the cost. You can use them, but if you do, then you need to be cognizant of the rules for dot products: the inner dimensions need to agree, right? So if you have two (1,3) vectors, then you need to transpose one of them. And the order of the operation matters: please have a look at this thread to understand why.
But if you use np.dot, then the advantage is that it does both the multiplication and the addition in one shot, so you don't need the np.sum in that case. And you never need np.dot to multiply by a constant like -\frac{1}{m}.
The other way to compute the cost is to use elementwise multiplication ('*') between the pairs of vectors and then use np.sum to add up all the products. That may be more straightforward, and it does not require any transposes.
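To make that concrete, here is a sketch of both approaches (assuming A and Y are (1, m) row vectors and A already holds the sigmoid outputs):

```python
import numpy as np

def cost_with_dot(A, Y):
    # np.dot does the multiply and the sum in one shot; transpose one
    # operand so the inner dimensions line up: (1, m) . (m, 1) -> (1, 1)
    m = Y.shape[1]
    cost = -(1 / m) * (np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T))
    return float(np.squeeze(cost))

def cost_elementwise(A, Y):
    # elementwise multiply, then np.sum over the products; no transposes needed
    m = Y.shape[1]
    return -(1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
```

Both functions return the same value; the second one just skips the transpose bookkeeping.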
I am still very stuck on this question, so I am trying to do the math by hand because I am having a hard time visualizing it.
If w = [[1], [2]], a (2x1) matrix,
and X = [[1, -2, -1], [3, 0.5, -3.2]], a (2x3) matrix,
then to compute A via the formula for A, I would take the transpose of w, so w.T becomes [1, 2], a (1x2) matrix. w.T \cdot X multiplies a (1x2) by a (2x3), giving a matrix of shape (1x3), which after adding b is [8.5, 0.5, -5.9].
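A quick NumPy check of that arithmetic (a sketch; b = 1.5 is assumed here, since that is the offset that turns w.T \cdot X = [7, -1, -7.4] into the numbers above):

```python
import numpy as np

w = np.array([[1.], [2.]])                       # (2, 1)
X = np.array([[1., -2., -1.], [3., 0.5, -3.2]])  # (2, 3)
b = 1.5                                          # assumed, not shown in the test cell

Z = np.dot(w.T, X) + b   # (1, 2) . (2, 3) -> (1, 3)
print(Z)                 # [[ 8.5  0.5 -5.9]]
```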
I am getting stuck when I take the log of A: taking the log of -5.9 results in the complex number -0.301 + 1.364*i, and I am not sure what I am doing wrong here.
But A is the output of sigmoid, right? So all the values should be between 0 and 1 and have logs between -\infty and 0.
Oh, do you mean apply the sigmoid formula to the numbers in the matrix?
So [1/(1 + e^{-8.5}), …], resulting in a matrix of [0.997, 0.622, 0.0027].
Look at the formula for A in the instructions. We wrote the sigmoid function earlier. Just call that function with Z as the input:
Z = w^T \cdot X + b
A = sigmoid(Z)
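Putting those pieces together, here is a hedged sketch of the forward and backward pass (the function name and structure are illustrative, not the notebook's required template):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate_sketch(w, b, X, Y):
    m = X.shape[1]
    Z = np.dot(w.T, X) + b                 # (1, m) linear scores
    A = sigmoid(Z)                         # activations, all strictly in (0, 1)
    cost = -(1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dz = A - Y
    dw = (1 / m) * np.dot(X, dz.T)         # same shape as w
    db = (1 / m) * np.sum(dz)
    return {"dw": dw, "db": db}, float(cost)
```

With w = [[1.], [2.]], the X and Y from the test cell above, and an assumed b = 1.5, this sketch reproduces the expected output quoted earlier (dw ≈ [[0.2507], [-0.0660]], db ≈ -0.1250, cost ≈ 0.1590).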