I am trying to figure out exercise 5 on the programming assignment for week 2 but can't figure out what I am doing wrong. The professor wrote that db = 1/m * np.sum(dz) and dw = 1/m * X * dz_transpose, using np.dot to multiply matrices. For some reason my answers keep coming out way off. The original mathematical equations are \frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^T and \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (a^{(i)} - y^{(i)}).
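For reference, here is a minimal NumPy sketch of those two gradient formulas (not the notebook's template, just an illustration that assumes X has shape (n_features, m), Y and A have shape (1, m), and A is already the sigmoid output):

```python
import numpy as np

def gradients(X, A, Y):
    """Sketch of dw = 1/m * X (A - Y)^T and db = 1/m * sum(A - Y)."""
    m = X.shape[1]                    # number of training examples
    dz = A - Y                        # shape (1, m)
    dw = (1 / m) * np.dot(X, dz.T)    # shape (n_features, 1), same as w
    db = (1 / m) * np.sum(dz)         # scalar
    return dw, db
```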
Please show us the results and errors that you are getting when you run the test for propagate.
Here is what I got. For the cost I used the formula J = -\frac{1}{m} \sum \left( y^{(i)} \log(a^{(i)}) + (1 - y^{(i)}) \log(1 - a^{(i)}) \right), and for some reason my cost came out as "nan", so it is possible I did something wrong there.
dw = [[ 4.8 ]
[13.71]]
db = 0.36666666666666653
cost = nan
AssertionError Traceback (most recent call last)
in
17 print ("cost = " + str(cost))
18
---> 19 propagate_test(propagate)
~/work/release/W2A2/public_tests.py in propagate_test(target)
49 assert type(grads['dw']) == np.ndarray, f"Wrong type for grads['dw']. {type(grads['dw'])} != np.ndarray"
50 assert grads['dw'].shape == w.shape, f"Wrong shape for grads['dw']. {grads['dw'].shape} != {w.shape}"
---> 51 assert np.allclose(grads['dw'], expected_dw), f"Wrong values for grads['dw']. {grads['dw']} != {expected_dw}"
52 assert np.allclose(grads['db'], expected_db), f"Wrong values for grads['db']. {grads['db']} != {expected_db}"
53 assert np.allclose(cost, expected_cost), f"Wrong values for cost. {cost} != {expected_cost}"
AssertionError: Wrong values for grads['dw']. [[ 5.55 ]
[14.985]
[ 5.985]] != [[-0.03909333]
[ 0.12501464]
[-0.99960809]]
Expected output
dw = [[ 0.25071532]
[-0.06604096]]
db = -0.1250040450043965
cost = 0.15900537707692405
When I change all of the matrix multiplications to np.dot when computing the cost, I get the error shown below.
ValueError Traceback (most recent call last)
in
6 X = np.array([[1., -2., -1.], [3., 0.5, -3.2]])
7 Y = np.array([[1, 1, 0]])
----> 8 grads, cost = propagate(w, b, X, Y)
9
10 assert type(grads['dw']) == np.ndarray
in propagate(w, b, X, Y)
34 A = np.dot(np.transpose(w), X) + b
35
---> 36 cost = np.dot((-1 / m), np.sum(np.dot(Y, np.log(A)) + np.dot((1 - Y), np.log(1 - A))))
37 # YOUR CODE ENDS HERE
38
<__array_function__ internals> in dot(*args, **kwargs)
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
You don't need dot products to compute the cost. You can use them, but if you do, then you need to be cognizant of the rules for dot products: the inner dimensions need to agree, right? So if you have two (1,3) vectors, then you need to transpose one of them. And the order of the operation matters: please have a look at this thread to understand why.
But if you use np.dot, then the advantage is that it does both the multiplication and the addition in one shot, so you don't need the np.sum in that case. And you never need np.dot to multiply by a constant like -\frac{1}{m}.
The other way to compute the cost is to use elementwise multiplication ('*') between the pairs of vectors and then use np.sum to add up all the products. That may be more straightforward, and it does not require any transposes.
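To make that concrete, here is a sketch of both approaches (assuming A and Y are (1, m) row vectors and A already holds the sigmoid outputs):

```python
import numpy as np

def cost_with_dot(A, Y):
    # np.dot does the multiply and the sum in one shot; transpose one
    # operand so the inner dimensions line up: (1, m) . (m, 1) -> (1, 1)
    m = Y.shape[1]
    cost = -(1 / m) * (np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T))
    return float(np.squeeze(cost))

def cost_elementwise(A, Y):
    # elementwise multiply, then np.sum over the products; no transposes needed
    m = Y.shape[1]
    return -(1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
```

Both functions return the same value; the second one just skips the transpose bookkeeping.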
I am still very stuck on this question, so I am trying to do the math by hand because I am having a hard time visualizing it.
If w = [[1], [2]], a (2x1) matrix,
and X = [[1, -2, -1], [3, 0.5, -3.2]], a (2x3) matrix,
then to compute A via the formula for A, I would take the transpose of w, so w.T becomes [1, 2], a (1x2) matrix. w.T \cdot X multiplies a (1x2) by a (2x3), giving a matrix of shape (1x3), which after adding b is [8.5, 0.5, -5.9].
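A quick NumPy check of that arithmetic (a sketch; b = 1.5 is assumed here, since that is the offset that turns w.T \cdot X = [7, -1, -7.4] into the numbers above):

```python
import numpy as np

w = np.array([[1.], [2.]])                       # (2, 1)
X = np.array([[1., -2., -1.], [3., 0.5, -3.2]])  # (2, 3)
b = 1.5                                          # assumed, not shown in the test cell

Z = np.dot(w.T, X) + b   # (1, 2) . (2, 3) -> (1, 3)
print(Z)                 # [[ 8.5  0.5 -5.9]]
```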
I am getting stuck when I take the log of A: taking the log of -5.9 results in the complex number -0.301 + 1.364*i, and I am not sure what I am doing wrong here.
But A is the output of sigmoid, right? So all the values should be between 0 and 1 and have logs between -\infty and 0.
Oh, do you mean apply the sigmoid formula to the numbers in the matrix?
So [1/(1 + e^{-8.5}), …], resulting in a matrix of [0.997, 0.622, 0.0027].
Look at the formula for A in the instructions. We wrote the sigmoid function earlier. Just call that function with Z as the input:
Z = w^T \cdot X + b
A = sigmoid(Z)
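Putting those pieces together, here is a hedged sketch of the forward and backward pass (the function name and structure are illustrative, not the notebook's required template):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate_sketch(w, b, X, Y):
    m = X.shape[1]
    Z = np.dot(w.T, X) + b                 # (1, m) linear scores
    A = sigmoid(Z)                         # activations, all strictly in (0, 1)
    cost = -(1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dz = A - Y
    dw = (1 / m) * np.dot(X, dz.T)         # same shape as w
    db = (1 / m) * np.sum(dz)
    return {"dw": dw, "db": db}, float(cost)
```

With w = [[1.], [2.]], the X and Y from the test cell above, and an assumed b = 1.5, this sketch reproduces the expected output quoted earlier (dw ≈ [[0.2507], [-0.0660]], db ≈ -0.1250, cost ≈ 0.1590).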