Week 3, Programming Assignment: 4.4 - Compute the Cost

Ah, ok, it helps to see the numbers. Now that I read your verbal description of your code again, I think I know what the problem is. It worries me that you say:

But both Y and A2 are row vectors, right? Meaning they are both 1 x m, where m is the number of samples. If you transpose Y, it becomes m x 1, so the dot product is m x 1 dot 1 x m, which gives an m x m output. That’s why you needed the np.sum, even though you are using the dot product. If you transpose A2 instead, the dot product is 1 x m dot m x 1, which gives a 1 x 1 output, so no extra sum is needed.
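Here is a small sketch of those two shapes. The values of Y and A2 are made up for illustration (they are not from your notebook), and m = 5 is just an assumption to keep it readable:

```python
import numpy as np

# Hypothetical data: 5 samples, both arrays are row vectors of shape (1, m)
Y = np.array([[1, 0, 1, 1, 0]])              # labels
A2 = np.array([[0.9, 0.2, 0.7, 0.6, 0.1]])   # activations

# Transposing Y: (m, 1) dot (1, m) -> (m, m), an outer product
wrong = np.dot(Y.T, A2)

# Transposing A2: (1, m) dot (m, 1) -> (1, 1), the sum of products we want
right = np.dot(Y, A2.T)

print(wrong.shape)
print(right.shape)
```

Printing the shapes before doing anything else with a dot product is usually the quickest way to catch this kind of mistake.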

Notice that matrix multiplication is not commutative: the order of the operands matters. Let’s construct a simple example to show what I mean:

>>> v = np.array([[1,2,3,4]])
>>> v.shape
(1, 4)
>>> w = v.T
>>> w.shape
(4, 1)
>>> w
array([[1],
       [2],
       [3],
       [4]])
>>> np.dot(v, v.T)
array([[30]])
>>> np.dot(v.T, v)
array([[ 1,  2,  3,  4],
       [ 2,  4,  6,  8],
       [ 3,  6,  9, 12],
       [ 4,  8, 12, 16]])
>>> np.sum(np.dot(v.T, v))
100
>>>

So you can see that doing it the way you did gives a completely different answer that has no relationship to the correct answer. If you check the np.dot(v, v.T) case, you’ll see it’s the sum of the squares of the elements of v:

1 + 4 + 9 + 16 = 30

You should also recognize the result you get in the other case: it’s the multiplication table for the numbers 1 to 4, right? Adding that up gives 100, which is the square of the sum (1 + 2 + 3 + 4)^2, not the sum of the squares.
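If you want to check that relationship yourself, here is a minimal sketch using the same v as in the example. The variable names are just ones I picked for illustration:

```python
import numpy as np

v = np.array([[1, 2, 3, 4]])

inner = np.dot(v, v.T)   # shape (1, 1): sum of the squares of the elements
outer = np.dot(v.T, v)   # shape (4, 4): the multiplication table

sum_of_squares = inner.item()
total_of_outer = int(np.sum(outer))
square_of_sum = int(np.sum(v)) ** 2

print(sum_of_squares)                 # 30
print(total_of_outer, square_of_sum)  # 100 100
```

Summing every entry of the outer product always gives the square of the sum of the elements, which is why it does not match the dot product you actually want.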

I tried making the mistake that I am theorizing you made and I get the exact same cost value you show:

cost = 2.0796608964759784
Error: Wrong output
 1  Tests passed
 1  Tests failed
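To see the same effect on a cost computation, here is a rough sketch. It assumes the usual cross-entropy cost, -(1/m) * sum(Y * log(A2) + (1 - Y) * log(1 - A2)). The function names and the data are hypothetical, not the assignment's code or test values, so it will not reproduce the number above:

```python
import numpy as np

def cost_elementwise(A2, Y):
    # Elementwise multiply, then sum: the reference version for this sketch
    m = Y.shape[1]
    logprobs = Y * np.log(A2) + (1 - Y) * np.log(1 - A2)
    return float(-np.sum(logprobs) / m)

def cost_dot_correct(A2, Y):
    # Transpose the second operand: (1, m) dot (m, 1) -> (1, 1)
    m = Y.shape[1]
    logprobs = np.dot(Y, np.log(A2).T) + np.dot(1 - Y, np.log(1 - A2).T)
    return float(np.squeeze(-logprobs / m))

def cost_dot_wrong(A2, Y):
    # Transpose the first operand: (m, 1) dot (1, m) -> (m, m), then np.sum
    m = Y.shape[1]
    logprobs = np.dot(Y.T, np.log(A2)) + np.dot((1 - Y).T, np.log(1 - A2))
    return float(-np.sum(logprobs) / m)

# Made-up data, shape (1, m) with m = 5
Y = np.array([[1, 0, 1, 1, 0]])
A2 = np.array([[0.9, 0.2, 0.7, 0.6, 0.1]])

reference = cost_elementwise(A2, Y)
correct = cost_dot_correct(A2, Y)
wrong = cost_dot_wrong(A2, Y)

print(reference, correct, wrong)
```

The first two values agree, while the third is different because summing the m x m matrix mixes every label with every activation.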