Error in cost function programming

In section 4.3 of the course, in exercise 5, we are asked to compute the activation A and the cost function.
Here is what I proposed and submitted:
A = sigmoid(np.dot(np.transpose(w),X) + b)
cost = -1/m*np.sum( np.dot(Y, np.log(A)) + np.dot( (1-Y) , np.log(1-A) ))

Here is the error I got:
ValueError Traceback (most recent call last)
in
3 X =np.array([[1., 2., -1.], [3., 4., -3.2]])
4 Y = np.array([[1, 0, 1]])
----> 5 grads, cost = propagate(w, b, X, Y)
6
7 assert type(grads["dw"]) == np.ndarray

in propagate(w, b, X, Y)
30 # YOUR CODE STARTS HERE
31 A = sigmoid(np.dot(np.transpose(w),X) + b)
---> 32 cost = -1/m*np.sum( np.dot(Y, np.log(A)) + np.dot( (1-Y) , np.log(1-A) ))
33
34 # YOUR CODE ENDS HERE

<array_function internals> in dot(*args, **kwargs)

ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)

I checked the dimensions of the matrices and they all seem OK to me. I don’t see where the error is.
Thanks for your help!

@davidP,
I assume the error comes from calling np.dot on arrays whose dimensions do not line up.
Do we need a dot product here, or element-wise multiplication, given that we need one output for each entry of Y?
Another point to consider: does np.sum preserve dimensions? What is the expected output dimension?
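To make those two questions concrete, here is a small sketch. The values of A are made up for illustration (they are not from the exercise); only the shapes match the traceback above.

```python
import numpy as np

# Illustrative values only: both arrays have shape (1, 3), as in the traceback.
Y = np.array([[1, 0, 1]])
A = np.array([[0.9, 0.2, 0.7]])

# np.dot on two 2-D arrays is matrix multiplication, so (1,3) x (1,3) fails.
try:
    np.dot(Y, np.log(A))
    dot_failed = False
except ValueError:
    dot_failed = True
print(dot_failed)

# Element-wise multiplication keeps one value per example.
per_example = Y * np.log(A)
print(per_example.shape)

# np.sum without an axis argument collapses everything to a single scalar.
total = np.sum(per_example)
print(np.ndim(total))

# keepdims=True keeps the 2-D layout instead.
kept = np.sum(per_example, keepdims=True)
print(kept.shape)
```

So np.sum does not preserve dimensions by default, which is usually what you want when the expected output is a single cost value.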

Also, a few suggestions:

  1. np.transpose(w) can be simplified to w.T
  2. if we have two 1 x N row vectors (say X, Y), then
    2.1 if we need a scalar output, we use np.dot(X, Y.T)
    2.2 if we need a matrix output (N x N), we use np.dot(X.T, Y)
    2.3 if we need a 1 x N row vector output, we use X * Y (element wise)
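The three cases in point 2 can be checked quickly. This is just a sketch with N = 3 and made-up numbers; X and Y here are the generic row vectors from the list, not the exercise data.

```python
import numpy as np

# Two 1 x N row vectors with N = 3 (values chosen only for illustration).
X = np.array([[1., 2., 3.]])   # shape (1, 3)
Y = np.array([[4., 5., 6.]])   # shape (1, 3)

# 2.1: (1,3) x (3,1) gives a single value, stored as a (1, 1) array.
scalar_like = np.dot(X, Y.T)
print(scalar_like.shape, scalar_like.item())

# 2.2: (3,1) x (1,3) gives an N x N matrix.
outer = np.dot(X.T, Y)
print(outer.shape)

# 2.3: element-wise product keeps the 1 x N shape.
elementwise = X * Y
print(elementwise.shape)
```

Note that case 2.1 returns a (1, 1) array rather than a plain Python number; .item() or np.squeeze can be used if a true scalar is needed.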

Hi @davidP, consider also that what numpy.dot expects, and in what format, is summarized very well in the numpy manual.
Hint: I would think about how the inputs need to be presented to numpy.dot. Hope this helps, keep going!

Did anyone solve this? I am stuck on exactly the same issue and some direction besides “think about how to present np.dot() with inputs” would be much appreciated. For example, the a and b arguments are both of size (1,3)… so why is there an issue multiplying two (1,3) vectors???

Hi @JTackett, if the shape of the first item is (1,3), then you have a data structure (numpy array) with 2 dimensions (rank is 2), which means you are dealing with a matrix with 1 row and 3 columns. You can check the dimension of your np array with np.dim. This should suggest what the right shape of the second term is: certainly not (1, 3) again.

Snippet from the numpy manual I linked before:

numpy.dot(a, b, out=None):

  • If both a and b are 2-D arrays, it is matrix multiplication,
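As a rough illustration of that rule: the same numpy.dot documentation also says that when both a and b are 1-D arrays, the result is the inner product of the vectors. The sketch below contrasts the two cases; the array names and values are made up for the example.

```python
import numpy as np

a2 = np.array([[1., 2., 3.]])   # 2-D array, shape (1, 3)
b2 = np.array([[4., 5., 6.]])   # 2-D array, shape (1, 3)

# Flattened 1-D views of the same data, shape (3,).
a1 = a2.ravel()
b1 = b2.ravel()

# 1-D with 1-D: inner product, no transpose needed, result is a scalar.
inner = np.dot(a1, b1)
print(inner, np.ndim(inner))

# 2-D with 2-D: matrix multiplication, so inner dimensions must match.
# (1,3) x (3,1) works and gives a (1, 1) array.
matmul = np.dot(a2, b2.T)
print(matmul.shape)
```

The arrays in the exercise are 2-D, which is why the matrix multiplication rule applies and the second term has to be transposed.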

I’m not sure what “rank 2” means… and I’ve re-read that documentation multiple times. Basically, after hours of trial-and-error, the help I was looking for was “Hey, you should transpose the second term”. This is particularly frustrating because other NON-PYTHON languages allow you to input vectors of the same size.

C++ and MATLAB, to name just a few languages, allow you to do a dot() operation on two (1,3) vectors because under the hood the functions do the transpose for you. Python apparently takes a hard stance that the inputs must be mathematically correct and provides no simple error message to point you in the proper direction. The Jupyter interface makes this even harder because you have to add so many extra lines of code to debug.

I guess I would have wished for some more “hints” in the description above to point me in the transpose direction. Some of us have a long non-Python history with functions that do not impose this hard requirement, and such hints would have saved me hours on this lab.

For those finding this thread in the future, the fix to this issue is:
s1 = 1-Y
s2 = np.log(1-A)
part2 = np.dot(s1, s2.T)
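For anyone who wants to see that fix in context, here is a minimal sketch of the whole cost term, assuming the usual cross-entropy form with m examples. The values of A and Y are invented for illustration, and the names part1, cost_dot and cost_elementwise are mine, not from the assignment. It also shows that the element-wise approach suggested earlier in the thread gives the same number.

```python
import numpy as np

# Illustrative values only; shapes are (1, m) with m = 3.
Y = np.array([[1, 0, 1]])
A = np.array([[0.9, 0.2, 0.7]])
m = Y.shape[1]

# Variant 1: np.dot with the second term transposed, (1,3) x (3,1) -> (1,1).
part1 = np.dot(Y, np.log(A).T)
part2 = np.dot(1 - Y, np.log(1 - A).T)
cost_dot = np.squeeze(-(part1 + part2) / m)   # squeeze (1,1) down to a scalar

# Variant 2: element-wise products, then a single np.sum.
cost_elementwise = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

print(cost_dot, cost_elementwise)
```

Both variants should agree up to floating-point rounding; which one to use is mostly a matter of taste.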

Hi @JTackett I can understand your point. I have some experience with Matlab and there are some differences with numpy, see here for example.

Speaking about rank: in numpy it means the number of dimensions of an array, so a (1,3) array has rank 2. More info here in the section around numpy and the arrays.

Does np.dim work with numpy? I couldn’t find it in the documentation. I only found the notes on np.shape().
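As far as I can tell, there is no np.dim in NumPy. The number of dimensions is available as np.ndim(a) or the a.ndim attribute, alongside np.shape(a) and a.shape. A quick check, with a made-up array:

```python
import numpy as np

a = np.array([[1., 2., 3.]])   # 1 row, 3 columns

# Number of dimensions ("rank" in the sense used above).
print(a.ndim, np.ndim(a))

# Shape as a tuple.
print(a.shape, np.shape(a))

# np.dim is not an attribute of the numpy module.
has_dim = hasattr(np, "dim")
print(has_dim)
```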