Course 1 Week 2 Logistic Regression Cost function

Hey everyone!

Something I can’t wrap my head around:

In the comments, when coding the cost function, it says: compute cost using np.dot

cost = -(1/m)*np.sum(np.dot(Y,np.log(A))+np.dot((1-Y),np.log(1-A)))

However, when I tried to do that, the dot product failed because A and Y both have shape (1, 3).
I then used an element-wise multiplication instead, and I could pass all the tests.

cost = -(1/m)*np.sum(Y*np.log(A)+(1-Y)*np.log(1-A))

Can someone give me a hint?

Thanks!

You need to understand how dot product multiplication works: it requires a transpose of the second argument in order for the dimensions to be compatible. Here’s a thread that shows examples.
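To make that concrete, here is a minimal sketch with made-up numbers; it only assumes A and Y are (1, m) row vectors, as in your case:

import numpy as np

Y = np.array([[1, 0, 1]])         # labels, shape (1, 3)
A = np.array([[0.9, 0.2, 0.7]])   # activations, shape (1, 3)

# np.dot(Y, np.log(A)) raises a ValueError: shapes (1, 3) and (1, 3) are not aligned.
# Transposing the second operand makes them conform: (1, 3) x (3, 1) -> (1, 1).
term = np.dot(Y, np.log(A).T)
print(term.shape)                 # (1, 1)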

Yes, I understand how the dimensions have to match for the dot product to be mathematically doable.

What I don’t understand is the vectorization of the J formula:

$$J = -\frac{1}{m}\sum_{i=1}^{m}\left[\,y^{(i)}\log\left(a^{(i)}\right) + \left(1-y^{(i)}\right)\log\left(1-a^{(i)}\right)\right]$$

Therefore, I guess I should be using the dot product in my cost computation, but then why is it legitimate to transpose the second matrix in the dot product, other than the fact that it has to be done for it to work?

The point is that what you show is a mathematical formula. So what does that formula mean in terms of what actually happens? It is the sum of the products of the corresponding elements of two vectors, right? Well, what is the dot product of two vectors?
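To make that concrete with made-up numbers:

import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
print(np.sum(u * v))   # 32.0 -- multiply corresponding elements, then sum
print(np.dot(u, v))    # 32.0 -- the dot product does both steps in one operation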

Of course, it is always the case that there can be more than one correct way to translate a mathematical formula into linear algebra operations and then into python code. Your implementation with an element-wise multiply followed by a sum is perfectly correct. It’s just less efficient, because it takes two separate vector operations. The dot product can do both operations (multiply and sum) in one vector operation.
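For example, with the same made-up (1, m) arrays as above, both translations give the same cost:

import numpy as np

m = 3
Y = np.array([[1, 0, 1]])
A = np.array([[0.9, 0.2, 0.7]])

# Two separate vector operations: element-wise multiply, then np.sum.
cost_elem = -(1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

# One fused operation per term: np.dot multiplies and sums in a single pass.
cost_dot = -(1 / m) * np.squeeze(np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T))

print(np.isclose(cost_elem, cost_dot))   # True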

@paulinpaloalto thanks for the clarification.

I also now realized that element-wise multiplication doesn’t automatically include the sum.

Thanks again !