Cost function in Week 2 Exercise 5

My issue concerns the propagate function, more specifically computing the cost. To vectorize it, I initially used:
cost = -1 * np.sum(np.dot(Y, np.log(A)) + np.dot(1 - Y, np.log(1 - A))) / m
But this gave me the following error:
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
After looking around the forum, I found that the log terms had to be transposed, that is:
cost = -1 * np.sum(np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T)) / m
I looked at more posts, but I still do not understand why the log terms need to be transposed. I am aware of the matrix multiplication rule that the number of columns of the first matrix must equal the number of rows of the second matrix. Is that the reason for transposing, or am I missing something else? Thank you. For reference, here is a minimal standalone sketch of both attempts (shapes and values made up for illustration):
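
import numpy as np

# Toy data: 3 examples; labels and activations are (1, 3) row vectors
Y = np.array([[1, 0, 1]])           # true labels, shape (1, 3)
A = np.array([[0.9, 0.2, 0.7]])     # predicted probabilities, shape (1, 3)
m = Y.shape[1]

# This raises ValueError: shapes (1,3) and (1,3) not aligned
# cost = -1 * np.sum(np.dot(Y, np.log(A)) + np.dot(1 - Y, np.log(1 - A))) / m

# Transposing the second operand gives (1,3) dot (3,1) -> a (1,1) result
cost = -1 * np.sum(np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T)) / m
print(cost)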


Exactly, you need to transpose it; otherwise, you cannot multiply those terms.


Does this apply to all matrix operations? That is, whenever a dot product is not possible, does transposing one of the matrices make it possible?

Here’s how dot is defined in the NumPy docs:

numpy.dot(a, b, out=None)

Dot product of two arrays. Specifically,

  • If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
  • If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.
  • If either a or b is 0-D (scalar), it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.
  • If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
  • If a is an N-D array and b is an M-D array (where M >= 2), it is a sum product over the last axis of a and the second-to-last axis of b.

Since we’re not using “rank 1 arrays,” per Andrew Ng’s recommendation, the dot in our case is a matrix-matrix multiplication, so the inner dimensions of the two matrices need to match: the number of columns of the first must equal the number of rows of the second. Since the first matrix is a row vector, the second one needs to be a column vector, which is why you transpose it. Note that if the first one were a column vector and the second one a row vector, the dot function would give us their outer product.
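
To make this concrete, here is a small sketch (shapes and values are arbitrary) contrasting the two orderings:

import numpy as np

row = np.array([[1.0, 2.0, 3.0]])   # row vector, shape (1, 3)
col = row.T                         # column vector, shape (3, 1)

inner = np.dot(row, col)   # (1,3) dot (3,1) -> (1,1), a single sum product
outer = np.dot(col, row)   # (3,1) dot (1,3) -> (3,3), the outer product

print(inner)          # [[14.]]
print(outer.shape)    # (3, 3)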


Thank you, this was very helpful.

Taking the transpose of np.log(A) is not necessary, as Y and A are already the same shape (1 x 3). Did taking the transpose somehow work?

The transpose is not necessary if you do an elementwise multiply, but it does not work to do a dot product between two 1 x 3 vectors, right? Try it and watch what happens.
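
As a quick sketch (shapes and values invented for illustration), here are the two versions side by side; they produce the same cost:

import numpy as np

Y = np.array([[1, 0, 1]])           # labels, shape (1, 3)
A = np.array([[0.9, 0.2, 0.7]])     # activations, shape (1, 3)
m = Y.shape[1]

# Elementwise multiply + sum: shapes already match, no transpose needed
cost_elementwise = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

# Dot product: transpose the second operand so (1,3) dot (3,1) is defined
cost_dot = -np.sum(np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T)) / m

print(np.isclose(cost_elementwise, cost_dot))   # True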
