However, when I tried it, the dot product failed because both A and Y have shape (1, 3).
Then I did an elementwise multiplication instead, and I could pass all the tests.
You need to understand how dot product multiplication works. When both arguments are row vectors of the same shape, the second one has to be transposed so that the inner dimensions match for a “dot product”. Here’s a thread that shows examples.
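To make the shape rule concrete, here is a minimal sketch. The values in A and Y are made up for illustration; only the (1, 3) shapes come from the question.

```python
import numpy as np

# Illustrative values only; the shapes match the (1, 3) case in the question.
A = np.array([[0.8, 0.9, 0.4]])   # shape (1, 3)
Y = np.array([[1.0, 1.0, 0.0]])   # shape (1, 3)

# (1, 3) dot (1, 3): the inner dimensions are 3 and 1, so NumPy raises ValueError.
try:
    np.dot(A, Y)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# (1, 3) dot (3, 1): the inner dimensions are 3 and 3, so the result has shape (1, 1).
inner = np.dot(A, Y.T)
print(mismatch_raised, inner.shape)
```

Transposing Y does not change its values. It only lines the elements up so that each element of A is paired with the corresponding element of Y.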
Yes, I understand how the dimensions have to match for the dot product to be mathematically doable.
What I don’t understand is the vectorization of the J formula:
Therefore, I guess I should be using the dot product in my sigmoid, but then why is it permissible to transpose the second matrix in the dot product, other than that it must be done or it doesn’t work?
The point is that what you show is a mathematical formula. So what does that formula mean in terms of what actually happens? It is the sum of the products of the corresponding elements of two vectors, right? Well, what is the dot product of two vectors?
Of course it is always the case that there can be more than one correct way to translate a mathematical formula into linear algebra operations and then into Python code. Your implementation with an elementwise multiply followed by a sum is perfectly correct. It’s just less efficient, because it takes two separate vector operations. The dot product can do both operations (multiply and sum) in one vector operation.
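As a rough sketch of that equivalence, the two versions below compute the same number. The exact formula for J is not reproduced in this thread, so the term used here (the sum of y times log(a), as in the usual cross-entropy cost) is an assumption, and the values are made up.

```python
import numpy as np

# Illustrative values only; both arrays have shape (1, 3).
A = np.array([[0.8, 0.9, 0.4]])
Y = np.array([[1.0, 1.0, 0.0]])

# Assumed term: sum over i of y_i * log(a_i).

# Version 1: elementwise multiply, then a separate sum (two vector operations).
elementwise_result = np.sum(Y * np.log(A))

# Version 2: one dot product; the transpose makes the shapes (1, 3) and (3, 1).
dot_result = np.dot(Y, np.log(A).T)   # shape (1, 1)

print(elementwise_result, dot_result.item())
```

Note that the dot product version returns a (1, 1) array rather than a scalar, so you may need something like `.item()` or `np.squeeze` if a plain number is expected.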