I have two questions about implementing the cost function in Python.
I usually end up with an error about the dimensions of the output while trying to compute the cost function. Then I have to make some adjustment, but I can't seem to understand the logic behind it.
Why can't I use np.sum() instead of having to use + for the summation with np.dot? I'm not sure how to start reasoning about which way I should choose.
Sometimes I get an error, and the way to fix it is to take a transpose. Why do I need the transpose to get the dimensions right?
The test looks for the following criteria between your return value and the expected value:
Data type (match required)
Closeness of values (using np.allclose)
Shape match (exact match required)
I recommend you fix the computation to get the correct values first, and then worry about the 3rd and 1st points.
Should these hints not help you, please click my name and message your notebook as an attachment.
There are two types of matrix or vector multiplication: “dot product” style and “elementwise”. These are fundamentally different mathematical operations. What we are always doing here is taking a mathematical formula and translating it into python code. So that requires that you first understand what the math says and then understand the functionality provided by the numpy calls you have at your disposal. Note that you need to have a good understanding of basic Linear Algebra as a prerequisite here. You don’t need to know what an eigenvalue is, but you definitely need to be comfortable with how dot product matrix multiply works.
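To make that concrete, here is a tiny sketch. The matrices A and B are just made-up examples, not anything from the assignment, but they show how different the two operations are:

```python
import numpy as np

# Made-up 2 x 2 matrices, purely for illustration
A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[10., 20.],
              [30., 40.]])

print(A * B)         # elementwise (same as np.multiply): [[ 10.  40.] [ 90. 160.]]
print(np.dot(A, B))  # dot product style matrix multiply: [[ 70. 100.] [150. 220.]]
```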
So what is going on with the cost formula? The fundamental operation there is taking two 1 x m vectors, computing the products of the corresponding elements and then adding up those products. There are (at least) two ways I can think of to do that using numpy vector operations:
Use np.multiply or * (elementwise multiply) and then use np.sum to add up the products (see the first sketch just after this list).
You can do both operations in one shot with np.dot to compute the dot product of the two vectors.
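A minimal sketch of the first route, assuming two hypothetical 1 x m row vectors a and b that stand in for the terms in your cost formula:

```python
import numpy as np

m = 4
a = np.random.randn(1, m)   # hypothetical 1 x m row vector (e.g. Y)
b = np.random.randn(1, m)   # hypothetical 1 x m row vector (e.g. np.log(A))

# Elementwise multiply, then add up all the products -> a plain scalar
total = np.sum(a * b)
```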
But note that in the dot product case, just dotting 1 x m with 1 x m does not make sense, right? So what do you need to do to make that work? If you're not familiar with the rules for dot products, then you really should spend some time learning the basics of Linear Algebra before continuing here. That is pretty fundamental, and we're only just getting rolling: it doesn't get easier or less complicated as we proceed through the courses.
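And here is the second route with np.dot, where the transpose is exactly what makes the inner dimensions agree (again with hypothetical a and b):

```python
import numpy as np

m = 4
a = np.random.randn(1, m)   # hypothetical 1 x m row vector
b = np.random.randn(1, m)   # hypothetical 1 x m row vector

# (1 x m) dot (1 x m) is undefined, but (1 x m) dot (m x 1) works,
# so transpose the second operand. The result is a 1 x 1 array.
total = np.dot(a, b.T)

# Same number as the elementwise route, just wrapped in a 1 x 1 array
assert np.allclose(total, np.sum(a * b))
```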
Here’s a thread which discusses the general issues in a bit more detail.
The one other point to make here is that “+” is not the same thing as np.sum. Applied to two vectors or matrices, “+” is equivalent to np.add: it adds the corresponding elements. np.sum, on the other hand, adds up the elements of a single array. You can read the documentation by googling “numpy sum” and “numpy add”.
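For example (made-up values, just to show the difference):

```python
import numpy as np

v = np.array([1., 2., 3.])
w = np.array([10., 20., 30.])

print(v + w)         # elementwise addition, same as np.add(v, w): [11. 22. 33.]
print(np.add(v, w))  # [11. 22. 33.]
print(np.sum(v))     # np.sum adds up the elements of one array: 6.0
```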