In public_tests.py, the compute_cost_reg_test(target) function definition contains a test case in which y takes the value 0.5. But in the lectures Professor Andrew Ng specifically said:
“Since y takes only values 0 or 1, we can write the simplified formula for the cost function…”
This might be a problem for people who have implemented their cost function using an approach like this:
-np.sum(np.log(f)[y == 1]) - np.sum(np.log(1 - f)[y == 0])
Filtering by equality of y with 1 and 0 completely ignores the y = 0.5 entries in the test case, which can result in frustration and a lack of understanding of where you went wrong.
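Here is a minimal, self-contained sketch of that failure mode (the sigmoid helper and the numbers are illustrative, not taken from the course code): with a 0.5 in y, the masked implementation silently drops that example, so the returned cost cannot match the expected value.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def compute_cost_filtered(f, y):
    # Cross-entropy written for y strictly in {0, 1}: a row with
    # y == 0.5 matches neither mask and is silently ignored.
    m = y.shape[0]
    return (-np.sum(np.log(f)[y == 1]) - np.sum(np.log(1 - f)[y == 0])) / m

y = np.array([1.0, 0.0, 0.5])            # 0.5, as in the test case
f = sigmoid(np.array([2.0, -1.0, 0.3]))  # model outputs for three examples
print(compute_cost_filtered(f, y))       # the y == 0.5 row contributes nothing
```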
Furthermore, filtering by equality of y with 1 and 0 can, surprisingly, be a better approach than a simple dot product (or, equivalently, a for loop):
-np.log(f) @ y - np.log(1 - f) @ (1 - y)
The reason behind this is that for some training example, z can happen to be a relatively high value, like 38, and f == sigmoid(38) == 1.0 in Python's float64, which leads to np.log(1 - f) evaluating to -np.inf. And since -np.inf * 0 evaluates to nan, np.log(1 - f) @ (1 - y) comes out as nan, and the whole sum, and with it the cost, becomes nan, only because of one such training example! This is undesirable if you want to track the cost function during training.
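A short demonstration of the effect (a sketch; the z values and labels are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.array([38.0, -1.0])
y = np.array([1.0, 0.0])
f = sigmoid(z)

print(f[0])                     # 1.0: exp(-38) is too small to register next to 1 in float64
print(np.log(1 - f[0]))         # -inf (numpy warns about divide by zero)
print(np.log(1 - f) @ (1 - y))  # nan, because -inf * 0 is nan

# The filtered version discards the -inf entry before summing,
# so it stays finite on the same data:
print(-np.sum(np.log(f)[y == 1]) - np.sum(np.log(1 - f)[y == 0]))
```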
So that is why I think it is better to implement the cost function using filtering on y, and therefore all the test cases where y takes values other than 0 or 1 should be removed from the course or rewritten.