I believe everything I implemented is fine. The only problem is the cost function: it keeps outputting inf, and I don't know why. I believe this happens when log(y) == log(0) or log(1-y) == log(0), which evaluates to -inf (NumPy reports it as a divide-by-zero warning) and thus results in inf. Any idea how to solve this problem?

If you are talking about the test cases for the `propagate` function, there should be no cases in which you get log(0) given the test inputs that they give you. But perhaps this could indicate a problem with the A values that your code is generating. They are the output of sigmoid, so they should all be strictly between 0 and 1, right? It’s worth checking that is a true statement …
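
One way to run that check is sketched below. The `sigmoid` helper and the values of `w`, `b`, and `X` are made up for illustration and are not the course's test inputs; the point is only the final comparison on `A`.

```python
import numpy as np

def sigmoid(z):
    # Standard logistic function; assumed to match the assignment's helper.
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical inputs, NOT the course's test case.
w = np.array([[1.0], [2.0]])
b = 2.0
X = np.array([[1.0, 2.0, -1.0],
              [3.0, 4.0, -3.2]])

A = sigmoid(np.dot(w.T, X) + b)

# True only if every activation lies strictly inside (0, 1).
strictly_inside = bool(np.all((A > 0) & (A < 1)))
print(A)
print("all A strictly between 0 and 1:", strictly_inside)
```

If this prints `False` for your own `A`, the problem is upstream of the cost formula.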

Let me double check something. So I have (y) which is a 1D np array that contains zeroes and ones. Now there are two ways to calculate the cost function, I can just split them into a subset with class 0 and another with class 1, calculate the cost for each subset, and then aggregate these two values together. Or I could just use the combined formula to do it in one step. So if I am using the combined formula, I will end up getting log(0), either from log(y) or log(1-y). Am I understanding right or is there something I am missing?

Edit: I’ve just checked the sigmoid function and it is working as expected.

Figured it out

The problem was in the cost formula itself. I did log(Y) * A instead of log(A) * Y. Silly problem actually. But thanks for your efforts <3

Yes, that would do it! The point is that the A values are never exactly 0 or 1 (at least in mathematical terms), but of course the Y values are exactly 0 and 1 by definition.
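
A small sketch of the difference, using made-up labels and activations (not the course's test data). The log is applied to `A`, and `Y` only selects which term counts for each example; swapping them takes log of the exact 0s in `Y` and `1 - Y`.

```python
import numpy as np

# Hypothetical values for illustration only.
Y = np.array([[1.0, 0.0, 1.0]])   # labels: exactly 0 or 1
A = np.array([[0.9, 0.2, 0.7]])   # sigmoid outputs: strictly inside (0, 1)
m = Y.shape[1]

# Combined cross-entropy cost: log is taken of A and of 1 - A.
cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

# The swapped version from the thread: log is taken of Y and of 1 - Y,
# which contain exact zeros, so the sum picks up -inf terms.
with np.errstate(divide="ignore"):
    swapped = -np.sum(np.log(Y) * A + np.log(1 - Y) * (1 - A)) / m

print("correct cost:", cost)
print("swapped cost:", swapped)
```

With the correct formula, an example with `Y == 0` contributes `0 * log(A)`, which is just 0 as long as `A` itself is not exactly 0.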

You can run into the Inf or NaN problem if the sigmoid values “saturate” to exactly 1 or 0. In 64-bit floating point, z greater than about 37 will give exactly 1 for sigmoid, because exp(-z) is then too small to change the result of 1 + exp(-z). But it won’t happen with the test cases here.
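
The saturation effect can be reproduced with a short sketch. The clipping step at the end is a common workaround rather than anything this assignment asks for, and the value of `eps` is an arbitrary choice for illustration.

```python
import numpy as np

def sigmoid(z):
    # Standard logistic function; assumed to match the assignment's helper.
    return 1.0 / (1.0 + np.exp(-z))

a_large = sigmoid(np.float64(40.0))     # exp(-40) is lost when added to 1
a_moderate = sigmoid(np.float64(10.0))  # still distinguishable from 1

print("sigmoid(40) == 1.0:", a_large == 1.0)
print("sigmoid(10) <  1.0:", a_moderate < 1.0)

# With a saturated activation, log(1 - A) becomes log(0) = -inf.
with np.errstate(divide="ignore"):
    saturated_log = np.log(1 - a_large)
print("log(1 - sigmoid(40)):", saturated_log)

# Possible mitigation (an assumption, not part of the course code):
# keep A a small distance away from 0 and 1 before taking logs.
eps = 1e-15  # hypothetical tolerance
a_clipped = np.clip(a_large, eps, 1 - eps)
clipped_log = np.log(1 - a_clipped)
print("log(1 - clipped A):", clipped_log)
```

Clipping changes the cost slightly for saturated examples, so it trades a little accuracy for a finite result.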