Hi, Natalie.
That’s great news that you found the solution. Note that the way you wrote that prediction code would actually be correct if you wrote the network the way Prof Ng usually does, which is to omit the sigmoid (or softmax, as appropriate) at the output layer and then use the from_logits = True mode of the cross entropy loss function (all versions of cross entropy support that). That’s because in that case the prediction outputs actually are “logits”, meaning the pre-sigmoid values in the range (-\infty, \infty), where > 0 means “true”.
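Just to make that equivalence concrete, here is a minimal numpy sketch (not from the course code) showing why thresholding the logits at 0 gives the same predictions as applying sigmoid and thresholding at 0.5:

```python
import numpy as np

def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw outputs of a final layer with no activation ("logits")
logits = np.array([-2.0, -0.1, 0.0, 0.3, 4.2])

# Path 1: apply sigmoid to get probabilities, then threshold at 0.5
preds_from_probs = sigmoid(logits) > 0.5

# Path 2: threshold the logits directly at 0.
# Sigmoid is monotonic and sigmoid(0) == 0.5, so the two agree.
preds_from_logits = logits > 0

print(np.array_equal(preds_from_probs, preds_from_logits))  # True
```

So for a yes/no prediction you never strictly need the sigmoid; you only need it when you want the actual probability values.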
The reason it is common to do it that way is explained on this thread. But it does then make it a bit more of a hassle to use the prediction values, because you need to manually apply sigmoid (or softmax). Your “lambda” function implementation would be a nice way to solve that problem, if you were in “logits” mode.
Best regards,
Paul