For practice, I was trying out logistic regression on Stanford’s Titanic dataset. When I fed the dataset to my ML model, two problems showed up: the train/test prediction accuracy comes out at more than 9900%, and in a few iterations the value of the cost function comes out as either NaN or -Inf. What should I do to fix it?

There are lots of possibilities. Did you normalize the dataset first? If the various features have widely varying ranges, that may be necessary, and in general it never hurts. Do you have other reasons to believe that the Logistic Regression code you are using is correct? For example, did your code pass the grader?
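In case it helps, here is a rough sketch of what I mean by normalizing, assuming NumPy and a feature matrix with one example per row and one feature per column. The function name `standardize` and the toy numbers are made up for illustration, not taken from your notebook. The idea is to subtract each feature's mean and divide by each feature's standard deviation, using statistics computed on the training set only:

```python
import numpy as np

def standardize(X_train, X_test, eps=1e-8):
    """Z-score each feature column using training-set statistics only."""
    mu = X_train.mean(axis=0)      # per-feature mean, shape (n_features,)
    sigma = X_train.std(axis=0)    # per-feature standard deviation
    # eps guards against division by zero for constant features
    X_train_norm = (X_train - mu) / (sigma + eps)
    X_test_norm = (X_test - mu) / (sigma + eps)
    return X_train_norm, X_test_norm

# Toy rows with two features on very different scales (illustrative values only)
X_train = np.array([[22.0, 7.25],
                    [38.0, 71.28],
                    [26.0, 7.92],
                    [35.0, 53.10]])
X_test = np.array([[30.0, 8.05]])

X_train_norm, X_test_norm = standardize(X_train, X_test)
print(X_train_norm.mean(axis=0))  # each column should be close to 0
print(X_train_norm.std(axis=0))   # each column should be close to 1
```

If your data is laid out with one example per column instead, the same idea applies with `axis=1` and `keepdims=True`.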

The other cause of -Inf and NaN values for the cost is “saturating” the sigmoid function, so that its outputs round to exactly 0 or 1. The cost then evaluates log(0), which is -Inf, and multiplying that by a zero label term gives NaN. You might want to check for that. Typically it means you are using too high a learning rate, or haven’t normalized your data and are getting “exploding gradients”.
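To make the saturation point concrete, here is a small sketch, assuming NumPy and the usual cross-entropy cost for logistic regression. The function names and the input values are hypothetical, chosen only to trigger the problem. It shows the sigmoid rounding to exactly 1 in floating point, the cost becoming NaN as a result, and one common workaround of clipping the activations away from 0 and 1 before taking the log:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_naive(a, y):
    # Standard cross-entropy; breaks if any activation is exactly 0 or 1
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

def cost_clipped(a, y, eps=1e-12):
    # Keep activations strictly inside (0, 1) so the logs stay finite
    a = np.clip(a, eps, 1 - eps)
    return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))

z = np.array([40.0, -2.0, 1.0])   # 40 is large enough to saturate in float64
y = np.array([1.0, 0.0, 1.0])
a = sigmoid(z)
print(a[0] == 1.0)                # the first activation has rounded to exactly 1

# Silence the expected warnings from log(0) and 0 * inf
with np.errstate(divide="ignore", invalid="ignore"):
    naive = cost_naive(a, y)
clipped = cost_clipped(a, y)

print(np.isnan(naive), np.isfinite(clipped))
```

Clipping only keeps the reported cost finite. If the values of `z` are getting that large, the underlying cause (feature scaling, learning rate, or initialization) is still worth tracking down.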

Hi! Thanks for getting back to me. I did standardize my data, though, by dividing the values by the standard deviation of the whole set. I have used the grader, and my code passed with a 100% grade, so I feel confident about my code.

I did see the values saturating when I looked through all the cost values, but I am not sure how to tackle this exact error. My learning rate is 0.005, and I am positive that I normalized my data.