I’m trying to build a logistic regression model for the Breast Cancer dataset, using the Course 1 Week 3 Logistic Regression assignment as a template. At the step ‘Compute and display cost with non-zero w’ I get ‘nan’ as the output:
It looks like I’ve made a mistake in my code, but I can’t find it.
My sigmoid() function works correctly:
The size of my ‘y_train’ array and y_train values are:
‘Is the number of ‘w’ values correct for the number of features in the x_train dataset?’:
The number of features in the x_train dataset is 30, so I used 30 ‘w’ values:
Tom, thank you for your answers. I’ve modified the code (I’ve decided to use a for-loop instead of dot()). But I’m back to the first issue: after running the code with non-zero w-values, my cost function returns ‘nan’.
“nan” means Not a Number. It means your code is trying to do something mathematically impossible. The most common issue is trying to compute log(0), since that’s undefined (not a real number).
The problem may be with your choice of the initial weight values. Because of the way the logistic cost function works (computing the log of values that may be very close to zero), you cannot simply pick initial weights and expect they will work. A choice of weights that pushes the sigmoid output f_wb, or 1 - f_wb, numerically to zero (within your machine’s number representation limits) will cause log(0) to blow up.
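To make the mechanism concrete, here is a minimal sketch with made-up z values (not the Breast Cancer data). It assumes a NumPy-based sigmoid like the one in the assignment; the variable names are illustrative. In float64, sigmoid(40) rounds to exactly 1.0, so log(1 - f_wb) becomes log(0), which NumPy evaluates to -inf, and multiplying that by (1 - y) = 0 gives nan.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical z = w.x + b values: a neutral one and a large one.
z = np.array([0.0, 40.0])
y = np.array([1.0, 1.0])   # both examples labelled 1

f_wb = sigmoid(z)          # the second entry rounds to exactly 1.0 in float64

# Silence NumPy's warnings so the non-finite values are easy to inspect.
with np.errstate(divide="ignore", invalid="ignore"):
    log_term = np.log(1.0 - f_wb)                    # log(0) evaluates to -inf
    loss = -y * np.log(f_wb) - (1.0 - y) * log_term  # 0 * -inf evaluates to nan

cost = loss.mean()         # one nan entry makes the mean nan
print(f_wb, log_term, loss, cost)
```

Note that the second example is predicted correctly (label 1, f_wb = 1.0) and still produces nan, so a nan cost does not by itself mean the predictions are wrong. It means z has grown large enough to saturate the sigmoid.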
What happens if you try all-zeros for the initial weight values? Using zeros for the initial logistic weights is a good idea, because the sigmoid of 0 = 0.5 regardless of the feature values, and that’s safely far away from 0.0 regardless of the size of the data set.
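As a quick check of that claim, the sketch below computes the cost with all-zero w and b on made-up, deliberately unscaled data. The compute_cost signature here is an assumption (a vectorized version, not the assignment's loop-based code), and X_demo and y_demo are random stand-ins for x_train and y_train.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compute_cost(X, y, w, b):
    """Mean logistic loss; X is (m, n), y is (m,), w is (n,), b is a scalar."""
    f_wb = sigmoid(X @ w + b)
    return np.mean(-y * np.log(f_wb) - (1.0 - y) * np.log(1.0 - f_wb))

rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 2000.0, size=(100, 30))    # made-up, unscaled features
y_demo = rng.integers(0, 2, size=100).astype(float)  # made-up 0/1 labels

# With w = 0 and b = 0, every z is 0 and every f_wb is 0.5.
cost_zero = compute_cost(X_demo, y_demo, np.zeros(30), 0.0)
print(cost_zero)  # log(2), about 0.693
```

Whatever the features and labels are, the starting cost is log(2), so a different value at this step would point to a bug in the cost function rather than in the initial weights.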
So please try all-zeros for the initial weights, and report back.
Should I always initialize the w-values to all zeros when training ML models?
When I use the code suggested in the final assignment (Week 3), with the initial w-values set randomly and multiplied by 0.01, my cost function returns ‘nan’.
What value are you using for the initial ‘b’ value?
Does your code give good results after training if you use all-zeros as the initial weights (including ‘b’)?
For logistic regression, you really can’t just guess at what the best initial weight values might be. Because there are a lot of features in the data set you’re using, you can get in trouble if you just pick random or arbitrary weight values (especially if their mean value isn’t zero).
Using all-zeros (for both w and b) is a safe choice, because the starting f_wb value is always going to be sigmoid(0) = 0.5
I’ve rewritten the original assignment code ‘0.01 * (np.random.rand(2).reshape(-1,1) - 0.5)’ as ‘0.01 * (np.zeros(n).reshape(-1,1) - 0.5)’, and the initial b-value is -8. In this case the cost is ‘nan’.
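One detail worth checking in that expression: subtracting 0.5 after np.zeros(n) means the result is not all zeros. The short sketch below evaluates it for n = 30 (the feature count given earlier in the thread); w_zero and b_zero are illustrative names for a true all-zeros start.

```python
import numpy as np

n = 30  # number of features reported earlier in the thread

# The expression from the post: zeros, shifted by -0.5, scaled by 0.01.
w_from_post = 0.01 * (np.zeros(n).reshape(-1, 1) - 0.5)
print(w_from_post.shape)       # (30, 1)
print(np.unique(w_from_post))  # [-0.005]

# A true all-zeros start, for both w and b.
w_zero = np.zeros(n)
b_zero = 0.0
```

So every weight starts at -0.005 and b starts at -8, which means f_wb does not start at sigmoid(0) = 0.5. The reshape(-1, 1) also produces a column of shape (30, 1), which may or may not match what the rest of the code expects.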
Thank you, Tom! Your answers helped me to understand Logistic Regression better.
When I use a learning rate of 0.0001, the cost function no longer returns ‘nan’. A rate of 0.00001 gives better cost values, but the predicted values are still all ones.
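Needing a learning rate this small while still getting all-ones predictions is often a sign that the features have very different, large ranges. That is an assumption here, since the thread does not show the x_train values. If it applies, z-score normalization before training may help. A minimal sketch, with a hypothetical helper zscore_normalize and random data standing in for x_train:

```python
import numpy as np

def zscore_normalize(X):
    """Scale each column to zero mean and unit standard deviation."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # leave constant columns unscaled
    return (X - mu) / sigma, mu, sigma

rng = np.random.default_rng(1)
X_raw = rng.uniform(0.0, 2000.0, size=(50, 30))  # made-up stand-in for x_train
X_norm, mu, sigma = zscore_normalize(X_raw)

print(X_norm.mean(axis=0).round(6)[:3])  # close to 0 for each column
print(X_norm.std(axis=0).round(6)[:3])   # close to 1 for each column
```

The same mu and sigma would need to be reused to scale any data passed to the prediction step, so that training and prediction see features on the same scale.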