Setting lambda=0 in C1_W3 does not produce the provided overfitting plot. Any ideas why?

Dear all,

If I set lambda_ = 0 and iterations = 5000 in the final assignment C1_W3 at “3.6 Learning parameters using gradient descent”, I don’t get the plot that is provided in “week-3-practice-lab-logistic-regression/images/figure 5.png”.

The resulting cost is 0.5. The resulting plot looks like this:

If you could find time to answer, I would be very happy.

Setting lambda_ = 100 and iterations = 5000 works as expected, generating an underfitted plot.

Thank you!!

I’m using the Notebook which was updated on Dec. 21, and I don’t see any Figures 5 or 6.


Update: Figures 5 and 6 are not applicable to this assignment.

That’s because they were imported from the original version of the ML course, which used Octave/MATLAB and an advanced minimizer function. This minimizer gives different results than you get using this notebook’s fixed-rate gradient descent method.


Hello Daniel! @Daniel_Blossey

I didn’t realize there was a figure 5! However, if another minimizer can get there, our gradient descent can get there too. The challenge is how we guide the gradient descent. I made my life easier by first normalizing the features. Here is the recipe if you want to reproduce a plot like that:

# Initialize fitting parameters
np.random.seed(1)
initial_w = np.random.rand(X_mapped.shape[1])-0.5
initial_b = 1.

# Set regularization parameter lambda_ to 0, i.e. no regularization (you can try varying this)
lambda_ = 0.
# Some gradient descent settings
iterations = 20000
alpha = 2.

# Normalize the data to make life easier
X_mapped_normed = (X_mapped - X_mapped.mean(axis=0, keepdims=True))/X_mapped.std(axis=0,keepdims=True)

w, b, J_history, _ = gradient_descent(X_mapped_normed, y_train, initial_w, initial_b,
                                      compute_cost_reg, compute_gradient_reg,
                                      alpha, iterations, lambda_)

# Unnormalize the weights and bias so that we can use `plot_decision_boundary` as is 
__w = w/X_mapped.std(axis=0)
__b = b - __w @ X_mapped.mean(axis=0)

plot_decision_boundary(__w, __b, X_mapped, y_train)
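
In case it helps to see why the unnormalization step works: with \mu and \sigma denoting the per-feature mean and standard deviation of X_mapped, the model trained on normalized features computes z = w \cdot \frac{x - \mu}{\sigma} + b = \frac{w}{\sigma} \cdot x + \left(b - \frac{w}{\sigma} \cdot \mu\right), so __w = w / \sigma and __b = b - __w \cdot \mu describe the same decision boundary in the original feature space.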

My result:


Raymond

I found figure 5 in the folder of the lab’s files.

Great work! I will test it myself soon. Thank you Raymond!

Dear Raymond, I could generate the overfitting plot with your code by setting

  • lambda_ = 0, which sets the regularization term to 0 and stresses overfitting,
  • iterations = 20000,
  • alpha = 2,

and by normalizing the 27-dimensional feature vector before performing the gradient descent.

The overfitting plot, which does not generalize well to new examples:

A runtime warning occurs when NumPy’s exp function is called with an argument whose result is too large for it to represent.

Hello Daniel @Daniel_Blossey,

How are you? Sorry for getting back late, as it was the Lunar New Year and I had not been active here for a few days.

It’s great that you can produce that plot. As for the warning, it’s likely that some z is so large and negative that \exp(-z) overflows, but it shouldn’t affect the final result because \frac{1}{1+\exp(-z)} still gives us a 0, which is the expected behavior.
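
A minimal check of this behavior (plain NumPy, independent of the lab’s helper functions):

import numpy as np

z = np.array([-1000.0, 0.0, 1000.0])

# For z = -1000, np.exp(-z) overflows to inf and NumPy emits a RuntimeWarning,
# but 1 / (1 + inf) still evaluates to the expected 0.
sigmoid = 1.0 / (1.0 + np.exp(-z))

print(sigmoid)   # [0.  0.5 1. ]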

Raymond