Week 3 Programming assignment section 7

I submitted the programming assignment for week 3 and the autograder gave me 100%, so I decided to go back and play with the NN on the other datasets in section 7.

I noticed that my outputs are non-deterministic. For example, when I set n_h = 5 and train a model on the gaussian_quantiles dataset, the first execution of this block gives me the following costs:

Cost after iteration 0: 0.693124
Cost after iteration 1000: 0.068776
Cost after iteration 2000: 0.038353
Cost after iteration 3000: 0.031598
Cost after iteration 4000: 0.027778
Cost after iteration 5000: 0.025106
Cost after iteration 6000: 0.023074
Cost after iteration 7000: 0.021457
Cost after iteration 8000: 0.020128
Cost after iteration 9000: 0.019009

and the second execution (with no changes to the code) outputs different costs:

Cost after iteration 0: 0.693122
Cost after iteration 1000: 0.080631
Cost after iteration 2000: 0.058538
Cost after iteration 3000: 0.039226
Cost after iteration 4000: 0.031690
Cost after iteration 5000: 0.028178
Cost after iteration 6000: 0.026181
Cost after iteration 7000: 0.024730
Cost after iteration 8000: nan
Cost after iteration 9000: nan

Any idea what might be the issue?

Here is my code:

# Datasets
noisy_circles, noisy_moons, blobs, gaussian_quantiles, no_structure = load_extra_datasets()

datasets = {"noisy_circles": noisy_circles,
            "noisy_moons": noisy_moons,
            "blobs": blobs,
            "gaussian_quantiles": gaussian_quantiles}

### START CODE HERE ### (choose your dataset)
dataset = "gaussian_quantiles"
visualize_orig = 0

X, Y = datasets[dataset]
X, Y = X.T, Y.reshape(1, Y.shape[0])

# make blobs binary
if dataset == "blobs":
    Y = Y%2

if visualize_orig:
    # Visualize the data
    plt.scatter(X[0, :], X[1, :], c=Y.ravel(), s=40, cmap=plt.cm.Spectral)

# set hidden layer size
hidden_nodes = 5
# Build a model with a n_h-dimensional hidden layer
parameters = nn_model(X, Y, hidden_nodes, num_iterations = 10000, print_cost=True)

# Plot the decision boundary
plot_decision_boundary(lambda x: predict(parameters, x.T), X, Y)
plt.title("Decision Boundary for hidden layer size " + str(hidden_nodes))

Hi @ldz ,

Adding a print statement right after the cost print statement in nn_model(), to show the values of A2, might give some insights.
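The nan costs are typically the cross-entropy blowing up when A2 saturates. A minimal sketch (toy arrays, not the assignment's code) of how a sigmoid output of exactly 1.0 on an example labeled 0 produces a nan cost:

```python
import numpy as np

# Toy labels and predictions: the second example has label 0 but a
# prediction that has saturated to exactly 1.0 in float arithmetic.
Y = np.array([[1.0, 0.0]])
A2 = np.array([[1.0, 1.0]])

# Cross-entropy: log(1 - A2) is log(0) = -inf for the saturated example,
# so the cost becomes nan.
with np.errstate(divide="ignore", invalid="ignore"):
    cost = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

print(cost)  # nan
```

So if the printed A2 values are hitting exactly 0.0 or 1.0 right before the nans appear, that would explain the second run's output.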

When you ran the notebook the second time, did you restart the kernel first?

If you just run the training portion of the notebook a second time, the initial weights of the NN layers may not be re-initialized. In that case, you'd just be fine-tuning the solution you got the first time, rather than starting over.
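You can see this effect with a toy 1-D "training loop" (a sketch on a simple quadratic, not the assignment's NN): calling the training function a second time without re-initializing the parameter continues from where the first run stopped.

```python
import numpy as np

def train(w, steps=100, lr=0.1):
    # Gradient descent on the quadratic (w - 3)^2, minimum at w = 3
    for _ in range(steps):
        w -= lr * 2 * (w - 3.0)
    return w

w = 10.0                # fresh "initialization"
w = train(w)            # first run: moves w close to the minimum
first_run_w = w
w = train(w)            # second run WITHOUT re-initialization:
                        # starts near the minimum and only fine-tunes
print(first_run_w, w)
```

In the notebook, restarting the kernel (or re-running the initialization cell) is what resets the weights before the second training run.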

Welcome to the wonderful world of cost functions that have local minima.

The NN cost function isn’t convex, so you could get different results each time you train the model.
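If you want reproducible runs despite the random initialization, fixing the NumPy seed right before initializing helps. A sketch, assuming the initialization uses np.random.randn as in the assignment (the shapes here are illustrative):

```python
import numpy as np

# First "initialization" with a fixed seed
np.random.seed(2)
W1_first = np.random.randn(4, 2) * 0.01

# Second "initialization" with the same seed gives identical weights,
# so training would follow the same trajectory both times.
np.random.seed(2)
W1_second = np.random.randn(4, 2) * 0.01

print(np.array_equal(W1_first, W1_second))  # True
```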


Very clear, thank you! I had not restarted the kernel; once I did, the results are much more deterministic. Thanks!
