Week 3 Exercise 8 - nn_model - Planar_data_classification_with_one_hidden_layer


Good morning everybody!
I can’t figure out why I’m getting all those NaN values and why my parameters are so huge.
I found similar topics on the forum, but their solutions don’t seem to work for me (or I’ve not been able to implement them correctly). Everything in the previous exercises works fine, and I updated the parameters like this
W1 = W1 - …
instead of
W1 -= …
It turns out that if I hard-code the learning_rate in the update_parameters function, all those NaNs disappear, but I still can’t get the correct values for W1. The only thing that came to mind is that I did something wrong with the copy.deepcopy function, but I can’t figure out what else to do other than W1 = copy.deepcopy(parameters["W1"]). Otherwise, I have no idea what the problem could be. Does anyone have any advice? Thank you very much!
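To make it concrete, here is a minimal, self-contained sketch of the pattern I mean (the shapes and the learning-rate value are just placeholders, not my actual code):

```python
import copy
import numpy as np

# Toy stand-ins just to illustrate the pattern; shapes and values are arbitrary
parameters = {"W1": np.random.randn(4, 2) * 0.01}
grads = {"dW1": np.random.randn(4, 2)}
learning_rate = 1.2

# Deep copy so the original dictionary entry is not modified in place
W1 = copy.deepcopy(parameters["W1"])

# Explicit update rather than the in-place operator
W1 = W1 - learning_rate * grads["dW1"]   # instead of: W1 -= learning_rate * grads["dW1"]
```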

From your description, it sounds like it must have something to do with the learning rate. You should not have to do anything with it: the notebook ends up just using the default value that is defined in the declaration of the update_parameters function.
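To illustrate the mechanics (the default value 1.2 below is an assumption, so check the signature in your own notebook; only W1 is shown, the other parameters are analogous), compare these two sketches. The second one silently ignores whatever learning rate is passed in, which is the classic way a hard-coded value bites you:

```python
def update_parameters(parameters, grads, learning_rate=1.2):
    # Correct pattern: the body uses the learning_rate argument, so callers
    # (and the grader's tests) control the step size via the keyword default.
    return {"W1": parameters["W1"] - learning_rate * grads["dW1"]}

def update_parameters_hardcoded(parameters, grads, learning_rate=1.2):
    # Bug pattern: a hard-coded number in the body overrides the argument,
    # so changing learning_rate (or relying on the default) has no effect.
    return {"W1": parameters["W1"] - 0.001 * grads["dW1"]}
```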

The one other thing to check is your initialize_parameters function. Are you sure that you multiplied the random initial values by 0.01 as they specified? Of course, if you did not, I would expect the tests for that function to fail.
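For comparison, the usual shape of that function is something like the sketch below (layer sizes and dictionary keys follow the assignment's convention; treat it as a rough reference, not the exact solution):

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    # Small random weights (scaled by 0.01) and zero biases.
    # Without the 0.01 scaling, tanh/sigmoid units saturate quickly and the
    # cost can blow up or turn into NaN.
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
```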

Thank you for your support. Yes, I checked the initialization function and everything seems correct.

Here is a screenshot with learning_rate=0.001, in case it helps to understand the cost function and weight behaviour. Clearly, if I use a smaller learning rate the cost decreases more slowly and stays around 0.67, and the W1 values also seem to become smaller. I don’t know if this is useful information.

But the higher-level point is that you should not need to manipulate the learning rate at all in order for this to work. So there is something else wrong that we need to figure out. I’ll contact you by DM to discuss more options.