Questions on Exercises 7 & 8 of the Week 3 Programming Assignment

Ray_L · April 2, 2024, 3:18am

Hi there,
I’m a bit confused about the learning rate in the update_parameters function. The default value is set to 1.2, which looks relatively high. Isn’t it supposed to be a smaller value, such as 0.1 or 0.01?

I’m also facing a divergence problem in Exercise 8, where my model does not converge.

I tried a few smaller alphas and got convergent results, but they did not match the expected output.

It would be much appreciated if someone could help me with this problem.

TMosh · April 2, 2024, 3:49am

Not necessarily. It depends on the magnitude of the features.

If your solution doesn’t converge, it could be any of several issues (general topics, not necessarily applicable to this assignment):

Your code for the gradients isn’t correct.
The learning rate is too high.
The features need to be normalized.
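
As a rough illustration of why the same learning rate can be fine for one dataset and too high for another, here is a minimal sketch using a toy one-weight linear model (not the assignment's network). The function name run_gd and the data are made up for this example.

```python
import numpy as np

def run_gd(x, y, learning_rate, steps=50):
    """Plain gradient descent on the cost J(w) = mean((w*x - y)^2) / 2."""
    w = 0.0
    for _ in range(steps):
        grad = np.mean((w * x - y) * x)  # dJ/dw
        w -= learning_rate * grad
    return w

# Toy data (hypothetical): the true weight is 3 in both cases.
x_small = np.linspace(-1.0, 1.0, 21)   # features of magnitude ~1
y_small = 3.0 * x_small
x_large = 10.0 * x_small               # same data, features 10x larger
y_large = 3.0 * x_large

w_small = run_gd(x_small, y_small, learning_rate=1.2)           # converges
w_large = run_gd(x_large, y_large, learning_rate=1.2)           # blows up
w_large_small_lr = run_gd(x_large, y_large, learning_rate=0.01) # converges

print(w_small, w_large, w_large_small_lr)
```

With small features, 1.2 converges to the true weight. With the features scaled up by 10, the same 1.2 diverges, and either a smaller learning rate or normalized features is needed.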

For your code to work correctly, you only need to follow the instructions in the notebook.

You don’t need to do anything inventive or surprising that isn’t mentioned in the notebook instructions.

Ray_L · April 2, 2024, 4:43am

Hi Tom,

Thanks for the clarification of the learning rate.

Regarding the divergence problem, I did follow the instructions in the notebook. Before Exercise 8, everything went well and all the results matched, meaning all the previous functions passed their tests, but when they were combined in the model, the problem emerged. That doesn’t add up, because if any function were wrongly defined, I should have seen it fail in its corresponding test function.

Any idea what else could possibly cause the problem? Based on your advice, the learning rate was the default (so not the problem) and the data was loaded directly (so the features should be fine), so might the cause be the gradient function?

TMosh · April 2, 2024, 5:22am

Note that passing the unit tests in the notebook does not prove your code is perfect. The unit tests only check a few specific conditions.
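
One way to see how a test can miss a bug is to compare an analytic gradient against a finite-difference estimate on more than one input. The sketch below uses a toy linear model, not the assignment's code. All function names are hypothetical, and the bug (a missing 1/m factor) is planted on purpose.

```python
import numpy as np

def cost(w, X, Y):
    """Mean squared error cost for a linear model; X has shape (n, m)."""
    m = X.shape[1]
    return np.sum((w @ X - Y) ** 2) / (2 * m)

def grad_buggy(w, X, Y):
    """Analytic gradient with a deliberate bug: the 1/m factor is missing."""
    return (w @ X - Y) @ X.T

def numerical_grad(f, w, eps=1e-6):
    """Central-difference approximation of the gradient of f at w."""
    g = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        g[i] = (f(w + step) - f(w - step)) / (2 * eps)
    return g

def relative_error(a, b):
    return np.linalg.norm(a - b) / (np.linalg.norm(a) + np.linalg.norm(b))

rng = np.random.default_rng(0)
w = rng.normal(size=3)
X1, Y1 = rng.normal(size=(3, 1)), rng.normal(size=1)   # a test case with m = 1
X5, Y5 = rng.normal(size=(3, 5)), rng.normal(size=5)   # a case with m = 5

err_one = relative_error(grad_buggy(w, X1, Y1),
                         numerical_grad(lambda v: cost(v, X1, Y1), w))
err_five = relative_error(grad_buggy(w, X5, Y5),
                          numerical_grad(lambda v: cost(v, X5, Y5), w))
print(err_one, err_five)
```

With a single example (m = 1) the missing factor makes no difference, so the buggy gradient looks correct. With five examples the relative error is large.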

Ray_L · April 3, 2024, 6:27am

Hi Tom,

Thank you for your answer. Yet I still could not pinpoint my issue here.

Basically, in the nn_model, these 4 functions were called one by one:

{mentor edit: code removed}

And before the iteration, the cost at i=0 matched the result, meaning forward_propagation and compute_cost should be correct. So something might be wrong with the backward_propagation function, but in that function, I just followed the 6 vectorized equations to compute dZ2, dW2, db2, dZ1, dW1 and db1.
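
For reference, the six vectorized equations are usually written as below. This assumes a tanh hidden layer, a sigmoid output with cross-entropy cost, and m training examples; the notation is the common one for this kind of network and may differ slightly from the notebook's.

```latex
\begin{aligned}
dZ^{[2]} &= A^{[2]} - Y \\
dW^{[2]} &= \frac{1}{m}\, dZ^{[2]} A^{[1]T} \\
db^{[2]} &= \frac{1}{m} \sum_{i=1}^{m} dZ^{[2](i)} \\
dZ^{[1]} &= W^{[2]T} dZ^{[2]} \odot \left(1 - \left(A^{[1]}\right)^{2}\right) \\
dW^{[1]} &= \frac{1}{m}\, dZ^{[1]} X^{T} \\
db^{[1]} &= \frac{1}{m} \sum_{i=1}^{m} dZ^{[1](i)}
\end{aligned}
```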

In the unit tests, backward_propagation_test_case first set the input variables, and these variables went into my backward_propagation function to generate the output, which matched the expected results. Next, backward_propagation_test initialized a different set of input variables, and the results were correct again. So backward_propagation was double-checked and passed, yet the model still failed to converge in the model test.

Now, I’ve no idea where the problem could possibly be…

Ray_L · April 6, 2024, 12:43pm

The problem turned out to be a typo of “grads”…
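
The thread does not say exactly what the typo was. As one hypothetical way a misspelled variable name can break training without raising an error, here is a toy sketch in which the loop stores fresh gradients under the wrong name and keeps updating with stale ones. The functions and the cost are made up and are not the assignment's code.

```python
def gradient(w):
    """Gradient of the toy cost J(w) = (w - 3)^2, returned as a dict."""
    return {"dw": 2.0 * (w - 3.0)}

def update(w, grads, learning_rate=0.1):
    """One gradient descent step."""
    return w - learning_rate * grads["dw"]

def train_with_typo(steps=100):
    w = 0.0
    grads = gradient(w)         # computed once before the loop
    for _ in range(steps):
        grad = gradient(w)      # typo: fresh gradients stored under the wrong name
        w = update(w, grads)    # no error raised: the stale `grads` is used every step
    return w

def train_fixed(steps=100):
    w = 0.0
    for _ in range(steps):
        grads = gradient(w)     # fresh gradients, correct name
        w = update(w, grads)
    return w

print(train_with_typo(), train_fixed())
```

Each function on its own is correct and would pass a standalone test. Only the loop that combines them is wrong: the fixed version settles near 3, while the typo version keeps stepping in the same direction and never converges.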
