Week 3 Programming Assignment Exercise 3 Error

prxkhxr13 · June 16, 2021, 11:22am

Inside the function initialize_parameters(), I have initialized W1, b1, W2, b2 correctly. Still, it is giving me an error.

Please clarify.
Thanks!

nguyenminhquan · June 16, 2021, 12:50pm

I have the same issue. I noticed that the rand function from numpy.random only gives random numbers between 0 and 1, so if scaled by 0.01, it can only give numbers between 0 and 0.01. But in the “Expected results”, we can observe negative values (e.g. -0.0041675) and values greater than 0.01 (e.g. 0.01640271). So I suspect there is a technical issue here. Probably the range is set differently?

Nikitha · June 16, 2021, 1:12pm

Try randn:
W1 = np.random.randn(n_h, n_x) * 0.01

paulinpaloalto · June 16, 2021, 1:27pm

Yes, @Nikitha has the answer. The instructions are quite clear on this: they literally wrote out the correct code for you using “randn”. If you look up the two functions, you’ll find that “rand” samples from the Uniform distribution on [0, 1). “randn” samples from a Normal (Gaussian) distribution with mean 0 and standard deviation 1, so it gives both positive and negative values, with absolute value mostly < 3. So using a different distribution gives you different values.
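If you want to see the difference for yourself, here is a minimal sketch. The seed and the layer sizes n_h and n_x are arbitrary values I picked for illustration, not the ones the notebook uses, so the numbers will not match the “Expected results”.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

n_h, n_x = 4, 2  # hypothetical layer sizes, chosen for illustration

W1_rand = np.random.rand(n_h, n_x) * 0.01    # uniform on [0, 0.01)
W1_randn = np.random.randn(n_h, n_x) * 0.01  # Gaussian, mean 0, std 0.01

# Larger samples make the difference in range easy to see.
big_rand = np.random.rand(10000) * 0.01
big_randn = np.random.randn(10000) * 0.01

print("rand  min/max:", big_rand.min(), big_rand.max())
print("randn min/max:", big_randn.min(), big_randn.max())
```

The rand values never leave the interval [0, 0.01), while the randn values fall on both sides of 0 and regularly exceed 0.01 in absolute value.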

annena · October 11, 2021, 3:50pm

Regarding the code to initialize the weight matrices, I hope I did not miss anything, but is there any particular reason why we need to multiply the np.random output with 0.01?

paulinpaloalto · October 11, 2021, 5:31pm

Yes! It turns out that there is some advantage to starting with relatively small values of the initial weights. If you use larger values, you can have problems with “saturating” the values of the sigmoid function so that they come out to be exactly 1 or exactly 0. Of course mathematically they are never exactly 0 or 1, but we are dealing with the pathetic limitations of the finite floating point representations here. If you get exactly 1 as the \hat{y} value, then the log(1 - \hat{y}) term in the cost takes the logarithm of 0, and you get Inf or NaN as the cost value.
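Here is a small sketch of that saturation effect in float64. The sigmoid and cross_entropy helpers and the input values 0.05 and 40.0 are my own illustrative choices, not code from the assignment; the point is only that a large enough input rounds the sigmoid to exactly 1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(a, y):
    # Per-sample cost: -(y * log(a) + (1 - y) * log(1 - a))
    return -(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

a_small = sigmoid(np.float64(0.05))  # comfortably inside (0, 1)
a_large = sigmoid(np.float64(40.0))  # rounds to exactly 1.0 in float64

# Silence the divide-by-zero and invalid-value warnings for the demo.
with np.errstate(divide="ignore", invalid="ignore"):
    cost_ok = cross_entropy(a_small, 0.0)   # finite
    cost_inf = cross_entropy(a_large, 0.0)  # -log(0) gives inf
    cost_nan = cross_entropy(a_large, 1.0)  # 0 * log(0) gives nan

print(a_small, a_large)
print(cost_ok, cost_inf, cost_nan)
```

With label 0 the cost blows up to Inf, and with label 1 the product 0 * (-Inf) produces NaN, which is why you can see either one.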

I think Prof Ng must say something about that in the lectures here in Course 1, but I forget exactly what he says on that point. Of course I’m sure you picked up on the fact that we can’t just use 0 as the initial values, because we need to “break symmetry”. Prof Ng does mention that in the lectures, but doesn’t really prove it. Here’s a thread which discusses why symmetry breaking is required in more detail.
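To make the symmetry point concrete, here is a rough sketch with a tiny 2-layer network (tanh hidden layer, sigmoid output, biases left at zero). The helper hidden_weight_gradient, the data, and the sizes are all made up for illustration. If every hidden unit starts with the same weights, every row of the gradient for W1 is the same, so the units can never become different from each other.

```python
import numpy as np

def hidden_weight_gradient(W1, W2, X, Y):
    """Gradient of the cross-entropy cost w.r.t. W1 (biases fixed at zero)."""
    m = X.shape[1]
    A1 = np.tanh(W1 @ X)                    # hidden activations
    A2 = 1.0 / (1.0 + np.exp(-(W2 @ A1)))   # sigmoid output
    dZ2 = A2 - Y
    dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)    # backprop through tanh
    return dZ1 @ X.T / m

np.random.seed(1)               # arbitrary seed for repeatability
X = np.random.randn(2, 5)       # 2 features, 5 made-up samples
Y = np.array([[0, 1, 0, 1, 1]])
n_h = 4                         # hypothetical hidden layer size

g_zero = hidden_weight_gradient(np.zeros((n_h, 2)), np.zeros((1, n_h)), X, Y)
g_const = hidden_weight_gradient(np.full((n_h, 2), 0.5), np.full((1, n_h), 0.5), X, Y)
g_rand = hidden_weight_gradient(np.random.randn(n_h, 2) * 0.01,
                                np.random.randn(1, n_h) * 0.01, X, Y)

print("zero init, gradient all zero: ", np.allclose(g_zero, 0.0))
print("const init, rows identical:   ", np.allclose(g_const, g_const[0]))
print("random init, rows identical:  ", np.allclose(g_rand, g_rand[0]))
```

With zero initialization the gradient for W1 is zero in this setup, and with any constant initialization all hidden units receive the same update. Only the random initialization gives each unit its own gradient.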

It also turns out that there are more sophisticated ways to do initialization than just multiplying by 0.01. We will learn about techniques like Xavier and He Initialization in Course 2 of this series, so please “hold that thought” and stay tuned for Course 2.
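As a preview, here is a minimal sketch of what those schemes look like. The function init_weights is a hypothetical helper of mine, and the scale factors are common textbook forms (He: sqrt(2 / n_in), one Xavier variant: sqrt(1 / n_in)); the course notebooks may phrase them differently.

```python
import numpy as np

def init_weights(n_out, n_in, method="he"):
    """Gaussian weights scaled by the fan-in instead of a fixed 0.01."""
    if method == "he":
        scale = np.sqrt(2.0 / n_in)   # commonly paired with ReLU
    elif method == "xavier":
        scale = np.sqrt(1.0 / n_in)   # one common variant, often used with tanh
    else:
        raise ValueError(f"unknown method: {method}")
    return np.random.randn(n_out, n_in) * scale

np.random.seed(3)  # arbitrary seed for repeatability
W_he = init_weights(500, 400, method="he")
W_xavier = init_weights(500, 400, method="xavier")

print("He std:    ", W_he.std(), "target", np.sqrt(2.0 / 400))
print("Xavier std:", W_xavier.std(), "target", np.sqrt(1.0 / 400))
```

The idea is the same as multiplying by 0.01, except that the scale adapts to the number of inputs feeding each layer.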
