The function initialize_parameters_deep takes layer_dims as its input.
For the example, layer_dims is [5, 4, 3], so L = 3.
Therefore the function should create and initialize W1, W2, W3 and b1, b2, b3.
But obviously the for loop range(1, L) only runs twice, so it only creates W1, W2 and b1, b2.
Why doesn’t it create W3 and b3 like the comments at the beginning say?
What am I missing?
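For context, here is a minimal sketch of what the function in question presumably looks like (assuming the standard numpy version from the assignment; the 0.01 scaling factor is just illustrative):

```python
import numpy as np

def initialize_parameters_deep(layer_dims):
    """layer_dims -- list of layer sizes, starting with the input size."""
    parameters = {}
    L = len(layer_dims)  # for [5, 4, 3] this gives L = 3

    for l in range(1, L):  # l takes the values 1, ..., L-1
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))

    return parameters
```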
If I understand your question correctly:
- The weight matrices connect the adjacent pairs of layers.
- So if you have three layers (input, hidden, output), you only have two weight matrices.
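To make that concrete, running the sketch from the first post on your example array would give:

```python
parameters = initialize_parameters_deep([5, 4, 3])

print(sorted(parameters.keys()))  # ['W1', 'W2', 'b1', 'b2']
print(parameters['W1'].shape)     # (4, 5): maps the 5 inputs to the 4 hidden units
print(parameters['W2'].shape)     # (3, 4): maps the 4 hidden units to the 3 outputs
```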
Thank you for the quick reply. Here is my confusion:
- In the videos, the layers (input, hidden, output) would be numbered input = 0, hidden = 1, output = 2, so a network with input, hidden, and output layers would have L = 2 (see “Deep L-layer neural network” in the Deep Neural Network notation).
- In the example in the assignment with layer_dims = [5, 4, 3], L is 3, and the comments in the function initialize_parameters_deep say that it will create parameters W1, b1, …, WL, bL. Perhaps the comments should instead say that it will create parameters W1, b1, …, W(L-1), b(L-1). So I guess it’s just a little inconsistency in how the layers are numbered. Is this correct?
Sorry for being picky about details, but it is important for my understanding.
The point is that the “input” is not considered a “layer” in Prof Ng’s terminology, but of course you still need to know the number of elements in the input. The first W and b values are the ones that map from the input to the output of the first layer. So the number of elements in the layer_dims array is one greater than the actual number of layers. In the example you gave, there are 3 elements in the array, which means there is one hidden layer and one output layer in the network being defined by that array.
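A quick sanity check in code, using the example array:

```python
layer_dims = [5, 4, 3]
num_layers = len(layer_dims) - 1  # layer_dims[0] is the input size, not a layer
print(num_layers)                 # 2: one hidden layer (4 units) + one output layer (3 units)
```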
Thank you. This has been resolved.