W3_Assignment_Doubts

Hello,

A. For ‘Defining the NN Structure’ and layer_sizes(X, Y) function, I wrote the correct code by following the instructions (gave the correct arguments as demanded by the comments).

BUT my doubt is:

  1. Why do n_x (size of the input layer) and n_y (size of the output layer) correspond to the no. of rows in the input matrix X (input/independent variable) and the matrix Y (output/dependent variable)?

Should it not be no. of columns? i.e. 30 in this assignment?
Using the make_regression function of sklearn.datasets module we defined X & Y for 30 data points earlier.

B. For the initialize_parameters (n_x, n_y) function,

  1. Why are we multiplying the randomly generated value for the weight by 0.01?
  2. Why is the bias taken as zero? Are we going to add the required threshold to it when we actually use it for some prediction?

Hi @Debatreyo_Roy ,

For point A
If you refer back to section 1.2, it explains that an NN structure is made up of nodes: input nodes, output nodes and, in addition, hidden nodes. The input nodes form the input layer, the output nodes form the output layer and the hidden nodes form the hidden layers. Depending on how you design the network for a particular problem, you could have 2 nodes in the input layer, 1 node in the output layer and 3 nodes in 1 hidden layer. In our case we use the simplest form: 1 input node, 1 output node and no hidden nodes; the number of training examples is 30. So the input matrix X has shape (1, 30): 1 row for the single input node and 30 columns for the 30 training examples. The number of rows gives the layer size, which is why n_x and n_y are read from the rows; the 30 columns count the training examples, not nodes.
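To make the shape convention concrete, here is a minimal sketch. It assumes the notebook reshapes the data so that each column is one training example; the data here is random NumPy data standing in for the make_regression output, and the body of layer_sizes is my guess at the intended solution, not the official one.

```python
import numpy as np

m = 30  # number of training examples, as in the assignment

# Stand-in for the sklearn output: X arrives as (m, 1) and Y as (m,).
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(m, 1))
Y_raw = 3.0 * X_raw[:, 0] + rng.normal(size=m)

# Assumed reshape step: rows = nodes (features), columns = examples.
X = X_raw.reshape((1, m))
Y = Y_raw.reshape((1, m))


def layer_sizes(X, Y):
    """Return (n_x, n_y): the row counts of X and Y."""
    n_x = X.shape[0]  # size of the input layer
    n_y = Y.shape[0]  # size of the output layer
    return n_x, n_y


n_x, n_y = layer_sizes(X, Y)
print("X shape:", X.shape, "Y shape:", Y.shape)
print("n_x =", n_x, "n_y =", n_y, "examples =", X.shape[1])
```

With this layout, X.shape[1] is the 30 you were expecting, but it counts examples rather than nodes.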

For point B

  1. Here, 0.01 is a scaling factor that keeps the randomly generated initial weights small. It is not the learning rate, which is a separate parameter that determines how big a step to take in gradient descent when training the network.
  2. The bias is only initialized to zero. It is still a parameter of the network and gets updated by gradient descent together with the weights, so you do not add a threshold to it by hand later.

This lab is meant to show you what a basic network structure looks like, so the initialization is kept simple: zero is just the starting value for the bias.
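The sketch below separates the roles of the two numbers. It assumes a single-node linear model trained with plain gradient descent on a squared-error cost; the scale argument, the toy data and the learning_rate value are illustrative choices of mine, not taken from the notebook.

```python
import numpy as np


def initialize_parameters(n_x, n_y, scale=0.01):
    """Small random weights, zero bias. `scale` is a hypothetical argument."""
    W = np.random.randn(n_y, n_x) * scale  # 0.01 only shrinks the initial weights
    b = np.zeros((n_y, 1))                 # starting value, trained later
    return {"W": W, "b": b}


np.random.seed(0)
params = initialize_parameters(1, 1)
b_initial = params["b"].copy()

# Toy data generated from y = 2x + 5, shaped (1, m) like X and Y in the lab.
X = np.array([[0.0, 1.0, 2.0, 3.0]])
Y = 2.0 * X + 5.0
m = X.shape[1]

learning_rate = 0.1  # step size for gradient descent, unrelated to `scale`

for _ in range(2000):
    Y_hat = params["W"] @ X + params["b"]          # forward pass
    dZ = Y_hat - Y                                 # error per example
    dW = dZ @ X.T / m                              # gradient w.r.t. W
    db = np.sum(dZ, axis=1, keepdims=True) / m     # gradient w.r.t. b
    params["W"] -= learning_rate * dW
    params["b"] -= learning_rate * db

print("initial b:", b_initial.ravel())
print("trained W:", params["W"].ravel(), "trained b:", params["b"].ravel())
```

The bias starts at zero but moves toward the intercept of the data during training, which is why no manual threshold is needed.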

Here is a link to an article that may help.