NN Dot products - Forward Propagation Video

In the Forward Propagation video, the output of unit 1 in layer 2 is a_1^{[2]} = g(w_1^{[2]} \cdot a^{[1]} + b_1^{[2]}).

In our case, a^{[1]} contains 25 units and a^{[2]} contains 15 units.

How is the dot product between w_1^{[2]} and a^{[1]} computed?

Is w_1^{[2]} also a vector of 25 values, making W^{[2]} a matrix of 25 by 15? I'm a bit confused on this part.

I think you are talking about this slide:

w_1^{[2]} is the weight vector of ONE unit in layer 2, so it is NOT a matrix, but a vector, and if a^{[1]} has 25 units, then w_1^{[2]} is a vector of 25 weights.

if a^{[2]} has 15 units, it means you have 15 units in layer 2, which means you have w_1^{[2]}, w_2^{[2]}, w_3^{[2]}, … w_{15}^{[2]}, and every one of them is a vector of 25 weights.

w_1^{[2]} is NOT a matrix.
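Here is a minimal NumPy sketch of those shapes (the variable names are mine, purely for illustration):

```python
import numpy as np

# Illustrative shapes only: layer 1 has 25 units, layer 2 has 15 units.
a1 = np.random.rand(25)      # a^{[1]}: the output vector of layer 1
W2 = np.random.rand(25, 15)  # columns are w_1^{[2]}, ..., w_15^{[2]}
b2 = np.random.rand(15)      # one scalar bias per unit in layer 2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w_1_2 = W2[:, 0]                            # w_1^{[2]}: a vector of 25 weights
a_1_2 = sigmoid(np.dot(w_1_2, a1) + b2[0])  # scalar output of unit 1

a2 = sigmoid(a1 @ W2 + b2)                  # all 15 units at once, shape (15,)
```

Stacking the 15 weight vectors as columns is what gives the layer a 25 by 15 matrix, which is where the matrix in your question comes from; each w_1^{[2]} on its own is still just one column.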

Raymond

Hi Raymond,

in your reply above, you say "if a[2] has 15 units, it means you have 15 units in layer 2, which means you have w2_1, w2_2, w2_3…w2_25, and every one of them is a vector of 25 weights" (I have used the notation w2_1 to represent w with superscript [2] and subscript 1).
Should this not run to w2_15 rather than w2_25? Please could you explain if I have misunderstood?

In Neural Network Implementation in Python (MLS course 2, week 1), Prof Ng's discussion is very helpful, though because it uses a very simple example in which X has only a single input example of n = 2 features, I am not certain how this generalises.

Can generalisation be summarised as follows:

if X is a 2D array/matrix of m input examples each of n features,

i. for each neuron/unit in the first layer, there will be a vector/1D array of n weight (w) values, and a separate 1D vector b

ii. for the layer as a whole, these neuron/unit w vectors are combined/concatenated into W, which has dimensions n x the number of neurons/units.

iii. the output of each neuron in a layer will be a single value between 0 and 1 (after the sigmoid function),

iv. for each layer, the output (a[n]) will consist of a vector/1D array with dimensions equal to 1 x the number of neurons in the layer.

v. for layers 2 and onwards, each neuron in the layer will receive the preceding a vector, of dimensions 1 x (no. of neurons in previous layer). So in these layers, each neuron will also have a vector/1D array of weight (w) values, one per neuron in the previous layer, and a separate 1D vector b.

P.S. I’d be grateful if you could direct me to where to find out how to write with superscript/subscript text and scientific notation in this forum.

Jem

Hello Jem,

It was my typo. I have corrected it.

For each neuron, there will be a vector of n weights, and one scalar value of bias.
For the layer, there will be a matrix W of n \times n^{[1]} weights, and a vector of n^{[1]} biases. Note that n^{[1]} is the number of neurons in the layer.
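As a minimal sketch of those shapes in NumPy (dense is my own helper name, not the course's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(A_in, W, b, g=sigmoid):
    """One layer: A_in is (m, n), W is (n, n_units), b is (n_units,).
    Returns an (m, n_units) array: one row of activations per example."""
    return g(A_in @ W + b)

m, n, n1 = 4, 2, 3           # 4 examples, 2 input features, 3 units in layer 1
X  = np.random.rand(m, n)
W1 = np.random.rand(n, n1)   # column j holds the n weights of unit j
b1 = np.random.rand(n1)      # one scalar bias per unit

A1 = dense(X, W1, b1)        # shape (m, n1)
```

With m = 1 this reduces to the single-example case from the video: one row of n features in, one row of n^{[1]} activations out.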

Yes, given that, as you said, a sigmoid is used as the activation for the layer. However, a layer does not have to use sigmoid as its activation. Sigmoid is our choice but not a requirement.
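For example (again just a sketch), with the dense helper above, swapping the activation is a one-argument change:

```python
import numpy as np

def relu(z):
    # ReLU: outputs are >= 0 rather than squashed into (0, 1)
    return np.maximum(0.0, z)

# Reusing the dense() sketch above:
# A1 = dense(X, W1, b1, g=relu)
```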

Yes. And I have used n^{[l]} to denote the number of neurons in layer l.

Yes.

See my response above for how many weights there are for each neuron. Let me know if you are not sure about it.

We can write math expressions in inline mode by wrapping them in dollar signs. For example, $x_{12}$ will render as x_{12}. For more on the syntax, see this Wikipedia page and you will start seeing some familiar use cases from Section 4 on.

Cheers,
Raymond

Many thanks Raymond - that’s really helpful.

Jem

You are welcome Jem @Jem_Lane :slight_smile:

Raymond