There is one question in the quiz whose answer I can’t really justify. I’d like to know what I’m missing.

The question involves picking options, one of which is: “w^[4]_3 is the row vector of parameters of the fourth layer and third neuron.”

I selected this but apparently it’s wrong: it’s a column vector.

Why is this the case? We were taught that in the W matrix each row corresponds to a neuron and each column to an input.

So wouldn’t it be the case that the third neuron’s parameters form a slice across the columns, i.e. a row vector (1 x n)?

Please refer to this video again, where Prof. Ng explains that the superscript square-bracket number denotes a column vector.

A column vector is an n x 1 matrix, because it always has 1 column and some number of rows.
A row vector is a 1 x n matrix, as it has 1 row and some number of columns.
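As a quick sanity check on those shapes, here is a minimal NumPy sketch (the values are invented purely for illustration):

```python
import numpy as np

# A column vector is n x 1 (here n = 3)
col = np.array([[1.0], [2.0], [3.0]])
print(col.shape)  # (3, 1)

# A row vector is 1 x n
row = np.array([[1.0, 2.0, 3.0]])
print(row.shape)  # (1, 3)

# Transposing a column vector yields the corresponding row vector
print(col.T.shape)  # (1, 3)
```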

One needs to understand that the layer denoted by the square-bracket superscript, e.g. [4], works with column vectors; the training-example number is written with round parentheses ( ), and indexing across examples runs along a row.

So in the question you are asking about, the superscript [4] represents the layer, and the vector in question is a column vector.

Hi @Deepti_Prasad. Actually, it seems like the video after that one (“Computing a Neural Network’s Output”) mentions it in more detail. But thanks for linking that - I needed to pay more attention!

It isn’t really explained why this is the convention, but it seems like the rows of W are the transpose of the parameter vectors, which are column vectors. Thus in W, the row belonging to neuron 3 is a row vector, but it is actually the transpose of the original column vector w.

I think the confusion came from the statement Prof. Ng makes in “Computing a Neural Network’s Output”:

“The first step, think of it as the left half of this node, computes z = w^T x + b.”

But one needs to understand that the option’s w^[4]_3 describes a single neuron’s parameter vector, which is a column vector, whereas W^[l] is a matrix whose rows are the transposes of the parameter vectors.
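A minimal NumPy sketch of that relationship (the layer sizes and numbers are invented for illustration): each neuron’s parameter vector w_i is a column vector, and W stacks the transposes w_i^T as its rows, so z = Wx + b computes w_i^T x + b_i for every neuron at once.

```python
import numpy as np

# Invented parameter (column) vectors, one per neuron: each shape (4, 1)
w1 = np.array([[0.1], [0.2], [0.3], [0.4]])
w2 = np.array([[0.5], [0.6], [0.7], [0.8]])
w3 = np.array([[0.9], [1.0], [1.1], [1.2]])

# W stacks the TRANSPOSED vectors as rows: shape (3, 4)
W = np.vstack([w1.T, w2.T, w3.T])

x = np.array([[1.0], [2.0], [3.0], [4.0]])  # input column vector, shape (4, 1)
b = np.zeros((3, 1))

z = W @ x + b  # shape (3, 1): one z value per neuron

# Row 3 of W (a 1 x 4 row vector) is exactly the transpose of column vector w3
print(np.array_equal(W[2:3, :], w3.T))  # True
```

So the quiz can truthfully say both things at once: the slice of W for neuron 3 is a row vector, while the neuron’s own parameter vector w^[4]_3 is a column vector.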

If you notice, in the same question there is another option about W (one is capital W, the other small w, and the two are different here).

Yes, that makes sense. So you are saying that WX is actually computed as (W^T)X, which explains why the individual rows of W in the latter case are actually transposed column vectors?
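One can check that reading numerically. In this sketch (random values, sizes invented for illustration), stacking the w_i as columns and transposing the whole matrix gives exactly the same W as stacking the w_i^T as rows, so both views produce identical z values:

```python
import numpy as np

rng = np.random.default_rng(0)
w1, w2, w3 = (rng.standard_normal((4, 1)) for _ in range(3))
x = rng.standard_normal((4, 1))

# View 1: W built row-by-row from the transposed column vectors
W_rows = np.vstack([w1.T, w2.T, w3.T])

# View 2: stack the w_i as columns, then transpose the whole matrix
W_cols_T = np.hstack([w1, w2, w3]).T

print(np.array_equal(W_rows, W_cols_T))   # True
print(np.allclose(W_rows @ x, W_cols_T @ x))  # True
```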