Questions on Week 3 "Neural Network Overview"

Hello everyone, I would like to ask a question about the “Neural Network Overview” lecture discussed in Week 3. In the picture below, I do not understand why Z^[2] is computed by multiplying W^[2] with a^[1] rather than with X. Also, is there a reason why we do not take the transpose of the W matrices? Thank you for your patience.

We don’t need a transpose because of the way Prof Ng has defined the W matrices. That is explained in this thread.

The point about the input values is that it depends on which layer you are talking about. In layer 1, the input is X. But the whole point of neural networks is that the input to any layer after the first is the output of the previous layer, so the input to layer 2 is the A value from layer 1.
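To make both answers concrete, here is a minimal NumPy sketch of forward propagation for a 2-layer network. The shapes and layer sizes are made up for illustration; the key points are that W^[l] is defined with shape (n^[l], n^[l-1]), so no transpose is needed in the matrix product, and that layer 2 consumes A1 (the output of layer 1), not X:

```python
import numpy as np

# Toy sizes (assumptions for illustration):
# 3 input features, 4 hidden units, 1 output unit, 5 training examples.
n_x, n_1, n_2, m = 3, 4, 1, 5
rng = np.random.default_rng(0)

X  = rng.standard_normal((n_x, m))    # inputs: one column per example
W1 = rng.standard_normal((n_1, n_x))  # each ROW is already a "transposed" weight vector
b1 = np.zeros((n_1, 1))
W2 = rng.standard_normal((n_2, n_1))
b2 = np.zeros((n_2, 1))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Layer 1 takes X as its input...
Z1 = W1 @ X + b1          # shape (n_1, m) -- no transpose needed
A1 = np.tanh(Z1)

# ...but layer 2 takes A1, the output of layer 1, not X.
Z2 = W2 @ A1 + b2         # shape (n_2, m)
A2 = sigmoid(Z2)
```

Because each row of W1 plays the role of one neuron's w^T, the products `W1 @ X` and `W2 @ A1` work directly; with the alternative convention where columns hold the weight vectors, you would write `W1.T @ X` instead.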

Also note that you filed this under DLS Course 4 ConvNets, so I moved the thread to DLS Course 1.

Just to add to what Paul said (though this might not be 100% technically correct), the way I like to think of it is: you haven't totally 'lost' the original data. Rather, each layer trains on the previous layer's activations because the network is trying to pull out the features that have the most effect, i.e. the ones that will ultimately minimize your cost during classification/evaluation/backprop, and it is this distilled subset of information that is 'percolating down' through your network.

Anything less and you'd just be doing a 'straight up' linear model (i.e. w^T x + b), which is ordinary regression.