I can't understand where the name “hidden layer” comes from. Could you please help me understand what the statement “the things in the hidden layer are not seen in the training set” means?
The content below is from the lecture video, starting at 1:06:
So the term hidden layer refers to the fact that in the training set, the true values for these nodes in the middle are not observed. That is, you don’t see what they should be in the training set. You see what the inputs are. You see what the output should be. But the things in the hidden layer are not seen in the training set. So that kind of explains the name hidden layer.
Hi @Anbu. You can understand this terminology by considering the inputs and outputs of the layers.
The initial “layer” of the network (L = 0) has the observed features as inputs. In the binary classification network, i.e., the one in which we wish to predict whether an image is that of a cat or not, the inputs are numerical representations of each individual input image. Each number in that input vector represents the intensity of one pixel in one of the red, green, or blue channels. These are observed; they are data.
When passed into the network (i.e., when they are “fed forward”), this vector is operated on by the parameters (the weights and the bias b) and the result is passed into an activation function (e.g., ReLU, tanh) to get the “output” of the initial layer. Since the inputs are data with an ordinary human-level interpretation, they are not “hidden”; you can explain to me, and anybody else, what they mean. Notice that this is not the case for the output of that layer. Other than describing the type of mathematical transformation the inputs underwent to arrive at those values, it is difficult to ascertain the meaning of that information.
These outputs are then passed into the first layer (L = 1), where they undergo a similar mathematical transformation, and the outputs will again be “hidden” from any easily interpreted meaning, i.e., they are not data from real-world measurement. So both the inputs and the outputs of this layer lack such easy interpretation. As such, this is the first truly hidden layer. All subsequent layers preceding the final layer are similarly “hidden” in this sense. We can, if we wish, print the outputs as numbers, so in that sense they are not hidden, but we are mostly not interested in their values.
The last layer of the network (layer L) outputs a quantity that is amenable to interpretation, of course. In a classification task, it outputs the probability that the input image is that of a particular object (e.g., a cat). The output of this final layer, then, does indeed have a nice human interpretation. Accordingly, it is not a hidden layer.
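If it helps to see this concretely, here is a minimal NumPy sketch of a network with a single hidden layer. It is only an illustration under my own assumptions: the layer sizes, the random toy data, and the names X, Y, W1, b1, W2, b2, A1, A2 are hypothetical and are not taken from the lecture. The point is that the training set supplies X and Y only; the hidden activations A1 are computed by the network and there is no column of “correct” values for them anywhere in the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set (made-up data): these are the only observed values.
X = rng.normal(size=(3, 5))          # 3 input features, 5 examples
Y = rng.integers(0, 2, size=(1, 5))  # one 0/1 label per example

# Hypothetical parameters: one hidden layer with 4 units, one output unit.
W1, b1 = rng.normal(size=(4, 3)) * 0.01, np.zeros((4, 1))
W2, b2 = rng.normal(size=(1, 4)) * 0.01, np.zeros((1, 1))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(X):
    # Hidden activations: computed from X, with no target values in the data.
    A1 = np.tanh(W1 @ X + b1)
    # Output activations: these are the values compared against Y.
    A2 = sigmoid(W2 @ A1 + b2)
    return A1, A2


A1, A2 = forward(X)

# The cross-entropy loss uses only A2 and Y; A1 never appears in it directly.
loss = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

print("hidden activations shape:", A1.shape)
print("output shape:", A2.shape)
print("loss:", loss)
```

You can print A1 and look at the numbers, but nothing in the training set tells you what those numbers should have been, which is the sense of “hidden” used in the lecture.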
I hope this helps!
Thank you, sir @kenb, for putting so much effort into your answers.