What are deep ConvNets learning?

I didn’t fully understand what Prof Andrew means by hidden unit and hidden unit activations. What are the dimensions of the hidden unit activations?
How can we plot them as an image, and how do we get the max unit activation?
I know it is a lot, but I hope you get my point.
Thank you in advance.

All NNs have hidden layers. It is a fundamental characteristic. All units have an activation function. If the input data is images, the NN hidden layer weights can be viewed as images.

Okay, I understand that well, but my question is what is meant by one neuron or one hidden unit in a convolutional layer. Does one neuron represent one channel of the resulting volume after a convolution operation, or am I wrong?

What Prof Ng is explaining in this lecture is some really interesting work that tries to get a concrete idea of what is actually being learned by the inner hidden layers of a ConvNet. As Tom said, each neuron at each layer has an activation function applied to it. What the research here is doing is picking some individual output neurons in a given hidden layer of the network and instrumenting each one by tracking its activation output values. Then they show you which input triggered the maximum activation output value from that neuron across all the input samples that they fed through the network. In other words, the result shows what input pattern caused the strongest “reaction” from that one particular neuron. And the results of all this research seem to show that individual neurons at a given layer get trained to recognize different but distinct patterns in the input.
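
To make that tracking idea concrete, here is a minimal sketch in plain Python. It is only my reading of the description above, not the procedure from the paper: the function names (`unit_activation`, `top_inputs`), the single-channel images, the ReLU activation, and the stride of 1 are all assumptions for illustration.

```python
import heapq

def unit_activation(image, filt, bias, row, col):
    """ReLU activation of the one hidden unit at output position (row, col).

    image and filt are 2-D lists (single channel, stride 1): assumptions
    made to keep the sketch short.
    """
    f = len(filt)
    z = bias
    for i in range(f):
        for j in range(f):
            z += filt[i][j] * image[row + i][col + j]
    return max(0.0, z)

def top_inputs(images, filt, bias, row, col, k=9):
    """Return the k inputs that excite this one unit the most.

    Each record is (activation, image_index, patch), where patch is the
    part of the image that the unit actually looks at.
    """
    f = len(filt)
    records = []
    for idx, img in enumerate(images):
        act = unit_activation(img, filt, bias, row, col)
        patch = [r[col:col + f] for r in img[row:row + f]]
        records.append((act, idx, patch))
    # Largest activations first; image_index breaks ties.
    return heapq.nlargest(k, records)

# Tiny hypothetical data set: three 3 x 3 images with constant pixel values.
images = [[[v] * 3 for _ in range(3)] for v in (0, 1, 2)]
filt = [[1, 1], [1, 1]]  # hypothetical 2 x 2 filter
best = top_inputs(images, filt, 0.0, row=0, col=0, k=2)
print([(act, idx) for act, idx, _ in best])
```

With `k=9` the patches in the returned records would play the role of the 9 image patches shown per unit in the lecture.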

The other general thing to observe here is that none of this is “planned” in advance: we just start with a given network architecture, we randomly initialize all the filter and bias values at all layers and then we just run back propagation and see what happens. Because of the random initialization, we get “symmetry breaking” and different neurons just happen to learn to recognize different patterns. And if we ran the training again with different random initializations, what might happen is that the same patterns get learned by different neurons in the new training run. The same patterns are there in the input, but the way the learning takes place is affected by the initialization. Mind you, this is just my intuition: I didn’t really read the paper that Prof Ng is describing here.

I really appreciate your explanation, thanks a lot.

@paulinpaloalto
Thank you for your explanation, but I still don’t get what a hidden unit is in a ConvNet.
Does one filter mean one hidden unit?
- In the image that I attached, the first layer results in a 110 x 110 x 96 output. Does the hidden unit here mean the filter that produced one of the 96 channels, or one of the 96 channels itself?
- What is the meaning of the 9 image patches that maximized the unit output? Are they whole images or parts of an image?
I know that is a lot of confusion, but I really struggled to figure it out and failed ):

Think about how a ConvNet works: you are passing a filter of a fixed size over the image, and the positions it visits are determined by the filter size, stride value and image size. For the sake of the example, let’s suppose that the filter size is 5 x 5. So at that layer, each output value (each element of a given output channel) is the result of taking the linear combination of that filter with a particular 5 x 5 patch of the input image (across all the input channels), adding a bias term and then applying the activation function. So each of those 9 “patches” is (in this example) a 5 x 5 portion of the input image at that layer.
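
Here is a small sketch of that computation for a single output value, in plain Python. The names (`conv_output_value`, `output_size`), the ReLU activation, and the absence of padding are my assumptions for the example; a real layer may use a different activation or padding.

```python
def output_size(n, f, stride=1):
    """Number of filter positions along one dimension, with no padding."""
    return (n - f) // stride + 1

def conv_output_value(image, filt, bias, row, col, stride=1):
    """One element of one output channel.

    image: H x W x C nested lists, filt: f x f x C nested lists.
    The value depends only on the f x f patch of the input that starts
    at (row * stride, col * stride), across all C input channels.
    """
    f = len(filt)
    top, left = row * stride, col * stride
    z = bias
    for i in range(f):
        for j in range(f):
            for c in range(len(filt[i][j])):
                z += filt[i][j][c] * image[top + i][left + j][c]
    return max(0.0, z)  # ReLU, assumed here

# Hypothetical 7 x 7 image with 3 channels and a 5 x 5 x 3 filter of ones.
image = [[[1, 1, 1] for _ in range(7)] for _ in range(7)]
filt = [[[1, 1, 1] for _ in range(5)] for _ in range(5)]

print(output_size(7, 5))  # the filter fits at 3 positions per dimension
print(conv_output_value(image, filt, bias=-70.0, row=0, col=0))  # 75 - 70
```

Changing pixels outside the 5 x 5 patch that a given output value covers has no effect on that value, which is why the visualization shows patches rather than whole images.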

I like your explanation, which is so clear!
It answered what I was not able to understand: where those demonstrated images come from.

So in the lecture video, a hidden unit is actually one filter, and the demonstrated images are the portions of the input images that cause the selected filter (hidden unit) to output its maximum activation.

I hope the answer from paulinpaloalto can be added to the lecture notes to clarify where those images come from. Otherwise there may be more people like me who mistakenly think the images ARE the activation output values.

Hi, Keith.

I’m glad the information on this thread was helpful. Just one further clarification:

One hidden unit (one hidden neuron) is just one element of a particular output channel created by one filter.
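
To put numbers on that, here is a short sketch that counts the hidden units in an output volume and maps a flat unit number back to its position. The 110 x 110 x 96 shape is what I take the first-layer output mentioned earlier in the thread to be, and the function names and the channels-fastest ordering are just assumptions for illustration.

```python
def count_hidden_units(height, width, channels):
    """Every element of the output volume is one hidden unit."""
    return height * width * channels

def unit_position(flat, height, width, channels):
    """Map a flat unit number to (row, col, channel), channels varying fastest."""
    row, rest = divmod(flat, width * channels)
    col, channel = divmod(rest, channels)
    return row, col, channel

# 96 filters, each producing one 110 x 110 channel of hidden units.
print(count_hidden_units(110, 110, 96))  # 1161600
print(unit_position(96, 110, 110, 96))   # (0, 1, 0)
```

So one filter accounts for 110 x 110 units that all share the same weights, and each of those units looks at a different patch of the input.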

I will ask the course staff if there is a way to add more information to the lecture notes.

Thanks for correcting my misunderstanding. I feel relieved after reaching this post :sunny: