Week 4: What are ConvNets learning? - Understanding the patches

Hi :slight_smile:

I’m referring to the second video of “Neural Style Transfer” (Week 4). I didn’t quite understand how these patches (at around 1:05) were generated.

  • What is a “hidden unit” here? One filter?
  • And what are the image patches? A portion of the input image?
  • What exactly is plotted?

This is what I understand:

You go through your input images, and for each image you go through every image patch the filter is applied to. You pick the 9 input image patches (several may come from the same image) that yield the largest values of g(Wx+b) for a certain filter (in the video, the filter is called a “hidden unit”), where g is the activation function of that layer. What is drawn is those input image patches, the ones that yielded the highest g(Wx+b).

Am I correct?


Hi Raz,

Sorry for the late response, but I find your post interesting. I would say that all the answers are in the scientific paper referenced in the video:


I did not read the paper, but my guess, and the way I understood the explanation, is aligned with yours. That is, the image “patches” are the parts (regions of pixels) of the input image that maximally activate the corresponding unit. Each unit is thus “learning” to respond to a specific image feature, for instance an edge.
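
To make that concrete, here is a minimal sketch of the search in NumPy. It is only an illustration of the idea, not the method from the paper or the video: the function name `top_activating_patches` is made up, g is assumed to be ReLU, and the unit is assumed to sit in the first conv layer, so the patch is exactly the f×f window the filter sees (for deeper layers the patch would be the unit's larger receptive field in the input image).

```python
import numpy as np

def top_activating_patches(images, W, b, k=9):
    """Find the k input patches with the largest g(Wx + b) for one unit.

    Assumptions (hypothetical, for illustration only):
      images: array of shape (N, H, W_img, C)
      W:      one first-layer filter of shape (f, f, C)
      g:      ReLU, stride 1, no padding
    """
    f = W.shape[0]
    scored = []  # entries are (activation, image index, row, col)
    for n, img in enumerate(images):
        for i in range(img.shape[0] - f + 1):
            for j in range(img.shape[1] - f + 1):
                x = img[i:i + f, j:j + f, :]             # one image patch
                a = max(0.0, float(np.sum(W * x) + b))   # g(Wx + b) with g = ReLU
                scored.append((a, n, i, j))
    # Sort all positions by activation, largest first, and keep the top k.
    scored.sort(key=lambda t: t[0], reverse=True)
    top = scored[:k]
    # Cut the corresponding patches out of the original images for plotting.
    patches = [images[n][i:i + f, j:j + f, :] for _, n, i, j in top]
    return top, patches
```

Plotting the returned `patches` in a 3×3 grid would give one block of 9 patches like those in the video, and several of the 9 can indeed come from the same image, as you suspected.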

Happy learning


Hi, Raz,

Great to have you in our community!

In addition to what Rosa has provided in answer to your query, here’s a link that explains things in more depth.