W4_How do we know what was learned?

Hi, Irene.

These are great questions worth discussing! I totally agree that it seems like there is some magic going on here. I’m not claiming to be able to give complete answers to your questions, but I can suggest some resources to explore further.

For starters, note that the picture they show of how the image is unrolled is actually not how we are doing it. If you dig into the details, the method of “flattening” they give us unrolls the pixels such that the R, G and B values for a given position in the image end up adjacent in the array. Here’s a thread which digs into the details of that. If you read all the way through the thread, it later discusses how you could do the “unrolling” the other way (all the red pixels first, then all the green pixels, then all the blue pixels). The interesting thing is that the algorithm can still learn to detect the patterns in the images with either style of unrolling. But it is crucial that you are consistent in the method you use: if you mix the two styles, you get garbage and nothing works. You can actually run the experiments and prove to yourself that it works equally well either way, as long as you are consistent.
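To make the two flattening styles concrete, here is a small NumPy sketch (the 2x2 image and its values are just made-up illustration, not anything from the course assignments). The “interleaved” style keeps each pixel’s R, G and B values adjacent; the “planar” style groups all values of one channel together:

```python
import numpy as np

# Hypothetical 2x2 RGB image, shape (height, width, channels)
img = np.arange(12).reshape(2, 2, 3)

# Style 1: interleaved -- R, G, B for each pixel stay adjacent
interleaved = img.reshape(-1)                # [R0,G0,B0, R1,G1,B1, ...]

# Style 2: planar -- all R values first, then all G, then all B
planar = img.transpose(2, 0, 1).reshape(-1)  # [R0..R3, G0..G3, B0..B3]

print(interleaved)  # [ 0  1  2  3  4  5  6  7  8  9 10 11]
print(planar)       # [ 0  3  6  9  1  4  7 10  2  5  8 11]
```

Either ordering is just a fixed permutation of the input features, which is one way to see why the network can learn with both: the weights in the first layer simply get permuted to match. Mixing the two orderings between examples is what destroys that correspondence.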

It does seem surprising and counterintuitive that the unrolling does not destroy the algorithm’s ability to learn to recognize the geometric patterns in the images. At one level, we just have to believe it from the results, but it is worth thinking about whether you could construct some experiments to figure out what is happening in the internal layers of the network. In Course 4 of DLS, which covers Convolutional Neural Networks, Prof Ng will show us some really interesting work where researchers did exactly that. The lecture is called “What are Deep ConvNets Learning?” and the video is available on YouTube. (It’s in Week 4 of DLS Course 4.) Even if you haven’t yet learned about ConvNets, you will get the idea of what he’s describing and some intuition from that lecture. Of course ConvNets are more powerful than the networks we are learning about here in Course 1, because they can deal with images in their original spatial form and work by stepping smaller “filters” across and down the images. So with ConvNets, it’s a bit more intuitive to see why they can detect the same pattern at any position in an image.
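To give a flavor of the “stepping filters across the image” idea before you get to Course 4, here is a minimal sketch of a 2-D “valid” convolution (really cross-correlation, as ConvNets use it). The image and the edge-detecting filter are made-up examples, not course code:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a small filter across and down a 2-D image ("valid" mode:
    no padding, so the output is smaller than the input)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Response of the filter at position (i, j)
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A simple horizontal-difference filter fires wherever there is a
# left-to-right intensity jump, at ANY position in the image.
image = np.zeros((5, 5))
image[:, 3:] = 1.0                     # vertical edge between columns 2 and 3
edge_filter = np.array([[-1.0, 1.0]])
response = conv2d_valid(image, edge_filter)
print(response)                        # 1.0 exactly where the edge sits
```

Because the same small filter is reused at every position, the pattern detector is position-independent by construction, which is the intuition the lecture builds on.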

But even without pre-knowledge of ConvNets, the ideas in the lecture about how you could instrument the neurons in the hidden layers of a network, then feed images through and get some idea of which patterns trigger the largest response from a given neuron, are interesting. You can think about how that could be applied to the simpler fully connected networks we are studying here. I have not done any searching to find out if there are any papers about that.
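In case it helps to see how simple that kind of instrumentation can be, here is a toy sketch for a fully connected layer. Everything here (the layer sizes, the random weights, the random “images”) is hypothetical; the point is only the probing technique: record a chosen hidden neuron’s activation for each input and see which input excites it most.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny hidden layer: 16 inputs (flattened 4x4 images) -> 3 units
W = rng.standard_normal((3, 16))
b = np.zeros((3, 1))

def hidden_activations(x):
    """ReLU activations of the hidden layer for one flattened image x, shape (16, 1)."""
    return np.maximum(0, W @ x + b)

# Probe: feed a batch of images through and record, for one chosen neuron,
# which input produced the largest response.
images = rng.standard_normal((20, 16))
neuron = 1
responses = [hidden_activations(img.reshape(16, 1))[neuron, 0] for img in images]
best = int(np.argmax(responses))
print(f"Neuron {neuron} responds most strongly to image #{best}")
```

With real trained weights and real images, displaying the top-responding inputs (or patches of them) is essentially what the researchers in that lecture do for ConvNet filters.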
