Questions About Max Pooling

Dear DeepLearning Friends,

I have a question regarding max pooling in CNNs. Looking at it from a high level, max pooling only works if the feature detector picks up the right patterns in the first step. Do I understand that correctly?

The reason I ask is this: what if the original picture is rotated 60 degrees, or scaled disproportionately? If the feature detector does not make a corresponding adjustment, it won’t produce the right feature map, and the max pooling will be working with incorrect data from the beginning. If I understand correctly, that would mean a CNN model usually has another layer that adjusts the feature detectors to match the images (right angle, right proportion, etc.) in the training set.

I am a designer who explores ML models from a design perspective. Your insights will be super valuable to me. I look forward to your response! Thank you!

I’m not completely sure I understand your question, but here are some thoughts:

I think the key point here is that the “feature detectors” are the conv layers, and the whole point is that those are trainable. That’s what happens during training: we use back propagation to push the weight values in all the filters so that the feature detection actually works with whatever your data is and whatever you are trying to identify, as defined by your labels and your loss function. The typical pattern in ConvNet architectures is a number of conv layers followed by a pooling layer, with that pattern typically repeated. Note that the pooling layers (either max or average) do not have any trainable parameters: they are just a way to do “data reduction”. But they still play a role in back propagation, because the gradients are passed back through the pooling layers as well. So the pooling layers affect the results of the training even though they are not themselves modified by it. If you take DLS Course 4, you’ll get to see how back propagation works with the pooling layers in the “Building a ConvNet Step by Step” assignment in Week 1.
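To make the point about pooling layers concrete, here is a minimal sketch in plain Python (not the code from the course assignment). The function names `max_pool_forward` and `max_pool_backward` and the toy numbers are made up for illustration. It assumes a single 2D feature map and a non-overlapping 2x2 window. Notice that neither function has any weights to update, yet the backward pass still routes each gradient to the input position that "won" the max.

```python
def max_pool_forward(x, size=2):
    """Max-pool a 2D list using non-overlapping size x size windows.

    Returns the pooled map plus, for every output cell, the (row, col)
    of the winning input value, which the backward pass needs.
    """
    rows, cols = len(x) // size, len(x[0]) // size
    out = [[0.0] * cols for _ in range(rows)]
    argmax = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                (x[i * size + di][j * size + dj], (i * size + di, j * size + dj))
                for di in range(size)
                for dj in range(size)
            ]
            # On ties, max() keeps the first candidate it sees.
            out[i][j], argmax[i][j] = max(window, key=lambda t: t[0])
    return out, argmax


def max_pool_backward(grad_out, argmax, in_shape):
    """Send each upstream gradient back to the input that produced the max."""
    grad_in = [[0.0] * in_shape[1] for _ in range(in_shape[0])]
    for i, row in enumerate(grad_out):
        for j, g in enumerate(row):
            r, c = argmax[i][j]
            grad_in[r][c] += g  # every other position in the window gets 0
    return grad_in


# Toy feature map, standing in for the output of a conv layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled, argmax = max_pool_forward(feature_map)
print(pooled)  # [[4, 2], [2, 8]]

# Pretend these gradients arrived from the layers after the pooling layer.
grad_out = [[1.0, 2.0], [3.0, 4.0]]
grad_in = max_pool_backward(grad_out, argmax, (4, 4))
for row in grad_in:
    print(row)
```

The gradients in `grad_in` are what the preceding conv layer would receive, so the pooling layer shapes how the filters get updated even though it has nothing of its own to train.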

Hi Paul,

Thank you so much for the great answer!

My question could actually be narrowed down to this: if the source images are distorted, disproportionately scaled, or rotated, should we modify or add feature detectors so that the feature map passed to the next step, max pooling, is reasonably correct?

My concern is that even though back propagation updates the weights, the pooling layers are not modified by the training. If the source data (the feature maps fed to max pooling) is off, can the model still reach the right judgment through training? Or are there other layers that add or modify feature detectors accordingly (such as rotating the feature detector by the same angle as the original image, or scaling it disproportionately to match the original image) to generate relatively accurate feature maps?

I looked for the course but couldn’t access it. Is there a place to purchase it or get access to it?

Again, thank you so much for the great insights! I look forward to more of your insights!

Sincerely,

Avery Wang

I found this on YouTube: https://www.youtube.com/watch?v=r0BT1Gc2Hcg
I think it addresses my concern: with feature detectors at different angles, as long as there are enough of them, the model can give the answer we want.
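The idea of having feature detectors at several angles can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the filters are written by hand (in a real CNN they would be learned during training), the rotation is a clean 90 degrees rather than an arbitrary angle, and the names `correlate_valid`, `rotate90`, and `strongest_response` are invented for this example.

```python
def correlate_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding."""
    k = len(kernel)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def rotate90(grid):
    """Rotate a square 2D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def strongest_response(feature_map):
    """Largest absolute activation anywhere in the feature map."""
    return max(abs(v) for row in feature_map for v in row)


# Hand-written detector for vertical edges, and its rotated copy,
# which responds to horizontal edges instead.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]
horizontal_edge = rotate90(vertical_edge)

# A 5x5 image with a horizontal edge (dark on top, bright below),
# and the same image rotated so that the edge becomes vertical.
image = [[0] * 5, [0] * 5, [1] * 5, [1] * 5, [1] * 5]
rotated_image = rotate90(image)

for name, img in [("original", image), ("rotated", rotated_image)]:
    v = strongest_response(correlate_valid(img, vertical_edge))
    h = strongest_response(correlate_valid(img, horizontal_edge))
    print(name, "vertical filter:", v, "horizontal filter:", h)
# original vertical filter: 0 horizontal filter: 3
# rotated vertical filter: 3 horizontal filter: 0
```

A single filter misses the edge in one of the two orientations, but with both filters available at least one of them fires in each case, and the later layers can work with whichever feature map carries the signal.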

The pooling layers themselves are not modified by the training, but the point is that all the conv layers are modified by the training and that the training incorporates the behavior of the pooling layers, as I described above. Of course there is no guarantee that you have selected the right network architecture to solve a given problem. If you pick an architecture and then try training it and the prediction accuracy is not sufficient, you may need to add more layers or modify the architecture in some other way. Unfortunately there is no magic “silver bullet” network architecture that is guaranteed to work in all cases.

The DLS sequence of courses is available from DeepLearning.AI on Coursera. If you search Coursera for “Deep Learning Specialization”, you should be able to find it and sign up for it.

Dear Paul,

Got it! Thank you so much! I think I understand it more clearly now. I appreciate it a lot! I will definitely look at the courses on Coursera. If I have other questions, I hope I can get more expert insights from you!

Thank you so much again!

Sincerely,

Avery Wang
