The image cropping code extends images when necessary with all-zero (black) pixels. Might a classifier using these cropped and extended images work better if the extensions used gray rather than black pixels? Gray ought to be statistically closer to average image pixel values.
Hi @jlammens,
Thank you for the question. This is indeed an interesting one.
My gut feeling is that it would probably make little difference either way, as the neural network would simply learn to disregard the padding. That said, be careful to use the same background in the training data as in the test data: a type of background the network was not trained on might easily confuse it.
I was curious about your question and did a bit of a search through the literature. I couldn't find a direct comparison of what you are asking (though I assume that if gray were better, people would already be using it), but I found this one, which might be interesting:
They compared rescaling the image to padding it. Their main conclusion regarding padding is as follows:
“Our study showed that zero-padding had no effect on the classification accuracy but considerably reduced the training time. The reason is that neighboring zero input units (pixels) will not activate their corresponding convolutional unit in the next layer.”
So if their claim is correct, black padding doesn't make results worse, but it does make training faster. And speed is in many cases a very important factor. Of course, every case is slightly different, and in our case we are actually doing both rescaling and padding, which complicates the theoretical picture a bit.
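The quoted claim about zero inputs not activating downstream units can be checked directly with a tiny NumPy sketch. The example below is hypothetical (a naive, bias-free "valid" convolution written by hand, not the course's actual model): windows whose receptive field lies entirely in the zero-padded region produce exactly zero pre-activations, so with ReLU and no bias those units stay inactive.

```python
import numpy as np

np.random.seed(0)

# Hypothetical 1-channel image: left half is content, right half is zero padding.
img = np.zeros((6, 6), dtype=np.float32)
img[:, :3] = np.random.rand(6, 3).astype(np.float32)

kernel = np.random.rand(3, 3).astype(np.float32)

# Naive "valid" convolution with no bias, just to illustrate the point.
out = np.zeros((4, 4), dtype=np.float32)
for i in range(4):
    for j in range(4):
        out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)

# The last output column's receptive field (input columns 3-5) is all zeros,
# so its pre-activation is exactly 0; the first column sees real content.
print(out[:, -1])  # all zeros
print(out[:, 0])   # nonzero values
```

With gray (nonzero) padding, by contrast, every window would produce a nonzero pre-activation, which is consistent with the paper's point that zero padding can reduce computation-relevant activity during training.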
If you are curious and have the capacity to dig deeper into this issue, you can try it yourself: pad the data with the mean value, retrain the network, and observe the results. In light of the article above, you could also try rescaling differently (for example, no upscaling, only downsizing when necessary). People have also tried fake backgrounds, removing backgrounds, etc. The possibilities are many.
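If you want to try the mean-value experiment, a minimal padding helper might look like the sketch below. The function name, signature, and centering choice are all hypothetical, for illustration only; it pads an HxWxC image to a square canvas, filling either with black or with the image's per-channel mean (the "gray" alternative discussed above).

```python
import numpy as np

def pad_to_square(img, mode="zeros"):
    """Pad an HxWxC image to a square canvas (hypothetical helper).

    mode="zeros" pads with black; mode="mean" pads with the image's
    per-channel mean colour.
    """
    h, w, c = img.shape
    side = max(h, w)
    if mode == "mean":
        fill = img.mean(axis=(0, 1))           # per-channel mean colour
    else:
        fill = np.zeros(c, dtype=img.dtype)    # black padding
    canvas = np.broadcast_to(fill, (side, side, c)).astype(img.dtype).copy()
    # Center the original image on the canvas.
    top = (side - h) // 2
    left = (side - w) // 2
    canvas[top:top + h, left:left + w] = img
    return canvas

# Example: a 60x100 image becomes a 100x100 canvas with mean-colour borders.
img = np.random.rand(60, 100, 3).astype(np.float32)
padded = pad_to_square(img, mode="mean")
print(padded.shape)  # (100, 100, 3)
```

Swapping `mode` between `"zeros"` and `"mean"` in the data pipeline, then retraining, would give a direct apples-to-apples comparison for your question.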
All these small nuances are what makes this field so interesting (and not yet fully explored and understood), and there are always countless possibilities for improving the algorithms. Some of them may do wonders; others may make things worse. It is up to us to try.
Interesting, thanks for your reply! Zero values (black RGB pixels) indeed imply no activation for the corresponding input ‘neurons’. For what it’s worth, this is actually the inverse of what happens in human retinas, where photoreceptors use ‘negative’ or ‘inverted’ coding with activation being inversely proportional to amount of received light…
That’s an interesting fact! Thanks for sharing.