CNN learning with some hand-crafted filters

The examples of hand-crafted filters show some simple ones, such as vertical and horizontal edge detectors. Are there advantages to building a CNN with some of these simple hand-crafted filters coded into the system, and letting the system learn complementary filters?

I guess many people have had a similar thought and I’d like to hear about some experimental results.

Hey @MarkNZed,
I believe you can find a couple of research works on initializing CNNs with known filters, in other words hand-crafted filters. For instance, a quick Google search led me to this work.

They initialized a sample CNN with Gabor filters and reported results suggesting that Gabor-filter-based initialization has an effect similar to transfer learning. You can find many other pieces of similar work by searching on Google Scholar.
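
For anyone who wants to try this, here is a minimal sketch of generating a small bank of Gabor kernels that could serve as initial first-layer weights. It uses the standard real-valued Gabor form (a Gaussian envelope times a cosine carrier). The function name `gabor_kernel` and all parameter values (kernel size, `sigma`, `lambd`, the four orientations) are my own assumptions for illustration, not taken from the linked work.

```python
import math

def gabor_kernel(size, sigma, theta, lambd, gamma=1.0, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows (size should be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the orientation theta.
            x_r = x * math.cos(theta) + y * math.sin(theta)
            y_r = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope times a cosine carrier of wavelength lambd.
            envelope = math.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * x_r / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# Hypothetical bank: four orientations, 5x5 kernels (values chosen arbitrarily).
orientations = [k * math.pi / 4 for k in range(4)]
gabor_bank = [gabor_kernel(size=5, sigma=2.0, theta=t, lambd=4.0)
              for t in orientations]
```

Whether such kernels are then kept fixed or fine-tuned is a separate choice, which is the distinction discussed in the rest of this thread.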

Also, you can find different works depending on whether you want your hand-crafted filters to stay fixed or to be fine-tuned further. I hope this helps.

Cheers,
Elemento

Hi Elemento,

Thanks for looking into this. The paper seems to replace the low-level filters rather than complement them. I suspect that approach is going to suffer from relying on humans deciding on what the filters should be. I’d like to hear of results where there is a mix of hand-crafted and learnt low-level filters. The basic idea is to have fixed hand-crafted filters and randomly initialized learnable filters.

Cheers,
Mark

Hey @MarkNZed,
What exactly do you refer to when you say “complement”? Does it mean that the hand-crafted filters are fixed or are tunable? If they are fixed, won’t they trivially “suffer from relying on humans deciding on what the filters should be”, and if they are tunable, then isn’t it just another initialization scheme?

Cheers,
Elemento

I think Andrew presents the material in this way (presenting edge detectors as an example of a low-level filter) because it gives a good intuition about how the features evolve after processing by each layer.

In practice, I don’t think that edge detection results in the lowest cost solution, so typically that’s not what a trained CNN implements.

Hi Elemento,

To complement means “a thing that contributes extra features to something else in such a way as to improve or emphasize its quality”. The hand-crafted filters would be fixed but there would also be filters that would be tunable/learnt. The idea is to mix both approaches.
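
A minimal sketch of that mix, in plain Python so it runs without any framework: two fixed Sobel edge filters sit in the same bank as randomly initialized filters, and the update step skips anything marked as not trainable. All names here (`make_mixed_bank`, `sgd_step`, the `trainable` flag) are hypothetical, and the gradients in the usage are placeholders rather than gradients of a real loss. In a deep-learning framework the same effect is usually obtained by freezing the weights of the hand-crafted filters.

```python
import random

def conv2d_valid(image, kernel):
    """Cross-correlate a 2-D image with a 2-D kernel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Fixed hand-crafted filters: Sobel vertical and horizontal edge detectors.
SOBEL_VERTICAL = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
SOBEL_HORIZONTAL = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

def make_mixed_bank(num_learnable, seed=0):
    """Build a bank with two frozen Sobel filters plus random learnable ones."""
    rng = random.Random(seed)
    bank = [{"kernel": [row[:] for row in SOBEL_VERTICAL], "trainable": False},
            {"kernel": [row[:] for row in SOBEL_HORIZONTAL], "trainable": False}]
    for _ in range(num_learnable):
        kernel = [[rng.gauss(0.0, 0.1) for _ in range(3)] for _ in range(3)]
        bank.append({"kernel": kernel, "trainable": True})
    return bank

def forward(image, bank):
    """Return one feature map per filter, hand-crafted and learnable alike."""
    return [conv2d_valid(image, f["kernel"]) for f in bank]

def sgd_step(bank, grads, lr=0.01):
    """Update only the trainable filters; hand-crafted ones stay untouched."""
    for f, g in zip(bank, grads):
        if not f["trainable"]:
            continue
        for i in range(3):
            for j in range(3):
                f["kernel"][i][j] -= lr * g[i][j]

# Toy 5x5 image with a vertical edge between columns 2 and 3.
image = [[10, 10, 10, 0, 0] for _ in range(5)]
bank = make_mixed_bank(num_learnable=2)
maps = forward(image, bank)  # 4 feature maps: 2 fixed + 2 learnable

# Placeholder gradients (all ones), only to show which filters move.
dummy_grads = [[[1.0] * 3 for _ in range(3)] for _ in bank]
sgd_step(bank, dummy_grads, lr=0.1)
```

After the step, the two Sobel kernels are unchanged while the two random kernels have moved, so the learnable filters are free to adapt around whatever the fixed ones already provide.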

Cheers,
Mark

Hi TMosh,

I’m not sure what you mean exactly by “lowest cost”, but I guess you are referring to the lowest error after training? Do you understand why the system would not perform at least as well if the number of parameters is the same in both systems (one with some hand-crafted filters and one without any)? I guess a system with some hand-crafted filters may have advantages in some scenarios. It would be interesting to hear about empirical results or theoretical justifications.

Regards,
Mark

It does seem that mixing the two approaches might be more effective. For example, the Scientific Reports paper “Comparison of handcrafted features and convolutional neural networks for liver MR image adequacy assessment” says: “Using HC features outperforms the CNN for smaller sample sizes and with increased interpretability. On the other hand, with enough training data, the combined classifier outperforms the models trained with HC features or CNN features alone.”

Here too, from “Feeding Hand-Crafted Features for Enhancing the Performance of Convolutional Neural Networks” (arXiv:1801.07848): “In this paper, we show that finding an appropriate feature for the given problem may be still important as they can enhance the performance of CNN-based algorithms.”

Hey @MarkNZed,
If we consider the hand-crafted filters, which are fixed in your approach, plus the tunable filters, then don’t you think one of the downsides of the fixed filters would be what you stated before, i.e., that they “suffer from relying on humans deciding on what the filters should be”?

How about if we train the network from scratch (consisting of all tunable filters), but also feed the hand-crafted features along with the input images to the tunable network? I don’t think this is much different from what you wanted to achieve; it is just a different perspective. What do you think about this?
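
A rough sketch of what feeding the hand-crafted features along with the image could look like: each hand-crafted filter is applied once as a preprocessing step, and its response map is stacked with the raw image as an extra input channel for a fully tunable network. The names `conv2d_same`, `HAND_CRAFTED` and `augment_with_hand_crafted` are hypothetical, and the Sobel filters are just one possible choice of hand-crafted feature.

```python
def conv2d_same(image, kernel):
    """Cross-correlate with zero padding so the output matches the input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - ph, c + j - pw
                    # Pixels outside the image count as zero.
                    if 0 <= rr < h and 0 <= cc < w:
                        total += image[rr][cc] * kernel[i][j]
            row.append(total)
        out.append(row)
    return out

# Assumed set of hand-crafted filters (Sobel edge detectors).
HAND_CRAFTED = {
    "vertical_edge": [[1, 0, -1], [2, 0, -2], [1, 0, -1]],
    "horizontal_edge": [[1, 2, 1], [0, 0, 0], [-1, -2, -1]],
}

def augment_with_hand_crafted(image):
    """Return channels [raw image, HC response 1, HC response 2, ...]."""
    channels = [[[float(v) for v in row] for row in image]]
    for kernel in HAND_CRAFTED.values():
        channels.append(conv2d_same(image, kernel))
    return channels

# Toy 4x4 grayscale image with a vertical edge between columns 1 and 2.
image = [[10, 10, 0, 0] for _ in range(4)]
stacked = augment_with_hand_crafted(image)  # 3 channels, each 4x4
```

The tunable network then sees a 3-channel input instead of a 1-channel one, and every one of its own filters remains learnable.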

Cheers,
Elemento

Hi Elemento,

The system seems less likely to “suffer from relying on humans deciding on what the filters should be” because the system will learn filters that reduce the impact of poor human decisions in the HC filters.

It would be interesting to try both approaches and compare. I think there would be a difference because the HC filters sit in different places: in layer 0 (computed on the inputs) in your approach, or in layer 1 (alongside the learnt filters) in mine. Putting the HC filters in layer 0 is similar to having no HC filters at the input, only HC filters in layer 1, and skip connections from layer 0 (the inputs) to layer 2.

Cheers,
Mark

Hey @MarkNZed,
Sure, and do share your results with the community.

Cheers,
Elemento