My thoughts on CNNs

Matthias_Kleine · February 12, 2023, 10:11am

Dear co-learners and mentors,

after finishing 3/4 of the CNN course of this specialization, I want to share some thoughts on the topic.

First of all, I have to admit that at some point during this course I really lost focus and some motivation, which had not happened in any of the other MLS or DLS courses. So I asked myself why.

I came to the conclusion that somehow all the CNN stuff seems to be less “systematic” and “explainable” than all the topics before. The described architectures sometimes seem to come “out of nowhere”: they were simply proposed in some paper and have proven to work well. But why do they work well? In most cases, there seems to be only some “plausible” explanation for why these architectures work. That is very different from a mathematical explanation, let alone a proof. Sometimes, even Andrew seems to say: “Don’t think too much about why this works, it just has proven to work fine and nobody really knows why.”

Having worked as a software engineer, I am used to applying systematic approaches to building software systems. This is an engineering approach: instead of doing some “voodoo” and “trial and error”, you want a systematic approach from which you know you will arrive at a result, and why. With CNNs, we seem to be back in the “voodoo” stage: put some frog legs and some plants in the water, dance around, and after some hours the medicine will be ready.

Having built up my knowledge slowly from probability theory to statistics to classical Machine Learning and then Deep Learning, I now have, for the first time, the impression that the whole method has lost its “connection” with mathematics. At least it seems that nobody fully understands why all these architectures really work. This is somewhat disappointing, or at least it currently feels that way.

I just wanted to share these thoughts. How did you experience the CNN course? What do you think?

Best regards
Matthias

Christian_Simonis · February 12, 2023, 10:40am

Hi Matthias,

thanks for your open feedback!

I can certainly understand your thoughts. CNN architectures and layers are quite advanced, and it can feel as if a certain magic is happening here.

Still, let me try to demystify it at least to a small extent: convolution is a really well established and recognised signal processing method in control and system theory, see also this thread: How to Calculate the Convolution? - #2 by Christian_Simonis. So when you come from a Mechatronics Engineering world (like I do) and are used to applying autocorrelation, doing some audio or image processing, applying Laplace or Fourier transforms, time series analysis and filtering, it seems a rather logical step to incorporate this filter option into layer architectures and allow the net to learn the filter parameters on its own using tons of data.
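To make the filtering idea a bit more tangible, here is a minimal sketch in plain Python. It assumes a synthetic noisy sine wave and a hand-chosen 5-point moving-average kernel; the names `convolve_valid`, `mse`, `clean`, `noisy` and `kernel` are made up for this illustration. In a CNN the kernel values would be learned from data rather than fixed by hand.

```python
import math
import random


def convolve_valid(signal, kernel):
    """1D convolution, keeping only positions where the kernel fully overlaps."""
    k = len(kernel)
    flipped = kernel[::-1]  # true convolution flips the kernel
    return [
        sum(signal[i + j] * flipped[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)  # fixed seed so the run is repeatable

# Assumed toy data: a sine wave with period 50 plus Gaussian noise.
clean = [math.sin(2 * math.pi * t / 50) for t in range(200)]
noisy = [x + random.gauss(0, 0.3) for x in clean]

# Hand-chosen smoothing filter (moving average over 5 samples).
kernel = [1 / 5] * 5
smoothed = convolve_valid(noisy, kernel)

# smoothed[i] averages noisy[i..i+4], so it is centred on sample i + 2.
print("error before filtering:", mse(noisy, clean))
print("error after filtering: ", mse(smoothed, clean[2:-2]))
```

The second printed error should come out clearly smaller than the first, because averaging neighbouring samples suppresses the noise much more than it distorts the slow sine wave.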

You are also right that the interpretability of complex architectures can be rather limited. Still, there are possibilities, like heat map analysis, see also this thread: What makes the different neurons in a layer calculate different parameters? - #7 by Christian_Simonis

[ One side note: convolution also plays an important role in probability theory and provides a nice illustration of why so many distributions in reality are approximately normal.

E.g. if you convolve two uniform distributions, you get a triangular distribution (like the distribution of the sum of points on two dice).
If you keep convolving in more uniform distributions (corresponding to the sum of points on three, four, … dice), you gradually approach the shape of a normal distribution.
So convolution is really well established in many core mathematical concepts.

]
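The dice illustration can be checked directly. The following is a small sketch in plain Python using exact fractions; the helper names `convolve` and `sum_of_dice` are invented for this example and fair six-sided dice are assumed.

```python
from fractions import Fraction


def convolve(p, q):
    """Discrete convolution of two probability mass functions given as lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


# Uniform distribution of one fair die: index k stands for face k + 1.
die = [Fraction(1, 6)] * 6


def sum_of_dice(n):
    """PMF of the sum of n fair dice; index k stands for the sum k + n."""
    pmf = die
    for _ in range(n - 1):
        pmf = convolve(pmf, die)
    return pmf


# Two dice: probabilities rise linearly from 1/36 to 6/36 and fall again
# (the triangular shape).
for total, p in enumerate(sum_of_dice(2), start=2):
    print(f"{total:2d}  {str(p):>5}  {'#' * int(p * 36)}")

# More dice: the bars start to look bell-shaped.
for total, p in enumerate(sum_of_dice(4), start=4):
    print(f"{total:2d}  {'#' * int(p * 200)}")
```

Increasing the argument of `sum_of_dice` makes the bell shape more pronounced, which is the central limit theorem at work.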

I would be interested to know: would some real-world applications of 1D signal processing with convolution have helped here? What do you think, @Matthias_Kleine?

Best regards
Christian

Matthias_Kleine · February 12, 2023, 11:02am

Hi Christian,

thanks for your detailed reply and the provided additional links.

As you have some background in signal processing, I would like to take the opportunity to ask whether you could recommend an introductory book or course on this topic in particular. I found a course by Mike X Cohen (whom I know from another course) here: https://www.udemy.com/course/signal-processing/, which seems to be a decent introduction and also covers convolutions.

I have a special interest in time series, especially in analysing two or more signals and the relations between them.

Coming back to the original topic, you are asking whether real-world applications of 1D signal processing with convolution would have helped, and my clear answer is “yes”. I think anything that “anchors” the convolution operation and makes it more vivid or graspable would clearly help. As I am interested in time series, 1D signal processing would of course be of interest to me.

However, on the “higher levels” there will probably still be some arbitrary choices that cannot really be understood just by knowing the convolution operation better, for example the number of layers, their exact dimensions, and so on.

Best regards,
Matthias

Matthias_Kleine · February 12, 2023, 11:19am

Just saw that the playlist containing the video in your first referenced link also seems to cover signal processing quite well:

Christian_Simonis · February 12, 2023, 11:36am

Personally, I had to learn these concepts at university. Since you are also German speaking, I will just link to the course content here:

A good book, though more theoretical, was Signal- und Systemtheorie by Frey / Bossert (also in German), which I got in 2013 and still use occasionally today. It also includes several examples - no promotion:

When it comes to time series analysis, it depends a little on what you want to do. If you want to go for predictive analytics (basically what I was doing for several years), I learned a great deal from this book (also in German) - no promotion:

Here you can also find an application of time series analysis and ML.

But I would like to point out that applying these concepts in projects and in practice is what matters most, and it helps concepts like conv filtering to become “ingrained”.
Personally, I do not know the Udemy course you posted, but it looks quite interesting from the preview material, so I guess it can be a good start. I believe it can be especially helpful if you can transfer the concepts and apply them hands-on.

Best regards
Christian

Christian_Simonis · February 12, 2023, 11:50am

Yea, absolutely! There are great tutorials on YouTube. For example this one might also be interesting for you: But what is a convolution? - YouTube

Best regards
Christian

paulinpaloalto · February 12, 2023, 5:37pm

In addition to the great and fact-filled discussion you’ve already had here, I had a couple of thoughts to throw into the mix:

When you say that you feel that the networks have gotten away from the mathematics, I would just suggest a different way to view that: there has to be mathematics there, but it’s just too complex for us to understand or be able to precisely explain at this point. As a mathematician, that feeling shouldn’t be unfamiliar. It happens all the time and we (or people smarter than we are) have to continue working hard to get at those explanations. It’s not directly applicable to your doubts about convnets, but as an example of math that has been understood in the last 10 years and wasn’t before, this paper from Yann LeCun’s group about solution surfaces, proving that there are reasonable solutions to the “local minimum” problem, seems like a good one. Mind you, I’m not claiming I understand the math in the paper.

Or think of it as analogous to the difference between theoretical physics and experimental physics. It has happened many times in the history of science that people have constructed an experiment that can’t be explained with the current state of the theory. Then the theorists have to sweat for a while until they can eventually enhance the theory to explain the phenomena. Dirac Scattering and the precession of the perihelion of Mercury come to mind as examples.

And think of biology and medicine. There are comparable examples there: you can get a drug approved by the FDA (in the US anyway) by running rigorous experiments to prove that it is safe and effective. But you’re not actually required to explain the “mechanism of action” of the drug. Needless to say, people feel more comfortable if you can, but it’s not required for approval.

One other more concretely applicable thought would be that maybe you just stopped one week too early. There’s a really interesting lecture in Week 4 titled “What are Deep ConvNets Learning?” that is really worth a look before you give up here.

Thanks for the great discussion!

TMosh · February 12, 2023, 7:05pm

Time series are also in the domain of Recurrent Neural Networks (Course 5).

Matthias_Kleine · February 12, 2023, 8:48pm

Good points, so there is hope for understanding and a more “engineering” approach.

Only the FDA example does not convince me at all …

Best regards
Matthias

TMosh · February 12, 2023, 10:45pm

An awful lot of ML system design is trying wacky stuff based on what other people have published, then a new paper is published if they find something that works better.

It’s not so much a top-down engineering- or mathematics-based process.

JohnPaul_Adimonyemma · February 20, 2023, 4:09pm

Hi @Christian_Simonis, please recommend the best English version of these textbooks.

Thanks.

Christian_Simonis · February 20, 2023, 6:24pm

Hi @JohnPaul_Adimonyemma

Welcome to the community and thanks for your question!

I’m afraid I’ve only studied this literature in German and don’t have a really good English substitute. Please note that this is more of a niche for systems and signal theory and processing in mechatronics and may not be the first choice for ML, depending on what you want to achieve.

Anyway:
@paulinpaloalto has put together an excellent list which you can find here:

From my personal experience, I can also recommend the mentioned books by Chollet and Goodfellow. I read them in 2018 during a two-week holiday and learned a lot!

Best regards
Christian

JohnPaul_Adimonyemma · February 20, 2023, 6:49pm

Thanks for your response. However, I am looking for textbooks on time series. Could you help with some recommendations?

Christian_Simonis · February 20, 2023, 7:15pm

Hi there,

in these threads you can find some inspiration for time series literature and books:

Books, articles, etc. for Non-Stationary Time Series - #2 by Christian_Simonis
How to understand the maths in generating the timeseries with different patterns - #3 by Christian_Simonis

It could also make sense to consider this course:

https://www.coursera.org/learn/tensorflow-sequences-time-series-and-prediction

Best regards
Christian
