Need some practical advice on choosing from different CNN model architectures

Hi everyone. Today I finished the CNN course, which I’ve been studying for about 2 months. It was a great course, and thanks to everyone for responding so promptly to my questions and clearing up my doubts. I appreciate that a lot.

I would just like to discuss a few things. Over this period I learnt the fundamentals and mechanisms of how CNNs work. I also took lectures on research papers covering a few classical CNN models and some more advanced ones such as ResNets, the Inception network, MobileNet, and EfficientNet. After that I studied detection algorithms, with a primary focus on YOLO. I also briefly studied region proposals, semantic segmentation, R-CNN, Fast R-CNN, Faster R-CNN, and U-Net. Finally, I learnt about face recognition and verification models such as Siamese networks, and covered a little neural style transfer.

I am now looking forward to building some projects, most probably on object detection and image classification. After working through all the material in this course, I am fairly confident that I can build a real-world application, though I still have a few questions and need to talk to someone who can channel my thoughts in the right direction.

If you could give me a rough overview of how you approach a computer vision problem, that would be great. In particular, when you are given a computer vision problem, how do you decide which architecture to use? There are many architectures and research papers, and each architecture is designed to solve a particular kind of problem in its own way, so how do you know which one to pick? How do you narrow hundreds of options down to a handful that you can then start experimenting with? I just need some practical advice on approaching an object detection or image classification problem.

Also, I probably have some knowledge gaps. I feel like I have them, but at this point I don’t know what I don’t know. So I just need someone who can point me in the right direction.

That’s all, and once again thanks to everyone for always responding to my queries.

My approach would depend on the project (the type of task and images). If there is an existing model trained on similar images, the first option is to use it, either through transfer learning or by replicating it and training it from scratch.

The second option would be to create a model similar to the one above, but you would still have to train and test it yourself.

The third option is to study many papers in depth and come up with your own model, which you would have to train and test extensively. This requires far more resources and time.

It’s great that you have completed this course. It gives you introductory knowledge of many different methods.

What the course doesn’t do is give you the experience to make design decisions.

So what you need now is experience.

I recommend you grab some datasets from one of the many online repositories, or from an ML site like Kaggle, and dig in. Make choices: start with simple ones, try them out, see if they work, ask questions (Kaggle has a very supportive community), and build up your bank of experience.

Try the “UCI Machine Learning Repository” for sets of data you can download.

@gent.spah @TMosh thanks for the input. I guess there is no definitive answer to this question: many factors play a role, and experience building models is necessary. I have started with Kaggle, so let’s see how it goes.