I’m quite new to deep learning, and I just started the Deep Learning Specialization course.
I have a question. In the video “Supervised Learning with Neural Networks” we see that different architectures are used to solve different kinds of problems.
How can I know which architecture type is best suited to the problem at hand? And if I create a custom architecture, how can I know beforehand whether it is a good candidate for solving my problem?
In the course, you will learn about various neural network architectures that are highly effective at solving different types of problems. The choice of architecture depends heavily on the nature of the input data (minimal code sketches of each pairing follow the list below):
Structured Data (e.g., databases with well-defined features such as house prices, user information):
Use standard feedforward neural networks (FNNs), which are versatile and can be applied to problems as diverse as predicting real estate prices or online advertising, where you have a clear set of input features and a target output.
Unstructured Data (e.g., images, audio, text):
Use Convolutional Neural Networks (CNN) for image data.
Use Recurrent Neural Networks (RNN) or variants such as LSTMs for sequence data such as time series, audio, or text (e.g., for machine translation or speech recognition).
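To make these pairings concrete, here is a minimal sketch in TensorFlow/Keras (the framework used later in the specialization). The input shapes, layer sizes, vocabulary size, and class counts are illustrative assumptions, not recommendations for any particular dataset.

```python
# Minimal sketches of the architecture/data pairings above (assumed shapes).
import tensorflow as tf
from tensorflow.keras import layers, models

# Structured/tabular data (e.g., house features -> price): a plain feedforward net.
ffn = models.Sequential([
    layers.Input(shape=(20,)),            # assume 20 numeric input features
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(1),                      # regression output (e.g., price)
])

# Image data: a small CNN.
cnn = models.Sequential([
    layers.Input(shape=(64, 64, 3)),      # assume 64x64 RGB images
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # assume 10 classes
])

# Sequence data (e.g., text as integer token IDs): an RNN/LSTM.
rnn = models.Sequential([
    layers.Input(shape=(None,), dtype="int32"),   # variable-length sequences
    layers.Embedding(input_dim=10000, output_dim=64),  # assume 10k-word vocab
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),    # e.g., a binary label
])
```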
For some applications, such as autonomous driving, you may need a custom or hybrid architecture. For example, an autonomous driving system may combine CNNs for image recognition (from cameras) with other network types for processing radar data. These custom architectures integrate multiple modalities (e.g., vision and radar) into a larger system.
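As a rough illustration of such a hybrid design, here is a hedged sketch of a two-input Keras model: a CNN branch for camera images and a small feedforward branch for radar-derived features, merged into a single prediction. All shapes and names (camera_image, radar_features) are made-up placeholders, not part of any real system.

```python
# Sketch of a hybrid, multi-modal model: CNN for images + dense branch for radar features.
import tensorflow as tf
from tensorflow.keras import layers, Model

# Camera branch: CNN over images (assumed 128x128 RGB).
image_in = layers.Input(shape=(128, 128, 3), name="camera_image")
x = layers.Conv2D(32, 3, activation="relu")(image_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Radar branch: assume radar readings are preprocessed into a fixed-length feature vector.
radar_in = layers.Input(shape=(32,), name="radar_features")
y = layers.Dense(32, activation="relu")(radar_in)

# Fuse the two modalities and produce one output (e.g., a control signal or score).
merged = layers.concatenate([x, y])
z = layers.Dense(64, activation="relu")(merged)
output = layers.Dense(1, name="prediction")(z)

model = Model(inputs=[image_in, radar_in], outputs=output)
model.compile(optimizer="adam", loss="mse")
model.summary()
```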
It’s hard to know in advance whether a custom architecture will work well without experimentation. By starting with standard architectures and gradually experimenting with customizations based on the specifics of the problem, you can develop a network well suited to the task. Here are some general principles you can follow:
Avoid mismatch problems: Ensure that the type of architecture matches the data (e.g., CNNs for images or RNNs for sequences).
Avoid reinventing the wheel: Start with known architectures that have performed well on similar tasks (like ResNet for image classification). Fine-tune, benchmark, and use pre-trained models.
Validation and tuning: Use validation data and hyperparameter tuning (e.g., learning rate, layer depth, number of units) to refine your architecture. Performance on validation data gives you feedback on whether your model is overfitting, underfitting, or needs architectural adjustments (see the sketch after this list).
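As one concrete (and deliberately simplified) example of the last two principles, the sketch below starts from a pre-trained ResNet50, adds a new classification head, and relies on a validation split to judge overfitting vs. underfitting. The dataset variables (train_images, train_labels) and the number of classes are placeholders you would replace with your own data.

```python
# Transfer learning + validation sketch: fine-tune a pre-trained ResNet50 head.
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import ResNet50

num_classes = 5  # assumption: 5 target classes

base = ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze pre-trained weights first; unfreeze later to fine-tune

inputs = layers.Input(shape=(224, 224, 3))
x = base(inputs, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.3)(x)
outputs = layers.Dense(num_classes, activation="softmax")(x)
model = Model(inputs, outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# validation_split holds out 20% of the training data; comparing training vs.
# validation accuracy tells you whether the model is overfitting or underfitting.
# history = model.fit(train_images, train_labels,
#                     epochs=10, validation_split=0.2)
```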
The other general thing worth saying here is that taking all the DLS courses is a great way to learn both what types of network architectures are possible and what types of problems each architecture is appropriate for solving. So in addition to the concrete advice already offered above on this thread, please “hold that thought” as you continue through the various courses here. You will learn a lot and see many examples. Just as one concrete point, note that one of the big topics of DLS Course 2 is evaluating the performance of your solutions and figuring out how to improve the results.