I am currently taking the Deep Learning Specialization, and have so far finished the first two courses.
How does one internalize the ideas and become able to implement them freely? For example, in Course 1 we built a neural network from scratch. I understand how to build a neural network and the theory behind it, but if I’m asked out of the blue to implement one from scratch, it’s going to be very laborious to plan and code the whole model, even though I can write down the logic behind the model on paper. It seems to me like this is why frameworks like TensorFlow exist, but am I losing out on something?
How should I make the best use of my time learning from the specialization? Is it by reading the relevant papers, implementing each assignment from scratch, or maybe something else?
Just for context, I have a master’s degree in mathematics and am hopefully going to be moving on to doctoral studies.
If you are going to apply the techniques you learn in DLS, the usual way is to use a framework like TensorFlow or PyTorch. Since you’ve finished Course 2, you have seen Prof Ng’s introduction to TensorFlow and how to apply it in at least one case. In Courses 4 and 5, you’ll continue to learn more about how to apply TensorFlow. As you have observed, Prof Ng believes that it’s important for us to understand how the algorithms actually work before we switch to using TF. He will follow the same pattern in Course 4: he’ll show us how to build a Convolutional Neural Network ourselves in Python and NumPy, and then we’ll switch to using TensorFlow for more complex solutions. There are a couple of reasons for this pedagogical method:
You really need to understand the different types of networks, how they work and what types of problems different networks are useful for solving. If you only learn TF, then everything just looks like a recipe based on some “black box” and you miss out on the intuitive understanding of why a given network is the right choice for a particular type of problem.
The second level of issues is that solving real world problems is not always a simple “cookie cutter” process. Frequently you find that the first thing you try doesn’t work very well or perhaps it sort of works, but needs some serious adjustment or tuning. If you only know things at the level of TF APIs, then you may not have the intuition for which direction to go when the first thing you try doesn’t work so well. Having some understanding of what’s actually happening with the algorithms gives you a better chance of having useful intuitions about how to get to a real working solution.
The state of the art these days is pretty advanced, so creating all the algorithms yourself from scratch is just too much work. It’s a better approach to let the TF and PyTorch (and other platform) developers implement all the state-of-the-art algorithms in a well-tested and performance-tuned framework. But knowing how to apply those state-of-the-art algorithms also requires the knowledge that Prof Ng is giving us in these courses.
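To make the contrast concrete, here is a minimal sketch (not from the course notebooks — the layer sizes and initialization are arbitrary, just for illustration) of the kind of forward pass you build by hand in Course 1. In TF, these same two layers collapse into a couple of `tf.keras` lines, which is exactly the trade-off being described: convenience versus seeing every step.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, params):
    # Layer 1: linear step, then ReLU activation
    Z1 = params["W1"] @ X + params["b1"]
    A1 = np.maximum(0.0, Z1)
    # Layer 2: linear step, then sigmoid for a binary output
    Z2 = params["W2"] @ A1 + params["b2"]
    return sigmoid(Z2)

rng = np.random.default_rng(0)
params = {
    "W1": rng.standard_normal((4, 3)) * 0.01, "b1": np.zeros((4, 1)),
    "W2": rng.standard_normal((1, 4)) * 0.01, "b2": np.zeros((1, 1)),
}
X = rng.standard_normal((3, 5))    # 3 features, 5 examples
print(forward(X, params).shape)    # (1, 5): one prediction per example
```

Writing this once by hand is what makes the framework version legible later: you know exactly what each `Dense` layer is doing.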
@paulinpaloalto Thanks for the elaborate reply!
Given enough time, I can implement the algorithms from scratch. Would you recommend that I rebuild a neural network from scratch without looking at the given notebook?
Also, what about the second question?
As usual, @paulinpaloalto provides some good insight. Here’s my take.
In my mind every neural network has 3 common elements: input layer, hidden layers, output layer. Input and output have dependencies on the external context that may involve or require pre- and/or post-processing (e.g. image normalization, non-maximum suppression). The hidden layers must implement the proper downsampling to get from input shape to output shape, as well as the proper transformation(s) to achieve useful outcomes. Every NN does this, so it’s really important to understand that idea, and writing a simple NN completely yourself is a great way to do it.

It’s important to understand what the activation layer contributes, so pick one. But after you write a sigmoid function, there is rapidly diminishing benefit from writing your own ReLU, tanh, softmax, etc. I can’t recommend spending much time at that level. If you understand what a convolution is mathematically, maybe from exposure to Fourier transforms, do you understand it better after writing your own? Maybe not. And once you have written your own 3-layer network and understand how data flows to and is transformed by each layer, do you need to write your own 10-layer NN? 20? Probably not.
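To illustrate the diminishing-returns point: each of the standard activations is essentially a one-liner in NumPy (this sketch is mine, not from the course notebooks), so once you have written one of them, the rest add little insight.

```python
import numpy as np

# Common activations, each a one-liner in NumPy; once you've written
# one, the marginal insight from writing the others is small.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=0, keepdims=True)

z = np.array([[1.0], [2.0], [-1.0]])
print(round(softmax(z).sum(), 6))  # columns of a softmax sum to 1.0
```

The one subtlety worth internalizing is in `softmax`: subtracting the column max before exponentiating prevents overflow without changing the result.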
What I have personally found extremely helpful is taking one of the industry algorithms and doing a deep dive on it. For me, it was YOLO. I spent months reading the papers, reading open source implementations from darknet and others, and finally trying to reproduce it myself on a public data set. I reused the architecture and TensorFlow/Keras implementation of all the layers (convolution, pooling, activation etc) but wrote my own loss function and training loop. Really beneficial. I put some of my digital exhaust in these forums, which you can find through my @ai_curious profile.
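For the flavor of what “wrote my own loss function and training loop” means in TensorFlow/Keras, here is a hedged sketch. The loss below is a deliberately simplified stand-in (the real YOLO loss combines localization, objectness, and classification terms), and the tiny model and random data are just placeholders so the loop runs.

```python
import tensorflow as tf

# Stand-in loss: plain MSE keeps the sketch short. The real YOLO loss
# is a weighted sum of localization, objectness, and class terms.
def my_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))

# Reuse stock Keras layers, as described above; write only the loss and loop.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)),
                             tf.keras.layers.Dense(1)])
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

x = tf.random.normal((32, 4))   # toy batch: 32 examples, 4 features
y = tf.random.normal((32, 1))

losses = []
for step in range(5):
    with tf.GradientTape() as tape:               # record ops for autodiff
        loss = my_loss(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
    losses.append(float(loss))
```

The `GradientTape` pattern is what gives you room to depart from `model.fit()` — custom losses, custom metrics, custom update schedules — while still reusing the framework’s layers and optimizers.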
Speaking of papers, I do recommend reading them. Maybe with your background they will be directly accessible. For me, they are often a struggle, since these days I tend to glaze over when I see too many Greek letters. I find these papers are generally written by people very deep in the field for an audience of their peers, and if you don’t already know what they are talking about, sometimes it isn’t easy to figure it out. Nonetheless, I think it is good to do, and I tend to circle back and reread the original papers periodically after my own knowledge and understanding advance. The papers provide good historical context and often refer to one another, as each group builds on and tries to overcome the issues and liabilities of the solutions published before.
Ultimately the path to take depends on the destination. Do you want to invent new NN architectures? Improve runtime performance of existing architectures? Apply existing architectures to new business problems? In my opinion, the closer you are to the domain, the less you need to focus on the gears and pulleys: leverage the frameworks and pretrained models. The more interest you have in how NNs produce their output, the more attention you need to devote to what’s behind the curtain. HTH
Great question Ananthakrishna.
I think what concerned me about the first two courses was how easy the concepts in deep learning are; I spent time constantly looking for the hidden difficulties. (I think I annoyed Paul over this in a few posts.)
Once I understood this, worked out exactly how the matrix calculations were done in NumPy, and applied that knowledge to the ‘process’ defined in some of the introductory graphics that Andrew presented, I became much less agitated by the way the course seemed to gloss over some critical concepts.
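For anyone following along, the NumPy detail that repays working out by hand is how broadcasting behaves in the per-layer computation Z = WX + b. A tiny sketch of my own (the numbers are arbitrary, chosen so the arithmetic is easy to check):

```python
import numpy as np

# The core per-layer computation from the course: Z = W @ X + b
W = np.arange(6).reshape(2, 3)   # (2, 3): 2 units, 3 inputs -> [[0,1,2],[3,4,5]]
X = np.ones((3, 4))              # (3, 4): 3 features, 4 examples
b = np.array([[10.0], [20.0]])   # (2, 1): one bias per unit

Z = W @ X + b                    # broadcasting stretches b across the 4 columns
print(Z.shape)                   # (2, 4)
print(Z[:, 0])                   # [13. 32.] -- row sums 3 and 12, plus the biases
```

Tracing one column like this, by hand, is what made the matrix diagrams in the lectures click for me.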
The more I research the topic, the more I believe that there is no ‘core theory’ of deep learning. There is a combination of statistical and chaos techniques which somehow creates a filter that proves extremely useful in identifying patterns. The AI community has latched onto these without needing a core understanding.
In terms of your modelling, your dilemma is much like the choice between using a high-level language and assembly programming. You can create a model from scratch if you want to deep-research the topic, but if you want to create practical (money-making?) models, then build on the work of the thousands who came before you and use the frameworks.
Although it has a steep learning curve, the Visual Studio framework from Microsoft has an excellent set of NN tools that I am really enjoying playing with, and it is absolutely free.