Algorithms vs packages

Personally, I prefer writing my ML algorithms from scratch instead of using packages like scipy. Although it seems time consuming to, say, write out all the equations for a linear regression problem rather than calling the linear regression function from scipy, I like the idea of being able to control my code myself and understand how things work in the background.

Which do you prefer and why? Or is there a better time or situation to choose any of them?

This is a great question! There is room for many thoughts and opinions here, but I’ll throw out a few to get things started:

At a high level, the real question is what your primary goals are. If you want to become an AI/ML researcher and contribute to the field by developing new techniques and algorithms, then writing your own code from scratch is clearly the way to go. On the other hand, if your primary goal is to take the techniques of AI/DL/ML and apply them to solve “real world” problems, then using packages is probably the better way to go. The problem is that there are a lot of algorithms to develop, and the waters get deeper as you include things like dynamic learning rate management, Batch Normalization, Pooling Layers, “skip” layers (propagating the gradients gets complicated), and so forth.
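To make the “gradients get complicated” point concrete: for a skip (residual) connection y = F(x) + x, the backward pass has to add the gradients arriving from two paths. A minimal numpy sketch, with shapes invented purely for illustration:

```python
import numpy as np

# Toy residual ("skip") block: y = relu(W @ x) + x.
# Shapes and values here are made up purely for illustration.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
x = rng.normal(size=4)

h = W @ x
y = np.maximum(h, 0.0) + x       # forward pass: branch output plus identity path

# Backward pass for an upstream gradient dy: the skip path passes dy through
# unchanged, so dx is the SUM of two paths -- the extra bookkeeping that makes
# hand-written backprop messy once skip layers appear.
dy = np.ones(4)                  # corresponds to the loss L = sum(y)
d_branch = W.T @ (dy * (h > 0))  # gradient through the relu(W @ x) branch
dx = d_branch + dy               # plus the gradient through the identity path
```

In a plain feed-forward net each layer has a single incoming gradient; every skip connection adds another summation like the last line above, which is exactly the bookkeeping the frameworks handle for you.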

I think most people who are using ML techniques to solve problems are using packages. The real question is which package to choose. I don’t think scikit/sklearn is one of the major choices. It sounds like TensorFlow and PyTorch are the most widely used at this point. Prof Ng uses TF in these courses. You can learn about PyTorch in the GANs specialization.

You’ll notice that Prof Ng’s approach in all these courses is first to show you how to code at least the basics of the algorithms yourself “by hand,” directly in python and numpy. Then once he’s shown you that, he converts to using TF and Keras (a subset of TF). The reason for showing you how the algorithms really work is that it gives you important intuition and understanding about how to make hyperparameter choices and how to decide what to do when you try something and it doesn’t work very well. If you only learn about things at the level of TF, then it’s all a “black box” and you don’t really have that level of intuition about what to do when you run into underfitting or overfitting or the like.
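For a sense of what the “TF level” looks like once you get there, here is a small fully connected net in Keras. The layer sizes and input shape are placeholders for illustration, not the exact ones from any assignment:

```python
import tensorflow as tf

# A small feed-forward net at the Keras level of abstraction. Each line here
# replaces the by-hand numpy forward/backward code from the earlier weeks.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(12288,)),            # e.g. a flattened 64x64x3 image
    tf.keras.layers.Dense(25, activation="relu"),
    tf.keras.layers.Dense(12, activation="relu"),
    tf.keras.layers.Dense(6, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, Y_train, epochs=100)  # training becomes a single call
```

The convenience is obvious, but every choice in that snippet (layer widths, activations, optimizer, loss) is a hyperparameter decision that the by-hand phase teaches you how to reason about.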

The switch to TF first appears in Week 3 of Course 2 for the case of Feed Forward Fully Connected Deep nets. Then in ConvNets (Course 4) it happens in the second assignment. In Sequence Models, it happens in Week 2 I think.

Even if your goal is to become an AI researcher and invent new algorithms, you’ll probably end up adding more code to the packages, so you probably want to learn how to use those as you go along. Having more ways to approach a problem gives you more flexibility.


Wow! Thanks again sir @paulinpaloalto … I am beginning to see a clearer picture!

The instruction paradigm follows a general pattern:

  1. Do it with procedural programming, where you implement nested for loops and write almost everything yourself.
  2. Reimplement it, relying on Python broadcasting to handle the matrix operations.
  3. Replace the handwritten code with third-party libraries and packages (e.g. TensorFlow and Keras).
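Steps 1 and 2 can be seen side by side on a single dense-layer forward pass; the shapes below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 5))   # 3 units, 5 input features
b = rng.normal(size=(3, 1))   # one bias per unit, kept as a column
X = rng.normal(size=(5, 8))   # batch of 8 examples, one example per column

# Step 1: procedural -- explicit nested loops over units, examples, features.
Z_loops = np.zeros((3, 8))
for i in range(3):
    for j in range(8):
        for k in range(5):
            Z_loops[i, j] += W[i, k] * X[k, j]
        Z_loops[i, j] += b[i, 0]

# Step 2: vectorized -- one matrix multiply, with the bias column broadcast
# across all 8 examples automatically.
Z_vec = W @ X + b

assert np.allclose(Z_loops, Z_vec)
```

Both compute the same Z = WX + b; the vectorized form is shorter, faster, and a stepping stone to step 3, where even that line disappears behind a library call.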

Once you understand how an ML operation works (Convolution, let’s say), IMO there is no reason to write it yourself or continue to use your own code. It is highly unlikely that code written by a single individual will perform as well, have as much flexibility, or be as robust as the library version. If you really want to see how things work, or even tinker with the internals yourself, TensorFlow is open source and you can find almost all of the code on GitHub.
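Convolution is a good example of code you write once to learn and then retire. A naive loop-based sketch (single channel, stride 1, no padding; function name and shapes are just for illustration):

```python
import numpy as np

def conv2d_naive(image, kernel):
    """Valid cross-correlation (what DL libraries call "convolution"),
    single channel, stride 1, no padding -- for understanding, not speed."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output pixel is the windowed elementwise product, summed.
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
kernel = np.ones((2, 2))             # a 2x2 box filter
print(conv2d_naive(image, kernel))   # 3x3 output of 2x2 window sums
```

Having written this once, a library convolution (adding channels, strides, padding, and a fast implementation) stops being a black box, and there is no reason to keep maintaining your own version.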

There is a reason people don’t generally write applications in machine language/assembly anymore. Libraries and packages for ML are the same: they allow you to work more productively and focus on higher value tasks.

Thanks @ai_curious for pointing these things out! I am really understanding it better now.

Great discussion, which reflects most of my thinking on the topic. A few points to further consider:

  1. Writing your own algorithms is a great way to learn the basics behind the key operations in DL models. You have to go through these steps to better understand the mechanisms applied in deep learning. When I took Prof. Ng’s first machine learning course on Coursera in 2014, we had to write the ML algorithms ourselves in all the assignments, and we did that in Octave (a spin-off of Matlab). The Python ecosystem wasn’t what it is today, not to mention TensorFlow or PyTorch.
    The advantage of this “constraint” is clear to me today. Although I don’t write convolution layers anymore, and use TF or PyTorch instead, nothing in the process is really a “black box” for me, and this is a huge plus.

  2. As mentioned above, moving forward beyond basic learning, you will almost always prefer frameworks for solving “real world” problems. You will never be able to compete with the efficiency of the algorithms in those frameworks, which are constantly updated. Having said that, if you think you have a better approach for some basic algorithm, or something new, why keep it to yourself? Both TensorFlow and PyTorch are open source. Go to GitHub and contribute your invention.

  3. Speaking of GitHub, even if you don’t have something dramatically better to add, this environment can be a great place to practice. Get involved and respond to open issues. This process will train you in the basic algorithms that are used in production-oriented DL frameworks!

  4. Maybe not as important as the previous points, but I hear that in coding interviews candidates are sometimes asked to code basic DL algorithms (e.g. a convolutional layer). So, knowing only TF will not get you far at those stages of your job search.
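As an example of the kind of hand-coding such an exercise might ask for, here is a minimal max-pooling forward pass (single channel, 2x2 window, stride 2); the function name and defaults are illustrative assumptions:

```python
import numpy as np

def maxpool2d(x, size=2, stride=2):
    """Max pooling on a single-channel 2-D array -- the sort of
    'code it by hand' exercise an interviewer might ask for."""
    H, W = x.shape
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Take the max over each non-overlapping size x size window.
            window = x[i * stride:i * stride + size,
                       j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

x = np.arange(16.0).reshape(4, 4)
print(maxpool2d(x))  # [[ 5.  7.]  [13. 15.]]
```

Being able to produce something like this from a blank editor, without reaching for `tf.keras.layers.MaxPooling2D`, is exactly what that interview stage probes.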


Wow! @yanivh, thanks a lot for this insight! On behalf of myself and every ML newbie who will eventually read this, I deeply appreciate this information.