Scikit or NumPy for Machine Learning

I need advice on which Python module to use for machine learning: NumPy or scikit-learn?

Hi @AIChef,

You can use NumPy for low-level data manipulation, and Scikit-learn for high-level machine learning tasks with pre-built models and utilities. Scikit-learn provides implementations of popular models like linear regression, SVMs, random forests, k-means, as well as APIs for training, predicting, and evaluating models, which can save time and reduce errors. Scikit-learn internally uses NumPy, so data compatibility between the two is seamless. For automatic differentiation you can use TensorFlow or PyTorch for high-level and efficient implementations, and JAX if you want NumPy-like syntax.
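To make the low-level vs. high-level split concrete, here is a minimal sketch that fits the same straight line both ways. The toy data (y = 2x + 1) is made up for illustration, and this is just one way to do it, not the only one.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data, invented for this example: y = 2*x + 1 with no noise
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0

# Low-level route: build the design matrix and solve least squares with NumPy
A = np.hstack([X, np.ones((X.shape[0], 1))])  # append a column of ones for the bias
w_np, b_np = np.linalg.lstsq(A, y, rcond=None)[0]

# High-level route: scikit-learn does the fitting behind a fit/predict API
model = LinearRegression().fit(X, y)
w_sk, b_sk = model.coef_[0], model.intercept_

# Both routes should recover a slope close to 2 and an intercept close to 1
print("NumPy:       ", w_np, b_np)
print("scikit-learn:", w_sk, b_sk)
```

Note that the scikit-learn model accepts the NumPy array directly, which is the compatibility mentioned above.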

NumPy is a library for array and matrix math with basic statistics functions for Python. It isn’t inherently a machine learning tool.

scikit-learn is a full set of ML tools.

The MLS course primarily uses TensorFlow (another set of ML tools).

Thanks for your advice. I noticed that Course 1, Weeks 1 and 2, where I currently am, use NumPy.

  1. Does this mean I will eventually get to use TensorFlow in the MLS?
  2. How do you advise I practice building models from scratch?

  1. Yes.
  2. Complete the MLS courses first.

Thanks

Hi,
I have completed the MLS.
How do you advise I practice building models from scratch?

Congratulations on completing the Machine Learning Specialization! You can start by implementing key models like linear regression, logistic regression, and softmax regression from scratch using only NumPy, with your own versions of optimizers (SGD with momentum, Adam, etc.). As a next step, implement foundational building blocks like fully connected layers, activation functions (ReLU, sigmoid, tanh), loss functions (MSE, cross-entropy), and regularization techniques (L1, L2), and combine these to create a small neural network. You can deepen your understanding of automatic differentiation by writing a tiny differentiation engine like micrograd. After that, tackle more complex architectures like CNNs, RNNs, or Transformers with TensorFlow or PyTorch.
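As a starting point for the first step, here is a rough sketch of logistic regression written with only NumPy and trained by full-batch gradient descent with momentum. The function names, the hyperparameters (`lr`, `beta`, `epochs`), and the synthetic two-cluster data are all my own choices for illustration, so treat it as one possible layout rather than a reference implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, lr=0.1, beta=0.9, epochs=500):
    """Full-batch gradient descent with momentum on the mean cross-entropy loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    v_w = np.zeros(n_features)  # momentum buffer for the weights
    v_b = 0.0                   # momentum buffer for the bias
    for _ in range(epochs):
        p = sigmoid(X @ w + b)               # predicted probabilities
        grad_w = X.T @ (p - y) / n_samples   # gradient of the loss w.r.t. w
        grad_b = np.mean(p - y)              # gradient of the loss w.r.t. b
        v_w = beta * v_w + grad_w            # accumulate a running gradient
        v_b = beta * v_b + grad_b
        w -= lr * v_w
        b -= lr * v_b
    return w, b

def predict(X, w, b):
    return (sigmoid(X @ w + b) >= 0.5).astype(int)

# Synthetic data (an assumption for this demo): two Gaussian clusters
rng = np.random.default_rng(0)
X0 = rng.normal(loc=-2.0, scale=1.0, size=(100, 2))
X1 = rng.normal(loc=2.0, scale=1.0, size=(100, 2))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(100), np.ones(100)])

w, b = train_logistic_regression(X, y)
accuracy = np.mean(predict(X, w, b) == y)
print(f"training accuracy: {accuracy:.2f}")
```

Once this works, swapping the momentum update for your own Adam, or adding an L2 penalty to `grad_w`, is a natural next exercise.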

Thanks for the information.

Is there a platform other than Kaggle where I can get datasets and practice these algorithms?

What exactly do you mean by “from scratch”?

  • Without using a starter exercise for guidance?
    … or …
  • Without using any existing toolsets?

For free labeled datasets, try the “UCI Machine Learning Repository”.
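If you just want something to experiment with right away, scikit-learn also bundles a few small labeled datasets. A minimal sketch, assuming scikit-learn is installed, that loads its copy of the Iris dataset (which is also listed in the UCI repository):

```python
from sklearn.datasets import load_iris

# Load scikit-learn's bundled copy of the Iris dataset (no download needed)
iris = load_iris()
X, y = iris.data, iris.target  # features and integer class labels as NumPy arrays

print(X.shape, y.shape)        # number of samples/features and number of labels
print(iris.feature_names)      # names of the measured features
print(iris.target_names)       # names of the classes
```

Other datasets from the UCI site usually come as CSV files that you download and read yourself.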

I mean without using a starter exercise for guidance.

Now, what platform do you want to use to create your models?

TensorFlow

Those are the tools. You also need a platform to run them on.

Could you recommend some of the best platforms that are in high demand?

Can you define what you mean by “best”?
What are your considerations?

I meant the most widely used.

Possibilities:

  • Google Colab (notebook based)
  • Kaggle (notebook based)
  • Install your own local platform, for example VS Code (a programming IDE that supports installing other toolsets and languages).

I would like to use Kaggle.
Are these platforms commonly used at work?
Which ones do you use?

Kaggle is more of a tutorial and contest site.

Colab and VS Code, all the time.

I’m not in the machine learning business, but I use many different tools.