I need advice on which Python module to use for machine learning: NumPy or scikit-learn
Hi @AIChef,
You can use NumPy for low-level data manipulation, and Scikit-learn for high-level machine learning tasks with pre-built models and utilities. Scikit-learn provides implementations of popular models like linear regression, SVMs, random forests, k-means, as well as APIs for training, predicting, and evaluating models, which can save time and reduce errors. Scikit-learn internally uses NumPy, so data compatibility between the two is seamless. For automatic differentiation you can use TensorFlow or PyTorch for high-level and efficient implementations, and JAX if you want NumPy-like syntax.
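To make the "low-level" side concrete, here is a minimal sketch of fitting a straight line with nothing but NumPy. The toy data and variable names are made up for illustration. In scikit-learn, the same steps are wrapped by `LinearRegression` with its `fit` and `predict` methods, so you would not build the design matrix or call the solver yourself.

```python
import numpy as np

# Toy data (invented for this example): y = 2x + 1 exactly.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Low-level step 1: append a column of ones so the model has an intercept.
A = np.hstack([X, np.ones((X.shape[0], 1))])

# Low-level step 2: solve the least-squares problem A @ coef ≈ y.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coef

# Low-level step 3: predict and evaluate by hand.
pred = A @ coef
mse = np.mean((pred - y) ** 2)

print(f"slope={slope:.3f}, intercept={intercept:.3f}, mse={mse:.2e}")
```

With scikit-learn you get the fitting, prediction, and scoring as ready-made methods; with NumPy you write each of them yourself, which is more work but shows what is going on underneath.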
NumPy is a basic array, matrix, and statistics library for Python. It isn’t inherently a machine learning tool.
scikit-learn is a full set of ML tools.
The MLS course primarily uses TensorFlow (another set of ML tools).
Thanks for your advice. I noticed that Course 1, Weeks 1 and 2, where I currently am, uses NumPy.
- Does this mean I will eventually get to use TensorFlow in the MLS?
- How do you advise I practice building models from scratch?
- Yes.
- Complete the MLS courses first.
Thanks
Hi,
I have completed the MLS.
How do you advise I practice building models from scratch?
Congratulations on completing the Machine Learning Specialization! You can start by implementing key models like linear regression, logistic regression, and softmax regression from scratch using only NumPy, with your own versions of optimizers (like SGD with momentum, Adam, etc.). As a next step, implement foundational building blocks like fully connected layers, activation functions (ReLU, sigmoid, tanh), loss functions (MSE, cross-entropy), and regularization techniques (L1, L2), and combine these to create a small neural network. You can challenge yourself to deepen your understanding of automatic differentiation by writing a tiny differentiation engine like micrograd. After that, tackle more complex architectures like CNNs, RNNs, or Transformers with TensorFlow or PyTorch.
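As a starting point for the first step, here is a minimal sketch of linear regression trained with gradient descent plus momentum, using only NumPy. The function name `fit_linear_regression`, the hyperparameter values, and the synthetic data are assumptions chosen for illustration, not a reference implementation. It uses full-batch gradients for simplicity; switching to mini-batches would make it SGD.

```python
import numpy as np

def fit_linear_regression(X, y, lr=0.1, momentum=0.9, epochs=500):
    """Fit y ≈ X @ w + b by minimizing MSE with gradient descent + momentum."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    v_w = np.zeros(n_features)  # velocity (running update) for the weights
    v_b = 0.0                   # velocity for the bias
    for _ in range(epochs):
        error = X @ w + b - y                      # residuals, shape (n_samples,)
        grad_w = 2.0 * (X.T @ error) / n_samples   # d(MSE)/dw
        grad_b = 2.0 * error.mean()                # d(MSE)/db
        v_w = momentum * v_w - lr * grad_w         # momentum update
        v_b = momentum * v_b - lr * grad_b
        w = w + v_w
        b = b + v_b
    return w, b

# Synthetic data with known parameters, so the result can be checked.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w, true_b = np.array([3.0, -2.0]), 0.5
y = X @ true_w + true_b + 0.01 * rng.normal(size=200)

w, b = fit_linear_regression(X, y)
print("learned w:", w, "learned b:", b)
```

Once this works, a natural exercise is to swap the MSE gradient for the logistic regression gradient, or replace the momentum update with Adam, and compare how quickly each converges.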
Thanks for the information.
Is there a platform other than Kaggle where I can get datasets and practice these algorithms?
What exactly do you mean by “from scratch”?
- Without using a starter exercise for guidance?
… or … - Without using any existing toolsets?
For free labeled datasets, try the “UCI Machine Learning Repository”.
I mean without using a starter exercise for guidance.
Now, what platform do you want to use to create your models?
TensorFlow
Those are the tools. You need to host them on some platform.
Could you recommend some of the best platforms that are in high demand?
Can you define what you mean by “best”?
What are your considerations?
I meant the most widely used.
Possibilities:
- Google Colab (notebook based)
- Kaggle (notebook based)
- Install your own local platform, for example VS Code (a programming IDE that supports installing other toolsets and languages).
I would like to use Kaggle.
Are these platforms commonly used at work?
Which ones do you use?
Kaggle is more of a tutorial and contest site.
Colab and VS Code, all the time.
I’m not in the machine learning business, but I use many different tools.