Seek guidance on using new datasets

course 1: Could anyone suggest new datasets I could work on after completing the first course?

Hi @computer_en,

You can try some datasets from the UCI Machine Learning Repository. For example, the Heart Disease dataset or the Energy Efficiency dataset would be good for practicing the concepts from Course 1.


Congratulations on completing Course 1 of the Machine Learning Specialization! Here are a few datasets you might consider:

Regression Datasets:

  1. California Housing Prices

    • Available in scikit-learn (sklearn.datasets.fetch_california_housing).
    • Predict median house values for California districts from features like population, median income, and housing characteristics.
  2. Kaggle: House Prices - Advanced Regression Techniques

    • Link
    • A classic problem for regression with diverse features including categorical data.
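If it helps to get started, here is a minimal sketch of a linear regression baseline you could run on any of the regression datasets above. It assumes scikit-learn and NumPy are installed; the helper name baseline_regression and the 80/20 split are my own choices for illustration, not something from the course.

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def baseline_regression(X, y, seed=0):
    """Fit scaled linear regression and return R^2 on a held-out test set.

    baseline_regression is a hypothetical helper name used for this sketch.
    """
    # Hold out 20% of the rows so the score reflects unseen data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # Standardize features, then fit ordinary least squares.
    model = make_pipeline(StandardScaler(), LinearRegression())
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)  # R^2 on the test split


# Example with California Housing (downloads the data on first use):
# from sklearn.datasets import fetch_california_housing
# X, y = fetch_california_housing(return_X_y=True)
# print(baseline_regression(X, y))
```

The California Housing lines are left commented out because the first call downloads the data. The Kaggle House Prices data has categorical columns, so it would need encoding before it could be passed to a helper like this.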

Classification Datasets:

  1. Iris Dataset

    • A small, well-labeled dataset for multi-class classification (sklearn.datasets.load_iris).
  2. Titanic Dataset

    • Link
    • Predict whether a passenger survived based on passenger features.
  3. MNIST Handwritten Digits

    • Available through TensorFlow (tf.keras.datasets.mnist) or via scikit-learn (sklearn.datasets.fetch_openml("mnist_784")).
    • Good for practicing classification on image data.
  4. Breast Cancer Wisconsin Dataset

    • Link
    • Predict whether a tumor is malignant or benign based on features like size and texture.
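For the classification datasets, a quick baseline could look like the sketch below. It uses the Iris data because it ships with scikit-learn and needs no download. The helper name baseline_classifier, the 75/25 stratified split, and max_iter=1000 are assumptions I picked for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def baseline_classifier(X, y, seed=0):
    """Fit scaled logistic regression and return accuracy on a held-out set.

    baseline_classifier is a hypothetical helper name used for this sketch.
    """
    # Stratify so each class keeps the same proportion in train and test.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    # Standardize features, then fit logistic regression.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)  # mean accuracy on the test split


X, y = load_iris(return_X_y=True)
accuracy = baseline_classifier(X, y)
print(f"Test accuracy: {accuracy:.2f}")
```

The same helper should work unchanged on the Breast Cancer Wisconsin data, which scikit-learn also bundles as sklearn.datasets.load_breast_cancer.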

Mixed (Regression and Classification):

  1. Kaggle: Bike Sharing Demand

    • Link
    • Predict the number of bike rentals (regression) or classify high/low demand periods (classification).
  2. Airbnb Listing Prices

    • Many variations are available on Kaggle.
    • Predict rental prices or classify listings into different price tiers.
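To reuse a regression dataset for classification, you can derive labels from the numeric target. The sketch below shows one way to do that with NumPy: a median split for high/low demand and quantile cut points for price tiers. The function names and the sample numbers are made up for illustration, and the median and quantile thresholds are just one reasonable choice.

```python
import numpy as np


def to_high_low(counts):
    """Label each value 1 if it is above the median of counts, else 0."""
    counts = np.asarray(counts, dtype=float)
    return (counts > np.median(counts)).astype(int)


def to_tiers(prices, n_tiers=3):
    """Assign each price to a quantile-based tier (0 is the cheapest tier)."""
    prices = np.asarray(prices, dtype=float)
    # Interior cut points, e.g. the 1/3 and 2/3 quantiles when n_tiers is 3.
    cuts = np.quantile(prices, np.linspace(0, 1, n_tiers + 1)[1:-1])
    # np.digitize returns the index of the bin each price falls into.
    return np.digitize(prices, cuts)


# Made-up values standing in for rental counts and listing prices.
rentals = [12, 40, 95, 150, 310, 8]
prices = [45, 60, 80, 120, 200, 350]

print(to_high_low(rentals))
print(to_tiers(prices))
```

If you do this on a real dataset, compute the thresholds on the training split only and reuse them on the test split, so the labels do not leak information from the test data.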