Using XGBoost XGBRegressor()

Hello all,

I think you may find this example of using XGBoost on a CSV dataset useful: Regression Example with XGBRegressor in Python. The dataset is on Kaggle: The Boston Housing Dataset. I have also attached my Jupyter notebook (but please remember that I am a student like you), and as Prof. Andrew Ng says at the end of this lecture: “may the forest be with you”.
boston_housing_XGBregressor.ipynb (108.4 KB)


Hello @tsvika_greener,

Thanks for sharing! I wanted to download your notebook but ran into a download error, so unfortunately I am not able to read it.

Raymond


Hello @rmwkwok,

If you want, you may download the PDF of this notebook, or try again to download the notebook itself from here.

I will take this opportunity to say that, as a data scientist wannabe, I would also like to have scikit-learn and XGBoost in my toolbox. What do you think of the FUN (France Université Numérique) MOOC on scikit-learn?

Thanks for your response 🙂

boston_housing_XGBregressor(1).pdf (264.5 KB)
boston_housing_XGBregressor.ipynb (108.4 KB)

Hello @tsvika_greener,

That’s strange. Again, I cannot download either the PDF or the notebook. Can you download them, or is it just me? Could you upload the notebook to Google Drive and share a public link here?

As for the course, I have never taken it, and I have never taken any course to learn sklearn.

If you ask me about the difference between this course and the MLS, I think Modules 3, 6 and 7 are not covered by the MLS. Of course, all the modules should come with sklearn code, which is also something the MLS doesn’t do.

If you ask me whether you can learn those skills without taking the course, I would say you can, and you can start by googling. For “Module 3: Hyperparameters tuning”, for example, I would google “Hyperparameters tuning sklearn”; the first webpage link is already an official sklearn tutorial, and the first video link is also a tutorial.

If you ask me which parts of the course are most frequently used, I would say Modules 2, 3 and 7. Mastering those skills is important.

Cheers,
Raymond

Hello @rmwkwok,

Thanks for your comprehensive answers, I really appreciate it.

Please find below links to the files that you couldn’t download from my previous messages.

boston_housing_XGBregressor.pdf - Google Drive, boston_housing_XGBregressor.ipynb - Google Drive, housing.csv - Google Drive

Hello @tsvika_greener,

The download error is a problem on my side… but I can open the files on Google Drive, thank you!

Your notebook is a good start. If you allow me one suggestion, you may want to learn more about hyperparameter tuning in order to make the most of cross_val_score. The core idea is to tune your model’s hyperparameters to achieve the best cross-validation score. At present, you are using the default values for all the hyperparameters:

Screenshot from 2022-10-13 19-50-59

With those defaults, the 5-fold cross-validation score is 0.83. Perhaps a different set of hyperparameters can achieve an even better one.
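
As a rough sketch of that idea (not taken from your notebook), the snippet below compares the default model's 5-fold score with a small grid search. Everything specific in it is an assumption for illustration: the data is synthetic from make_regression rather than housing.csv, the scoring is R², the three hyperparameters and their candidate values are arbitrary picks, and it falls back to sklearn's GradientBoostingRegressor if xgboost is not installed.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

try:
    from xgboost import XGBRegressor as Regressor
except ImportError:
    # Fallback so the sketch still runs without xgboost installed.
    from sklearn.ensemble import GradientBoostingRegressor as Regressor

# Synthetic stand-in data (assumption); replace with the features and
# target column loaded from housing.csv.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Fixed folds so the baseline and the search are scored on the same splits.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Baseline: default hyperparameters, mean 5-fold R^2.
baseline = cross_val_score(Regressor(random_state=0), X, y, cv=cv, scoring="r2").mean()

# Illustrative grid (assumption). It contains the default values of both
# regressors, so the best grid score cannot fall below the baseline.
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [2, 3, 6],
    "learning_rate": [0.05, 0.1, 0.3],
}

# Try every combination and keep the one with the best mean CV score.
search = GridSearchCV(Regressor(random_state=0), param_grid, cv=cv, scoring="r2")
search.fit(X, y)

print(f"default 5-fold R^2: {baseline:.3f}")
print(f"best 5-fold R^2:    {search.best_score_:.3f}")
print(f"best params:        {search.best_params_}")
```

A grid search is only one option; for larger grids, sklearn's RandomizedSearchCV samples combinations instead of trying them all. Keep in mind that the best cross-validation score is slightly optimistic, so a held-out test set is still useful for the final check.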

Great work @tsvika_greener, and keep trying!

Raymond


@rmwkwok thanks for your helpful comments. I will tune the model’s hyperparameters to get a better cross-validation score.

Thanks again for your helpful comments 🙂
