Is the implementation of xgboost as easy as it is in the lecture?

hi everybody,
Andrew argues that xgboost is the best possible model for structured data and goes on to say that people have won Kaggle competitions using xgboost.

Then he shows the implementation of xgboost as the following:


just 4 lines of code ? :smiley:
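
Roughly, the lines he shows look something like this (my own reconstruction using xgboost's scikit-learn-style API; `X_train`, `y_train`, `X_test` are placeholder names, not taken from the slide):

```python
from xgboost import XGBClassifier

# Create the model with default settings
model = XGBClassifier()

# Fit on the training data, then predict on new data
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
```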
and… you are the winner in a Kaggle competition…?
he also argues that the library figures out how to regularize the model against overfitting.
so should I just go to Kaggle and write these 4 lines of code? :slight_smile:

isn’t there anything else to consider, as there was for linear regression or neural networks?

Mehmet

The trick to Kaggle competitions is in trying a zillion different combinations of layers and methods until you find the one that gets the highest rankings on that specific task.

Those who have experience can pare down the number of things they try into more likely areas of success.

It’s extremely unlikely you would win a Kaggle competition by using just one instance of any specific method.

hi tmosh,
thank you for your response. Actually my question is not about ‘how to win a competition in Kaggle’.
sorry for the confusion. Definitely my bad!

I am rather asking what issues there are to consider when building an xgboost model.
Andrew’s lectures, I believe, did not give any hint on that matter, as opposed to his discussion of other models. More concretely, do we really not need to regularize it, as Andrew implies? Don’t we check for bias and variance? More or fewer features in the model? Maybe even polynomial features?

I just presume that if xgboost is powerful enough to bring you a Kaggle win, there has to be more to it than the 4 lines of code that appeared in Andrew’s lecture.

warmly, Mehmet

Perhaps someone else with more xgboost experience would like to reply.


Hi @mehmet_baki_deniz yes, Xgboost has several parameters that can be tuned to improve the model’s performance: the learning rate, the number of trees, and a few others. However, most of the improvement usually comes from feature engineering and improving the quality of the data.

So, I would say the other side of the picture is the quality of the data that you use for your model and some hyperparameter tuning.
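
For example, here is a rough sketch of what that tuning might look like with scikit-learn's GridSearchCV (the parameter values are only illustrative, and `X_train`/`y_train` are placeholders):

```python
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV

# Candidate values for a few of the most influential hyperparameters
param_grid = {
    "n_estimators": [100, 300, 500],    # number of trees
    "learning_rate": [0.01, 0.1, 0.3],  # step-size shrinkage
    "max_depth": [3, 6, 9],             # tree depth controls model complexity
}

# Cross-validated search over the grid
search = GridSearchCV(XGBClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print(search.best_params_)
print(search.best_score_)
```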
