Data preprocessing

I recently finished Course 1 of the ML Specialization, so I am doing a logistic regression project on the Titanic survival dataset.
The data has two characteristics that I am unsure about:
First, some columns of the dataset contain strings, like the cabin number or ticket number.
Second, some columns of the dataset contain NaN values.

I don’t know how to deal with this dataset to train my model.

Hi @mahdi_khoshmaramzade, I think this dataset is available on Kaggle for a “GettingStarted Prediction Competition”. Why don’t you check out the notebooks shared by others, then research, understand, and experiment with their approaches first? That way you can ask a more specific question about a specific technique you have tried and share your experiment results with us. I think that is a better way to go here.
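
As a rough starting point for the two issues raised in the question (string columns and NaN values), here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the rows are made up rather than taken from the Kaggle files, the helper names `is_missing` and `preprocess` are hypothetical, and the specific choices (median for a missing age, most frequent value for a missing port, a 0/1 flag for whether a cabin is recorded, dropping the ticket string) are just one common option, not the only reasonable one.

```python
import math
from statistics import median

# Made-up rows that mimic a few Titanic-style columns (not real data).
rows = [
    {"Age": 22.0, "Sex": "male", "Embarked": "S", "Cabin": None, "Ticket": "A/5 21171"},
    {"Age": None, "Sex": "female", "Embarked": "C", "Cabin": "C85", "Ticket": "PC 17599"},
    {"Age": 30.0, "Sex": "female", "Embarked": None, "Cabin": None, "Ticket": "113803"},
    {"Age": 40.0, "Sex": "male", "Embarked": "S", "Cabin": "E46", "Ticket": "373450"},
]


def is_missing(value):
    """Treat both None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def preprocess(rows):
    """Turn raw rows into purely numeric feature dictionaries."""
    # Statistics are computed from the non-missing values only.
    ages = [r["Age"] for r in rows if not is_missing(r["Age"])]
    age_median = median(ages)
    ports = [r["Embarked"] for r in rows if not is_missing(r["Embarked"])]
    port_mode = max(set(ports), key=ports.count)
    port_values = sorted(set(ports))

    features = []
    for r in rows:
        # Fill missing numbers with the median, missing categories with the mode.
        age = age_median if is_missing(r["Age"]) else r["Age"]
        port = port_mode if is_missing(r["Embarked"]) else r["Embarked"]
        row = {
            "Age": age,
            # A two-valued string column becomes a single 0/1 feature.
            "Sex_female": 1 if r["Sex"] == "female" else 0,
            # Cabin is mostly missing here, so keep only "was it recorded?".
            "HasCabin": 0 if is_missing(r["Cabin"]) else 1,
        }
        # One-hot encode the port: one 0/1 column per distinct value.
        for p in port_values:
            row[f"Embarked_{p}"] = 1 if port == p else 0
        # Ticket is a nearly unique string per row, so it is dropped (an assumption).
        features.append(row)
    return features


features = preprocess(rows)
for f in features:
    print(f)
```

The same steps are often written with pandas (`fillna` and `get_dummies`) in practice. Either way, the fill values and category lists should come from the training data only and then be reused unchanged on the test data.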


Hi @rmwkwok
Yes, it is on Kaggle as a project named Titanic survival.
Yeah, I can do that. It is my first time using Kaggle, so I didn’t know that I could check others’ code.

You are welcome, and I look forward to discussing what you have tried and your thoughts about it!