I recently completed the first two courses in the Machine Learning Specialization on Coursera and have been trying to apply what I’ve learned by participating in beginner-level competitions on Kaggle and exploring some DrivenData challenges.
However, I noticed that I’m struggling when it comes to effectively cleaning the data and drawing meaningful insights. Specifically, I find it difficult to decide which features to include or exclude based on their relevance to the target variable — something I’ve seen other participants do quite well.
I’m wondering if it would be beneficial to first take a course on Exploratory Data Analysis (EDA) or a similar topic to strengthen my understanding. If so, I’d be grateful if you could recommend any good resources.
On the other hand, it’s also possible that I may have chosen datasets that aren’t very beginner-friendly. If that’s the case, I’d really appreciate any suggestions for simpler datasets that are well-suited for someone at my level.
Hi @Karthik14, this is a great question, and I relate to your experience. Kaggle is an intimidating place, and many times, I didn’t understand the code or the solutions. Keep in mind that this code (at least the top solutions) is written by the top percent of data scientists and machine learning engineers, so it might be difficult to understand.
What I did was learn more Python; a weak grasp of Python often leads to a poor understanding of the code and the solution. I also concluded that a broad understanding of every data field is what makes the difference. For instance, a medical student studies everything from obstetrics to traumatology: whatever you want to specialize in, you first need to learn the broad aspects of medicine. The same applies here. Understanding data analytics, data engineering, data science, and machine learning engineering is what will make you stand out, no matter which specialized path you choose.
Currently, there is a specialization available for each of the fields I just mentioned. I haven’t taken the data analytics specialization, so I can’t comment on it.
I took the data analytics career path on Dataquest, and it was really useful for its incremental learning approach and practical exercises.
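To make the feature-relevance question a bit more concrete, here is a minimal sketch of one common first-pass check: ranking numeric features by their Pearson correlation with the target. The column names and values are made up for illustration, and the helper names `pearson` and `rank_features` are hypothetical. Correlation only captures linear relationships, so treat the ranking as a starting point for EDA rather than a final feature selection.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (var_x * var_y) ** 0.5


def rank_features(columns, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up toy data (assumption): a few candidate features and a price-like target.
columns = {
    "rooms": [2, 3, 3, 4, 5, 6],
    "age_years": [40, 35, 30, 20, 10, 5],
    "noise": [3, 1, 4, 1, 5, 2],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}
target = [100, 150, 160, 210, 260, 300]

ranking = rank_features(columns, target)
for name, score in ranking:
    print(f"{name:>14}: {score:+.2f}")
```

On real competition data you would typically do the same thing with a dataframe library, and also look at plots and missing values before dropping anything; a feature with low linear correlation can still be useful to a non-linear model.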
@Karthik14 Since you’re looking to improve your Exploratory Data Analysis (EDA) and feature selection skills, here are some great courses to check out:
I really appreciate your suggestions and will definitely work on gaining knowledge in the areas you’ve highlighted. Also, thanks for recommending Dataquest. It looks like a fantastic resource, and I’m excited to give it a try.