Hi, I’m currently pursuing a Bachelor’s in Data Science and I need to choose a project idea for my final year. I’ve been trying to come up with something new but it seems like whatever I think of, someone has already done it. Could anyone suggest some unique and impactful project ideas?
ML is becoming a very well-explored field. Projects that “haven’t been done before” and are still within the scope of a bachelor’s degree are difficult to find.
Hi @Alizaib,
I’m a new user here. I have a Bachelor’s degree in Economics and I’m currently pursuing a Master’s in Data Science.
Although I can’t provide you with a novel idea, because you have to find it by yourself, I can explain what I did for my Bachelor’s thesis.
First, you have to manage your expectations: it’s incredibly rare for a student at Bachelor’s level to advance the state of the art of a particular field. Second, have you already decided what type of project it is going to be (theoretical, practical or a literature review)?
In my case, I did the following:
- Read a lot of articles in newspapers and blogs.
- Choose the programming language (R).
- Search for data. I started with aggregated country-level data on Eurostat and gradually narrowed down to the municipal level by exploring other sources. As a general reference, take a look at the Awesome Public Datasets repository.
- Search for relatively new papers (published not more than 10 years ago) on Google Scholar and verify that there’s a package for the one that I’ve selected. I didn’t have enough programming experience to implement a solution from scratch (explore Papers with Code if you haven’t already).
- Apply the paper and the package to a specific research question: does income vary spatially? If yes, then how?
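To make the last step a bit more concrete: one common first check for "does income vary spatially?" is global Moran's I, which measures whether neighbouring areas have similar values. This is only a minimal sketch in Python with NumPy, not the method or the R package used in the thesis above (those aren't named here). The function name `morans_i`, the neighbour matrix `W`, and the income figures are all made up for illustration.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights[i, j] > 0 means locations i and j are neighbours;
    the diagonal is expected to be zero.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    z = x - x.mean()                   # deviations from the mean
    num = (w * np.outer(z, z)).sum()   # sum over i, j of w_ij * z_i * z_j
    den = (z ** 2).sum()               # total squared deviation
    return (n / w.sum()) * num / den


# Hypothetical toy data: 4 municipalities in a row, each bordering the next.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
income_clustered = [10, 11, 20, 21]    # similar incomes sit next to each other
income_alternating = [10, 20, 10, 20]  # neighbours always differ

print(morans_i(income_clustered, W))    # positive: spatial clustering
print(morans_i(income_alternating, W))  # negative: neighbours are dissimilar
```

Values near zero suggest no spatial pattern. With real data you would build `W` from the actual municipality borders and also test significance, which this sketch leaves out.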
As you can see, this was a practical project, and I knew back then that nobody else had applied that exact method to the country in question. That’s why I received a high distinction. It was an iterative process where I had to change the data three times (it took me 3-4 months to process it).
To sum up, I got inspiration by reading a lot and by thinking about how I can solve a particular question.
Hello, @vgrz, thank you for sharing your experience with us. I have bookmarked your post so that I can refer future learners with a similar question to it.
By the way, welcome to our community!
Cheers,
Raymond
How about trying to predict sporting event outcomes, like the result of NHL matches? This will be interesting because your model probably won’t be that much better than using average scores alone, but it will teach you a lot about working with difficult data sets. A bit of courage might give you an interesting project!
Predicting the results of sports matches is extremely difficult, because 1) there is an element of chance in any live event, and 2) many factors are very difficult to include in a training set: roster changes due to injury, illness, a bad mood, or a contract dispute; home/away venues; wet grass; dry weather; a full moon; the influence of the current league standings (e.g. a team benches its starters because it has already clinched a playoff spot); and a lot of other intangibles.
Models of sports leagues tend to be useful only in retrospect (i.e. they’re commonly used for team rankings for playoff seedings), because they have little predictive power.
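If you do try this, it helps to score the "average scores alone" baseline first, so you know what your model has to beat. Here is a rough sketch in plain Python. The field names (`home_avg`, `away_avg`, `winner`) and the four games are invented for illustration and are not real NHL data.

```python
def accuracy(predictions, outcomes):
    """Fraction of games where the predicted winner matches the actual winner."""
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)


def average_score_baseline(games):
    """Pick whichever team has the higher average goals per game so far.

    Ties go to the home team.
    """
    return ["home" if g["home_avg"] >= g["away_avg"] else "away" for g in games]


# Made-up example rows: each team's average goals per game before the match,
# plus who actually won.
games = [
    {"home_avg": 3.1, "away_avg": 2.7, "winner": "home"},
    {"home_avg": 2.4, "away_avg": 3.0, "winner": "home"},
    {"home_avg": 2.9, "away_avg": 2.2, "winner": "home"},
    {"home_avg": 2.5, "away_avg": 3.3, "winner": "away"},
]

outcomes = [g["winner"] for g in games]
baseline_acc = accuracy(average_score_baseline(games), outcomes)
print(f"baseline accuracy: {baseline_acc:.2f}")
```

Run your own model's predictions through the same `accuracy` function on held-out games. If the gap to the baseline is small, that is the point made above showing up in your numbers.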
Physics-informed neural networks (PINNs) are super cool. You could use Gen AI to create some synthetic data and collect real-world data of a ball dropping. Measure the force of the ball dropping and then use a PINN or an RNN to predict the actual results.
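To show the core idea without a deep learning framework, here is a small sketch of a physics-informed fit for a dropped ball. It is not a real PINN: a polynomial stands in for the neural network so the whole thing can be solved with NumPy least squares. The loss has the same two parts a PINN would use, a data term and a physics term that penalises deviations from y''(t) = -g. Air resistance is ignored, and the function name `fit_drop`, the weight `lam`, and the simulated measurements are assumptions for illustration.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2


def fit_drop(t_obs, y_obs, t_phys, degree=4, lam=10.0):
    """Fit height y(t) as a polynomial using data plus a physics penalty.

    Minimises  sum (y(t_obs) - y_obs)^2  +  lam * sum (y''(t_phys) + G)^2.
    Returns coefficients c with y(t) = c[0] + c[1]*t + c[2]*t^2 + ...
    """
    t_obs = np.asarray(t_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    t_phys = np.asarray(t_phys, dtype=float)

    # Data rows: the polynomial evaluated at the measurement times.
    A = np.vander(t_obs, degree + 1, increasing=True)

    # Physics rows: second derivative of the polynomial at collocation points.
    D = np.zeros((t_phys.size, degree + 1))
    for k in range(2, degree + 1):
        D[:, k] = k * (k - 1) * t_phys ** (k - 2)

    # Stack both terms into one linear least-squares problem.
    w = np.sqrt(lam)
    M = np.vstack([A, w * D])
    rhs = np.concatenate([y_obs, np.full(t_phys.size, -w * G)])
    coeffs, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return coeffs


# Simulated (not measured) drop from 2 m, released at rest.
t_obs = np.linspace(0.0, 0.6, 5)
y_obs = 2.0 - 0.5 * G * t_obs ** 2
t_phys = np.linspace(0.0, 0.6, 20)  # points where the physics is enforced

coeffs = fit_drop(t_obs, y_obs, t_phys)
print(coeffs)
```

In a real PINN you would replace the polynomial with a network, get y''(t) through automatic differentiation, and minimise the same kind of combined loss with gradient descent. Swapping the simulated `y_obs` for your own measurements is where it gets interesting.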