Offering project ideas for collaborations

Amir_Pasagic · February 25, 2026, 9:24pm

Hi there, not a very unique post, I am sure, since many of us are beginners trying to build a portfolio and extend our knowledge in both breadth and depth.

I have personally started (or got involved in) a number of mini-projects from different ML domains (all of which are still WIP, because life is full of distractions), and I would be happy to collaborate on some of them, hear ideas for alternative approaches, learn from each other, etc.

I’d aim for people who are beginners like me, but with some solid foundations and intuition for basic ML concepts and who also have some projects behind them, so we can motivate and learn from each other, but I am also happy to help out with more basic problems if I am able to.

This is a very long read, so here is a TL;DR ChatGPT summary for those with insufficient time or attention span:

Reinforcement Learning repo (Gymnasium-based): Implementing core RL algorithms (e.g., Deep Q, A2C, REINFORCE) from scratch on discrete and continuous environments, including training Deep Q to play Breakout, with documented debugging insights and plans to expand toward more advanced environments.

Unsupervised Learning on EEG data: Pretraining models on large-scale unlabeled 128-channel EEG time-series using contrastive learning, followed by supervised fine-tuning to predict response times; plans to extend with VAE, masked reconstruction, and latent space visualization (UMAP, t-SNE).

Audio Classification & MIDI Transcription: Detecting drum hits (kick, snare, hi-hats) and transcribing audio into MIDI using STFT and temporal CNNs; future extension to pitch detection, instrument classification, and full melodic transcription—motivated by music production in Ableton Live.

RAG-based Job Matching Pipeline: Building a lightweight RAG system using LLMs and Chroma to scrape job listings, embed and store them in a vector DB, and rank opportunities based on CV similarity; currently functional for one ML job site with plans to expand via APIs.

Anomaly Detection for Predictive Maintenance: Exploring anomaly detection on aerospace flight-phase data and experimental IoT data (e.g., IMU sensor on espresso machine) for malfunction detection and predictive maintenance use cases.

Computer Vision for Posture Evaluation: Developing a CV-based system to analyze exercise/kickboxing posture from video and detect common mistakes in movements like kicks and jabs.

If you wish to know more …

Existing GitHub repos:

Below is a list of topics I cover in my GitHub, as well as some ideas for future projects. Perhaps you will find some of it interesting:

Reinforcement learning repo: intended to contain different reinforcement learning algorithms, mainly trained on Gymnasium-type environments.

Currently quite limited, but I plan to expand it and would be happy to learn RL together.
The idea is to write basic RL algorithms such as DeepQ, A2C, REINFORCE, etc. on a number of discrete and continuous environments and play around with parameters to gain better insight into how things work, for those who are just starting, before moving on to more ready-made solutions such as Isaac Gym/Lab.

One cool entry project was getting the DeepQ algorithm to play the classic Atari game Breakout. I ran into some issues along the way and documented the debugging process (done with the help of the users here) in a Medium article. I hope it is insightful for others.
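
For anyone new to the algorithms named above, here is a minimal sketch of three small building blocks: the return-to-go used by REINFORCE, the one-step TD target used by Deep Q, and epsilon-greedy action selection. The function names and the default `gamma=0.99` are illustrative assumptions, not code from the repo, and plain Python lists stand in for the tensors a real implementation would use.

```python
import random


def discounted_returns(rewards, gamma=0.99):
    """Return-to-go G_t = r_t + gamma * G_{t+1}, as used by REINFORCE."""
    returns = []
    g = 0.0
    # Walk the episode backwards so each step reuses the next step's return.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def dqn_target(reward, next_q_values, done, gamma=0.99):
    """One-step TD target: r + gamma * max_a' Q(s', a'), no bootstrap at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In a Gymnasium loop, `epsilon_greedy` would choose the action passed to `env.step`, and `dqn_target` would supply the regression target for the Q-network on each sampled transition.
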
Unsupervised learning on EEG data. I recently entered a competition in which the task was to use a vast amount of unlabeled EEG data to pretrain a model with unsupervised learning, so that it extracts meaningful features from the 128-channel time-series data (I personally went for a contrastive learning approach).

The model is then trained using supervised learning on a much scarcer labeled dataset to predict the response time of subjects performing some detection and reaction tasks.

The goal of the first part was to show that unsupervised pre-training helps the model abstract some basic features from the signals and learn that similar signals belong closer together in the representation space.

Results indicate that this was achieved; however, I want to further improve pre-training with a VAE (variational autoencoder), masked reconstruction, etc. I would also like to make some cool visualizations using UMAP and t-SNE to show how the latent space changes during pre-training. It's a great project for learning more about unsupervised learning.
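
To make "similar signals belong closer together" concrete, here is a minimal sketch of an InfoNCE-style contrastive loss. The post does not say which contrastive loss was used, so InfoNCE, the `temperature=0.1` default, and the function names are all assumptions for illustration. Embeddings are plain lists of floats and are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] and positives[i] are embeddings of two views of the same
    signal; every other positives[j] acts as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_sum - logits[i]
    return total / len(anchors)
```

The loss is small when each anchor is most similar to its own positive and large when it is more similar to another sample's view, which is the property a latent-space plot with UMAP or t-SNE would be expected to reflect.
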
Audio classification and transcription into MIDI.

I am currently writing a README file for this one (hopefully ready in a day or two), as it's the project I am mainly focused on atm.

The idea is to analyse the STFT (short sliding windows of frequency content in an audio stream), detect individual drum hits (e.g. kick, snare, hi-hats) and automatically transform them into MIDI.

The second part would be to recognize individual notes (and perhaps even classify instruments) and transcribe them into MIDI as notes of a certain pitch, e.g. piano track: C, D, E…

It's an interesting multi-label classification problem on time series, and it uses a lot of interesting concepts, such as temporal CNNs, data augmentation, etc. There are also many different approaches one can try here, and the drum detection is slowly shaping up to give some sensible results.
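
As a rough illustration of the STFT idea, here is a minimal sketch that frames a signal, computes magnitude spectra, and derives spectral flux, a classic hand-crafted onset cue. This is a simple baseline, not the temporal-CNN approach described above. The frame size, hop length, and function names are illustrative assumptions, and the naive DFT is only suitable for tiny examples.

```python
import cmath
import math


def stft_magnitudes(signal, frame_size=64, hop=32):
    """Magnitude spectra of Hann-windowed frames (naive DFT, small inputs only)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_size)]
        mags = []
        # Keep only the non-negative frequency bins of the real signal.
        for k in range(frame_size // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def spectral_flux(frames):
    """Sum of positive magnitude increases between consecutive frames."""
    flux = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


# Toy input: silence, a short sine burst starting at sample 200, then silence.
burst = [math.sin(2 * math.pi * 8 * n / 64) for n in range(64)]
signal = [0.0] * 200 + burst + [0.0] * 120
flux = spectral_flux(stft_magnitudes(signal))
onset_frame = flux.index(max(flux))
```

The flux stays at zero over the silent frames and jumps once frames start overlapping the burst. A learned model would replace the flux heuristic with a classifier over the same time-frequency frames.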

The motivation for this project is a bit silly: I sometimes write music in Ableton, and when I have an idea on the go, I would like to just hum, whistle or beatbox it into my phone and then have a draft of the idea as a MIDI file I can import directly into the software.
I worked a bit on RAG and, more generally, on using LLMs to process data.
More specifically, I made a basic RAG-like pipeline that fetches job data from various sources (e.g. job sites), extracts basic information using lightweight language models, and stores the data in a vector DB using Chroma, where it can easily be queried by similarity and then further processed against my CV.
This way it categorizes jobs from multiple sources based on how well they fit my CV and preferences. As with everything else, it is still very much WIP: the main pipeline is in place and processes data from one ML jobs website, but I wish to migrate to APIs that cover larger job markets.
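
The similarity-ranking step can be sketched without any vector DB. The example below ranks jobs by cosine similarity between a CV embedding and job embeddings. It uses hand-made 3-dimensional vectors and hypothetical job titles purely for illustration; in the pipeline described above, an embedding model would produce the vectors and Chroma would run the query.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_jobs(cv_embedding, job_embeddings, top_k=3):
    """Return up to top_k (title, score) pairs, best match first."""
    scored = [(title, cosine_similarity(cv_embedding, vec))
              for title, vec in job_embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (assumed values, not produced by a real model).
jobs = {
    "ML engineer": [0.9, 0.1, 0.0],
    "Chef": [0.0, 0.1, 0.9],
    "Data analyst": [0.6, 0.4, 0.1],
}
cv = [1.0, 0.2, 0.0]
ranking = rank_jobs(cv, jobs)
```

Here `ranking` lists "ML engineer" first and "Chef" last, since the CV vector points in nearly the same direction as the first job's vector.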

Some further ideas I didn’t yet manage to get into:

Anomaly detection for predictive maintenance/action - I work in a job where we collect data from flying things, so I have a lot of data from various flight phases available, on which I would like to test some anomaly detection.

I also hooked up a basic IMU sensor to a microcontroller to record some data from my espresso machine to check for malfunctions or detect time for refill, but never got around to it. (was mostly excited to work with micro-controllers again)
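
A very simple starting point for either dataset is a rolling z-score detector: flag a sample when it deviates from the recent window by more than a few standard deviations. The sketch below is an illustrative baseline only. The window length, the threshold of 3, and the synthetic sine signal are assumptions, not anything taken from the flight or IMU data.

```python
import math
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=20, threshold=3.0):
    """Indices whose value is more than `threshold` sigmas from the trailing window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A flat window gives no scale to compare against, so skip it.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Synthetic sensor trace: a smooth oscillation with one injected spike.
trace = [math.sin(i / 3) for i in range(100)]
trace[60] += 5.0
flagged = rolling_zscore_anomalies(trace)
```

On this trace only the injected spike at index 60 is flagged. Real sensor data with drift or regime changes between flight phases would likely need per-phase statistics or a learned model instead.
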
Posture evaluation - I used some of the existing CV libraries to analyze posture during exercise. As I do kickboxing, I was thinking about making an app that points out common mistakes I make when practicing e.g. kicks and jabs, since they are quite easy to detect in video.
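
One common way to turn pose keypoints into something checkable is to compute joint angles. Here is a minimal sketch that assumes a pose-estimation library has already produced 2D `(x, y)` keypoints. The keypoint coordinates and the `joint_angle` helper are hypothetical and not tied to any specific CV library.

```python
import math


def joint_angle(a, b, c):
    """Angle at point b, in degrees, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 on |cross| and dot is stable for nearly straight or folded joints.
    return math.degrees(math.atan2(abs(cross), dot))


# Hypothetical hip, knee and ankle keypoints from one video frame.
hip, knee, ankle = (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)
knee_angle = joint_angle(hip, knee, ankle)
```

A rule such as "the knee angle at full extension of a kick should exceed some threshold" could then be checked per frame, with the threshold chosen from reference footage.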

A little bit more about me

If you made it this far, here are a few more notes about me:

I come from a control systems background and have worked mostly in aerospace and aviation, but lately decided to move more in the ML direction. This makes me a bit biased towards system identification and time-series analysis, tho my interests go well beyond that. The control systems background gave me a solid foundation in linear algebra, statistics, calculus and other math relevant for understanding ML concepts.

Thanks for reading thru this behemoth of text and hope to hear your project ideas.

Yasmeen_Asaad_Azazi · March 12, 2026, 6:54pm

Hello Amir,

I really resonated with your post. I believe that encouraging each other, collaborating on different projects, and exchanging ideas is one of the best ways to stay consistent and grow faster in ML.

Let me quickly introduce myself: I’m Yasmeen Asaad. I come from a CS background and currently work as a Machine Learning Engineer at Turing, with previous experience as a Data Engineer. My current job, in brief, is to debug ML code written in PyTorch.

In almost every recent interview I’ve had, the focus shifted heavily toward hands-on projects and the reasoning behind modeling decisions rather than coursework.

I’m particularly interested in the anomaly detection / predictive maintenance idea you mentioned.

Would love to explore this together if you’re open to it.

Looking forward to your thoughts

Amir_Pasagic · March 15, 2026, 11:00pm

Hey Yasmeen,

Thanks for reaching out, and kudos for reading thru the whole post.
Feel free to reach out here or on LinkedIn and we can exchange project ideas and see if there is something we would both find interesting to work on. I can think about what would be an interesting anomaly detection project.

Yasmeen_Asaad_Azazi · March 24, 2026, 9:11am

Hi, thanks for reaching out! I’ve already sent you a message on LinkedIn as you suggested. Let’s take the conversation there and discuss some interesting anomaly detection projects.
