My business partner and I are developing an iOS app designed to help piano students improve their technical exercises (scales, arpeggios, etc.) for their music exams. One of the app's key features would be the ability to listen to a student playing on an acoustic piano and point out any wrong notes. This would help students practise at home and track their progress.
However, despite quite a bit of research and prototyping, I've not found any library or ML model that can detect pitch with the accuracy we need (i.e. 100% accurate) when a student hits a key on an acoustic piano. The difficulty stems from the many harmonics produced by a piano string and the instrument itself, background noise, and so on. For example, A2 (110 Hz) also produces strong partials near 220 Hz and 440 Hz, exactly where the fundamentals of A3 and A4 sit. Various other vendors offer similar or related apps, but none offer right-note (i.e. pitch detection) feedback, presumably for the same reasons, so a feature like this would be a world first.
In addition, students must play some scales "hands together", i.e. two notes struck at exactly the same time, so pitch detection needs to identify more than one pitch sounding at once. Higher-level students must also play scales at relatively fast tempos (the pitches change quickly) and across a span of 4 octaves, so the fundamental frequencies involved vary a great deal (see the sketch below).
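For concreteness, assuming standard equal temperament and a 4-octave scale starting on C2 (the actual starting note depends on the key and exam board), the frequency span looks like this:

```swift
import Foundation

/// Equal-temperament frequency for a MIDI note number (A4 = MIDI 69 = 440 Hz).
func frequency(midiNote: Int) -> Double {
    440.0 * pow(2.0, Double(midiNote - 69) / 12.0)
}

// A 4-octave scale starting on C2 (MIDI 36) ends on C6 (MIDI 84):
print(frequency(midiNote: 36)) // ≈ 65.4 Hz
print(frequency(midiNote: 84)) // ≈ 1046.5 Hz, a 16x span of fundamentals
```

At the bottom of that range a piano's fundamental is often weaker than its own overtones, which is part of why naive spectral peak-picking misidentifies low notes.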
The iOS native libraries I've tried use the Fast Fourier Transform (FFT) along with various other filtering and processing techniques to try to identify pitch, roughly the approach sketched below. They do OK, but not well enough to provide reliable, always-correct feedback to students, especially for polyphonic (i.e. two-notes-at-a-time) playing. I've also tried a few ML models, e.g. Google's "Onsets and Frames" transcription model, but they have the same accuracy problems.
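For anyone who hasn't tried this route, here is a minimal sketch of the FFT peak-picking idea using Apple's Accelerate framework. It is monophonic only, the function name and buffer handling are illustrative rather than production code, and it demonstrates the failure mode I keep hitting: the loudest bin is frequently an overtone, not the note the student played.

```swift
import Accelerate

/// Naive pitch estimate: the frequency of the single strongest FFT bin.
/// `samples` must have a power-of-2 length (e.g. 4096 at 44.1 kHz).
func dominantFrequency(samples: [Float], sampleRate: Float) -> Float? {
    let n = samples.count
    guard n >= 8, n & (n - 1) == 0 else { return nil }   // power of 2 only
    let log2n = vDSP_Length(n.trailingZeroBitCount)

    guard let setup = vDSP_create_fftsetup(log2n, FFTRadix(kFFTRadix2)) else { return nil }
    defer { vDSP_destroy_fftsetup(setup) }

    var real = [Float](repeating: 0, count: n / 2)
    var imag = [Float](repeating: 0, count: n / 2)
    var magnitudes = [Float](repeating: 0, count: n / 2)

    real.withUnsafeMutableBufferPointer { realPtr in
        imag.withUnsafeMutableBufferPointer { imagPtr in
            var split = DSPSplitComplex(realp: realPtr.baseAddress!,
                                        imagp: imagPtr.baseAddress!)
            // Pack the real signal into split-complex form, run an
            // in-place real-to-complex FFT, then take squared magnitudes.
            samples.withUnsafeBufferPointer { buf in
                buf.baseAddress!.withMemoryRebound(to: DSPComplex.self,
                                                   capacity: n / 2) {
                    vDSP_ctoz($0, 2, &split, 1, vDSP_Length(n / 2))
                }
            }
            vDSP_fft_zrip(setup, &split, 1, log2n, FFTDirection(FFT_FORWARD))
            vDSP_zvmags(&split, 1, &magnitudes, 1, vDSP_Length(n / 2))
        }
    }

    // Pick the loudest bin and convert its index to Hz.
    var maxMag: Float = 0
    var maxIndex: vDSP_Length = 0
    vDSP_maxvi(magnitudes, 1, &maxMag, &maxIndex, vDSP_Length(n / 2))
    return Float(maxIndex) * sampleRate / Float(n)
}
```

With two notes sounding at once, two overlapping harmonic series land in the same spectrum, so picking the biggest peak (or even the two biggest) is not enough, which matches the polyphonic failures I'm seeing.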
Does anyone have any ideas on how AI could help make note detection work reliably for an app like ours?