What is the best pre-trained deep learning model to use for a real-time audio classification system on a microcontroller such as Raspberry Pi 3? Added bonus if the model can be audio-to-audio, meaning the output depends on whether the audio passes the classification or not.
@mistafo11 it is not clear from your statement what you intend by ‘audio classification’, but I’ve played around a bit with Spotify’s ‘Basic Pitch’.
It is still not ‘real-time’ though; you would need a few more steps and perhaps a condensed model.
https://basicpitch.spotify.com/
If you mean voice (in English) then check out OpenAI’s Whisper.
https://openai.com/index/whisper/
That one is also not real-time, at least not ‘out of the box’.
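On the ‘audio-to-audio’ part: whatever model ends up being used, the real-time loop usually comes down to classifying short frames and deciding, frame by frame, what goes to the output. Here is a minimal sketch of that idea in plain Python. Everything in it is an assumption for illustration: `placeholder_classifier` is a hypothetical stand-in (a simple loudness threshold, not a trained model), and the 512-sample frame at 16 kHz is only an example size.

```python
import math
from typing import Callable, Iterable, Iterator, List

Frame = List[float]  # one frame of mono samples in the range [-1.0, 1.0]


def rms(frame: Frame) -> float:
    """Root-mean-square level of a frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def placeholder_classifier(frame: Frame, threshold: float = 0.1) -> bool:
    """Hypothetical stand-in for a real model: 'passes' if the frame is loud enough."""
    return rms(frame) >= threshold


def gate_stream(
    frames: Iterable[Frame],
    classify: Callable[[Frame], bool] = placeholder_classifier,
) -> Iterator[Frame]:
    """Yield each frame unchanged if it passes the classifier, else silence."""
    for frame in frames:
        if classify(frame):
            yield frame
        else:
            yield [0.0] * len(frame)


def frame_budget_ms(frame_len: int, sample_rate: int) -> float:
    """Duration of one frame in milliseconds, i.e. the time available per frame."""
    return 1000.0 * frame_len / sample_rate


if __name__ == "__main__":
    sample_rate = 16000  # assumed example rate
    loud = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(512)]
    quiet = [0.01] * 512
    for out in gate_stream([loud, quiet]):
        print("passed" if any(out) else "muted")
    print("budget per frame (ms):", frame_budget_ms(512, sample_rate))
```

The frame length sets the time budget: at 16 kHz a 512-sample frame lasts 32 ms, so whatever replaces the placeholder has to finish within that time on the Pi to keep up.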
@mistafo11 though hmmm; News to me, so due caution:
@Nevermnd thanks a lot for the reply!! I will check them out and try to play around and see if it is possible
@mistafo11 I would just add that I have a RasPi 3 for a very similar project I was trying to work on, and if I remember correctly the built-in audio port is not mic’d. It is ‘audio out’ only.
So you will need a cheap USB soundcard to get it to work. Almost any one will do, but I’d research a little and just check the chipset is supported by whatever variant of 'nix you’re running on it.
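Once the USB soundcard is plugged in, a quick way to confirm the system sees it as a capture device is ALSA's `arecord -l`. Below is a small sketch that wraps that command from Python, assuming `alsa-utils` is installed and that the output uses the usual `card N: Name [Description], device M: ...` lines. The function names are hypothetical, made up for this example.

```python
import re
import subprocess
from typing import List, Tuple

# Matches lines such as:
#   card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+):")


def parse_capture_devices(text: str) -> List[Tuple[int, int, str]]:
    """Return (card, device, description) for each capture device line in text."""
    devices = []
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            card, _name, description, device = match.groups()
            devices.append((int(card), int(device), description))
    return devices


def list_capture_devices() -> List[Tuple[int, int, str]]:
    """Run `arecord -l` and parse it; returns [] if arecord is not available."""
    try:
        result = subprocess.run(
            ["arecord", "-l"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return []
    return parse_capture_devices(result.stdout)


if __name__ == "__main__":
    found = list_capture_devices()
    if not found:
        print("No capture devices found (or arecord is not installed).")
    for card, device, description in found:
        # plughw:<card>,<device> is the ALSA name to hand to recording tools
        print(f"plughw:{card},{device}  {description}")
```

If the list comes back empty with the soundcard attached, that is a hint the chipset is not being picked up by the driver on that distro.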