STT for stroke victims and unrecognisable speech

At the start of last month, my grandfather had a stroke. It has stopped half of his facial muscles from working, leaving him unable to speak properly. I am wondering whether a speech-to-text model could be trained to understand his speech and transcribe it into text, enabling him to communicate again.
If you have any ideas or suggestions on how to do this, please contact me.

For added context, he speaks Japanese.

Here’s an idea (not tried, but I’ve been working on ST models)… If you can understand his speech, it can be done. Create a training dataset of audio and text from his speech samples, say up to 5 lines per row. We can then use it to fine-tune an existing model.
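A rough sketch of how that dataset could be put together, assuming each recording is saved as a WAV clip with a UTF-8 transcript next to it under the same name (`clip001.wav` + `clip001.txt`). The folder layout, the CSV columns and the `build_manifest` helper are my own assumptions, not a required format:

```python
import csv
import wave
from pathlib import Path


def build_manifest(data_dir, manifest_path):
    """Pair each NAME.wav with NAME.txt and write an audio/text CSV manifest."""
    data_dir = Path(data_dir)
    rows = []
    for wav_path in sorted(data_dir.glob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        if not txt_path.exists():
            continue  # clip has not been transcribed yet
        text = txt_path.read_text(encoding="utf-8").strip()
        if not text:
            continue  # empty transcript, nothing to train on
        # Clip length in seconds, useful for spotting very long recordings.
        with wave.open(str(wav_path), "rb") as clip:
            seconds = clip.getnframes() / clip.getframerate()
        rows.append(
            {"audio": wav_path.name, "text": text, "seconds": round(seconds, 2)}
        )

    # UTF-8 so Japanese transcripts survive the round trip.
    with open(manifest_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["audio", "text", "seconds"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `build_manifest("recordings", "manifest.csv")` would give one row per transcribed clip. Whoever understands his speech best should write the transcripts, since the model can only learn from what is written down.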

It should be at least 100 or 500 samples, I’m not sure, but I’m happy to help in a joint project. PS: my grandpa had speech issues, so I know what it’s like.

Thanks,
I’ve never worked with audio before, so I’m not too sure what I’m doing :sweat_smile:
How do you think I should fine-tune the model? Would LoRA be a good way?

Okay, I’m planning to try fine-tuning a Whisper model if you can produce the data. I am currently working on a speech model for a call centre, so it will be good practice for my skills. Can we start by getting a very small set of 10 samples in a couple of days? I’ll use it to warm up and understand the challenges involved.
Looking forward to collaborating…
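
On the LoRA question: it looks like a reasonable fit for a small dataset, because it trains two small low-rank matrices per adapted weight instead of the full weight. A back-of-the-envelope sketch of the saving for one square weight matrix (the hidden size 768 and rank 8 are assumptions for illustration, not measured from any specific checkpoint):

```python
def lora_trainable_params(d_out, d_in, rank):
    """Weights in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


d_model = 768  # assumed hidden size of the model being adapted
rank = 8       # assumed LoRA rank

full = d_model * d_model                            # weights in the full matrix
lora = lora_trainable_params(d_model, d_model, rank)  # weights LoRA would train

print(f"full matrix: {full} weights")
print(f"LoRA rank {rank}: {lora} weights ({lora / full:.1%} of the full matrix)")
```

With only a few hundred clips, fewer trainable weights should also mean less risk of overfitting, but that is something we would have to check on held-out samples.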

Hi Taizo,
Any thoughts on how to proceed…
Best, Ravi

Hi Ravi,
I’m trying to collect some of the training data. But sadly, I don’t think I’ll be able to get any for a little while (probably a couple of weeks).
I’m currently at a bit of a standstill, so I don’t really know how to proceed.
Best regards,
Taizo

I understand, and I hope all will be well. I’ll try some things at my end anyway.
Ravi