DLS - Course 5 - W3 - Trigger Word Detection

I have created the model as instructed in the assignment for Trigger Word Detection. All tests were passed and my assignment is complete now.

My question is about the very last, ungraded section, where I recorded five of my own audio files (.wav) containing the trigger word along with other, negative words, and uploaded them into the designated folder. When I ran the model (a similar pretrained one provided in the assignment) on my own audio files, it detected the trigger word correctly when there was no background noise. However, it did not work at all (on at least 3 out of 5 files) when there were distinct background noises, such as the sound of a ceiling fan.

Is it a training issue? I do not think the model itself needs modification. Please advise.

It's probably a training issue; it seems that clips with that kind of background noise were not well represented in the training dataset.
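
If you want to test that idea, one common approach is to mix recordings of the problem noise into the training clips at a controlled signal-to-noise ratio. Below is a minimal NumPy sketch of that mixing step only. The function name `mix_at_snr` and the synthetic signals are hypothetical stand-ins, not part of the assignment code; with real data you would load your own .wav samples instead.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Overlay `noise` on `clean` so the result has roughly `snr_db` dB SNR.

    Hypothetical helper for illustration; not from the assignment notebook.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it covers the whole clip.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        return clean.copy()

    # Choose scale so that clean_power / (scale**2 * noise_power) = 10**(snr_db / 10).
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

# Synthetic stand-ins: a tone for "speech" and white noise for the "fan".
rng = np.random.default_rng(0)
t = np.arange(44100) / 44100.0
speech_like = 0.5 * np.sin(2 * np.pi * 220 * t)
fan_like = rng.normal(size=10000)

noisy = mix_at_snr(speech_like, fan_like, snr_db=10)
```

Generating training examples at several SNR values (for example 20, 10, and 5 dB) would let you see at what noise level detection starts to fail.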


That’s an interesting observation, thanks for posting it.

It would be very interesting to see how the model would change if it were trained to reject noise, and how that would vary depending on the characteristics of the noise.

An additional layer might be needed just for the noise rejection. Just a guess.


OK, that might be the case. One thing I forgot to mention: in my audio files I spoke full sentences and inserted the activation word abruptly in the middle. Hopefully that was not an improper way to create test files?

Thank you

Thanks for your response. Do you mean adding one more similar layer just after the Conv1D?

You’d have to experiment and see if any adjustments to the model might be useful.
I was thinking that another hidden layer might be interesting, to pre-process the audio before the trigger detection happens.
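
One thing to watch when experimenting: an extra convolution can change the number of output time steps, and that number has to keep matching the label length used in training. The sketch below applies the standard 1D convolution output-length formulas in plain Python. The specific numbers (5511 input steps, kernel size 15, stride 4) are assumptions about the assignment's first layer, so please check them against your notebook; `conv1d_output_length` is a hypothetical helper, not a library function.

```python
import math

def conv1d_output_length(n_in, kernel_size, stride=1, padding="valid"):
    """Number of output time steps of a 1D convolution (hypothetical helper)."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (n_in - kernel_size) // stride + 1
    if padding == "same":
        # Input is padded so only the stride shrinks the output.
        return math.ceil(n_in / stride)
    raise ValueError(f"unknown padding: {padding!r}")

# Assumed first layer: 5511 spectrogram steps, kernel 15, stride 4.
after_first = conv1d_output_length(5511, kernel_size=15, stride=4)

# Option A: a second, identical layer shrinks the sequence again.
after_identical = conv1d_output_length(after_first, kernel_size=15, stride=4)

# Option B: stride 1 with "same" padding keeps the length unchanged.
after_same = conv1d_output_length(after_first, kernel_size=15, stride=1, padding="same")

print(after_first, after_identical, after_same)
```

Under these assumptions, an identical second layer would shorten the output sequence, so the labels would no longer line up, whereas a stride-1 layer with "same" padding leaves the length alone. That is only a shape check, not evidence that either option improves noise rejection.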


OK, I will try that. I may need to create a lot of training examples, and then I will retrain the model after adding the layer.

Thanks @TMosh. Much appreciated.