Error in Custom Trigger Word Detection

BryanEL · July 23, 2022, 6:56am

Hey guys. I tried to do Trigger Word Detection. First I tried the normal dataset from Coursera, and it worked. Then I tried it with my own custom dataset, but it throws an error, which you can see below.

I also had to use X = np.array(X, dtype=object) and Y = np.array(Y, dtype=object) in the code instead of X = np.array(X) and Y = np.array(Y), because otherwise I get this warning:
VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.

anon57530071 · July 23, 2022, 7:42am

This is not recommended. It would be better to take another look at your data.

Your data is so-called “ragged” data: the samples are inconsistent in size or length, so they do not fit into a single ndarray. This is also known as non-rectangular data, an irregular matrix, etc.
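A quick way to see which samples are the odd ones out is to count the shapes before building the array. This is only a minimal sketch: `shape_report` and the stand-in `specs` list are hypothetical names, not part of the assignment notebook, and the zero-filled arrays merely imitate spectrograms of two different lengths.

```python
from collections import Counter

import numpy as np


def shape_report(samples):
    """Count how many samples have each shape."""
    return Counter(np.asarray(s).shape for s in samples)


# Hypothetical stand-ins for spectrograms: three clips with 5511 time
# steps and one clip with 5998, imitating a mismatched recording.
specs = [np.zeros((101, 5511)) for _ in range(3)] + [np.zeros((101, 5998))]

report = shape_report(specs)
print(report)

# Treat the most frequent shape as the expected one and list the rest.
common_shape, _ = report.most_common(1)[0]
odd = [i for i, s in enumerate(specs) if s.shape != common_shape]
print("mismatched sample indices:", odd)

# np.stack only succeeds when every sample has the same shape.
if not odd:
    X = np.stack(specs)
```

If the list of mismatched indices is empty, `np.stack` produces a regular 3-D array and `dtype=object` is no longer needed.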

If you use the same model, it was created with these parameters:

Tx = 5511 # The number of time steps input to the model from the spectrogram
n_freq = 101 # Number of frequencies input to the model at each time step of the spectrogram

So our training data has shape (32, 5511, 101).

I think it is better to adjust your data, and the model’s input_shape, to fit your objectives.
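One way to adjust the data, sketched under assumptions: if each spectrogram arrives as (n_freq, steps), it can be transposed and then cropped or zero-padded along the time axis so that every sample becomes (Tx, n_freq). The helper name `fit_to_model` is hypothetical, and cropping or padding is a workaround rather than what the notebook does.

```python
import numpy as np

Tx = 5511      # time steps expected by the model (from the post above)
n_freq = 101   # frequency bins per time step


def fit_to_model(spec, tx=Tx):
    """Return spec as (tx, n_freq): transpose, then crop or zero-pad in time."""
    spec = np.asarray(spec).T            # (n_freq, steps) -> (steps, n_freq)
    steps = spec.shape[0]
    if steps >= tx:
        return spec[:tx]                 # drop the extra trailing time steps
    pad = np.zeros((tx - steps, spec.shape[1]), dtype=spec.dtype)
    return np.concatenate([spec, pad], axis=0)


# Stand-in spectrograms: one exact, one too long, one too short.
specs = [np.ones((n_freq, 5511)), np.ones((n_freq, 5998)), np.ones((n_freq, 5000))]
X = np.stack([fit_to_model(s) for s in specs])
print(X.shape)
```

Note that cropping the spectrogram can cut off audio that the labels in Y still refer to, so making the audio clips themselves the right length and sample rate is usually the cleaner fix.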

BryanEL · July 23, 2022, 8:18am

Hey, thanks for the reply. But how do I check my data? Most of my clips are about 1 second long on average. Also, for the background and negative clips I used the example files from the assignment, so I don’t have custom data for those.

BryanEL · July 23, 2022, 9:24am

Oh, it turns out that the shape of X is very different. The original one is (32, 5511, 101), but when I print(X.shape), it only prints (32,). What confuses me is what’s wrong, because nearly all of my custom clips are about 1 second long. Also, when I print the shape of each x, I get:

The values are either (101, 5511) or (101, 5998).
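One thing worth checking here is the sample rate. The sketch below assumes the spectrogram uses a 200-sample window with 120 samples of overlap and that each training clip is 10 seconds long; those values are assumptions about the notebook, not something confirmed in this thread. Under them, the number of time steps depends only on the number of audio samples.

```python
def spectrogram_steps(n_samples, nfft=200, noverlap=120):
    """Number of time columns for a window of nfft samples that advances
    by (nfft - noverlap) samples per step (assumed settings)."""
    return (n_samples - noverlap) // (nfft - noverlap)


clip_seconds = 10  # assumed length of each generated training clip
steps_by_rate = {rate: spectrogram_steps(clip_seconds * rate)
                 for rate in (44100, 48000)}
print(steps_by_rate)  # {44100: 5511, 48000: 5998}
```

If those assumptions hold, 5511 corresponds to 10 seconds at 44.1 kHz and 5998 to 10 seconds at 48 kHz, so a recording saved at 48 kHz would be one possible explanation for the second shape.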

BryanEL · July 23, 2022, 11:22am

Can I ask what software you guys use to create the dataset? I tried everything: making the clips the same length as the activate dataset, making the file sizes almost exactly the same, but nothing works.

Elemento · July 23, 2022, 4:27pm

Hey @BryanEL,
The training samples are generated in the notebook itself, as you must have seen already. As for the raw audio samples, I don’t think any specific software is required; you can use any audio recording software. If it supports output in .wav format, well and good; otherwise, you can use one of the many tools that convert other formats to .wav. The notebook doesn’t mention any specific recording or conversion software, so I don’t think it’s of much significance. Let me know if this helps.

Cheers,
Elemento
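Whatever software is used, the properties of the resulting .wav files can be inspected with Python’s standard `wave` module. This is a minimal sketch: `wav_info` and `write_silence` are hypothetical helpers, and the demo file is generated silence rather than a real recording.

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM .wav file."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        return rate, w.getnchannels(), w.getnframes() / rate


def write_silence(path, seconds, rate=44100):
    """Write a mono 16-bit .wav file of silence, only for this demo."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)          # 2 bytes per sample = 16-bit audio
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))


demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_silence(demo_path, seconds=1, rate=48000)
info = wav_info(demo_path)
print(info)
```

Running `wav_info` over the custom clips and the assignment’s own files makes it easy to compare sample rates and durations side by side.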
