Week 3, Assignment 2: speech-to-text labeling question


In the lesson we saw that we add ‘1’ a few times after the trigger word ‘attention’. Expanding on this concept, if I wanted to process more general audio, is it correct to say that the training data would be labeled by marking each word in the audio file with an index?

For instance, if the vocabulary is “I”, “AM”, “JANE” and the dictionary is (1:“AM”), (2:“I”), (3:“JANE”), …, the audio clip “I AM JANE” would be labeled:

… I … AM … JANE…

Or is labeling like this only needed for trigger word cases?
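
To make the two labeling schemes concrete, here is a minimal sketch of what I have in mind. Everything in it is an assumption for illustration: the number of time steps, the number of 1s after the trigger word, the word timings, and the function names are made up and are not taken from the assignment. The first function marks a few steps after the trigger word with 1; the second writes the dictionary index of each word over the steps where that word is spoken, with 0 meaning no word.

```python
# Hypothetical values for illustration only; not the assignment's settings.
TY = 20          # number of output time steps in the label vector
ONES_AFTER = 3   # how many steps to mark with 1 after the trigger word ends


def trigger_word_labels(end_steps, ty=TY, n_ones=ONES_AFTER):
    """Binary labels: 1 for a few steps right after each trigger word ends."""
    y = [0] * ty
    for end in end_steps:
        # Clip at ty so labels never run past the end of the clip.
        for t in range(end + 1, min(end + 1 + n_ones, ty)):
            y[t] = 1
    return y


def word_index_labels(word_spans, vocab, ty=TY):
    """Per-step labels: dictionary index of the word spoken, 0 for no word."""
    y = [0] * ty
    for word, start, end in word_spans:
        for t in range(start, min(end + 1, ty)):
            y[t] = vocab[word]
    return y


vocab = {"AM": 1, "I": 2, "JANE": 3}
# (word, first step, last step); made-up timings for the clip "I AM JANE".
spans = [("I", 2, 4), ("AM", 7, 9), ("JANE", 12, 16)]

print(trigger_word_labels([9]))         # trigger word assumed to end at step 9
print(word_index_labels(spans, vocab))
```

Both functions produce one label per time step; the only difference is whether the label is binary or a word index.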



Hello Juan.

Are you asking about how the positive targets are labeled? Your query is unclear. Please clarify what you are actually asking.

There is a broad explanation in the instructions; it may help to go through it once again.