How does the model know whether the person said “activate” or some other word? How is it differentiating between words when we are just feeding it numbers as X?
Please see the section "Insert ones for the labels of the positive target".
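The labels y start off as 0 and are set to 1 for 50 steps starting from the end of the activation word. Keeping in mind that a 10 second input produces 1375 outputs \hat{y}, we compute segment_end_y, the output step that corresponds to the end of the word, before setting the corresponding outputs to 1. Only the positive target gets 1s, so any other word stays labeled 0, and that difference in the labels is what the model learns from.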
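
To make the labeling concrete, here is a minimal sketch of that step in NumPy. It is not the course's reference code. The function name insert_ones, the millisecond input, and the choice to begin the run of 1s on the step right after segment_end_y are my assumptions; the 1375 outputs, the 10 second clip, and the 50 steps come from the description above.

```python
import numpy as np

TY = 1375         # output steps for a 10 second clip (from the description above)
CLIP_MS = 10_000  # clip length in milliseconds
N_ONES = 50       # number of output steps labeled 1 after the word ends


def insert_ones(y, segment_end_ms):
    """Label the steps right after the activation word ends as 1.

    y has shape (1, Ty) and is modified in place. segment_end_ms is the
    time, in milliseconds, at which the activation word ends (assumed input).
    """
    ty = y.shape[1]
    # Map a time in the audio clip to an index in the output sequence.
    segment_end_y = int(segment_end_ms * ty / CLIP_MS)
    # Assumption: the run of 1s starts on the step after the word ends.
    start = segment_end_y + 1
    stop = min(start + N_ONES, ty)  # do not write past the end of y
    y[0, start:stop] = 1
    return y


y = np.zeros((1, TY))               # every label starts at 0
y = insert_ones(y, segment_end_ms=4000)
print(int(y.sum()))                 # number of steps labeled 1
```

A clip that only contains other words never goes through this step, so its y stays all zeros. The model sees X as numbers in both cases; the labels tell it which patterns in those numbers should produce a 1.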