- See this link to notice that you’ll need the speaker to complete the sentence before using a BRNN to transcribe the voice input.
- You can use beam search for this as well. Do look at other sources to see if there’s a metric for voice to text task.
- Assuming we’re talking about machine translation using RNNs, provide previous hidden state and current word as input to predict the next word. Previous hidden state is all 0s for a<0>. This is the general mode of operation of an RNN. The encoder learns the entire representation of the input sentence before emitting the translated sentence.