After finishing the course, one question is still bugging me:
In a lot of real-life situations, such as speech recognition or TTS,
we do not know Tx in advance, or rather it is highly variable.
For example, an input sentence can be very short or very long, so in a sense even the maximum length of Tx (maxLen_Tx) is unknown.
However, in order to build the model, we have to know Tx, or at least the maximum length of Tx. What should we do if we don’t have that information? What happens if an input is longer than the arbitrary maxLen we have fixed?
Max Tx is a parameter you set, based on your knowledge of the problem you’re solving.
If you get some examples where Tx is larger than your designed maximum, then you either have to truncate the example, or you can change the Max value and train the whole system again.
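Here is a minimal sketch of the truncation option, which also pads shorter examples so that every input ends up with the same length. The names `pad_or_truncate`, `max_tx`, and `pad_value` are made up for illustration and do not come from the course code; the padding value 0 is also an assumption.

```python
def pad_or_truncate(seq, max_tx, pad_value=0):
    """Return a new list with exactly max_tx items (hypothetical helper)."""
    if len(seq) >= max_tx:
        # Too long (or exactly right): keep only the first max_tx steps.
        return list(seq[:max_tx])
    # Too short: append pad_value until the length is max_tx.
    return list(seq) + [pad_value] * (max_tx - len(seq))


short_example = pad_or_truncate([5, 8, 2], max_tx=5)
long_example = pad_or_truncate([1, 2, 3, 4, 5, 6, 7], max_tx=5)
```

Truncating from the end is only one possible choice. Depending on the problem, it may make more sense to drop the beginning of the example instead.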
Thanks for your answer.
So basically we can split a long sample into multiple smaller samples, right? Then we pass the split data to the model, similar to the “sliding window” for images.
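To make the idea concrete, here is a rough sketch of what I mean by splitting. The function name `split_into_windows` and the parameters `max_tx` and `stride` are my own invention, not something from the course, and I am assuming the sample is just a list of time steps.

```python
def split_into_windows(seq, max_tx, stride=None):
    """Split seq into windows of at most max_tx steps (hypothetical helper).

    stride is the step between window starts. It defaults to max_tx,
    which gives non-overlapping windows; a smaller stride gives overlap.
    """
    if stride is None:
        stride = max_tx
    if max_tx <= 0 or stride <= 0:
        raise ValueError("max_tx and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(list(seq[start:start + max_tx]))
        # Stop once the current window reaches the end of the sequence.
        if start + max_tx >= len(seq):
            break
        start += stride
    return windows


sample = list(range(10))
no_overlap = split_into_windows(sample, max_tx=4)
with_overlap = split_into_windows(sample, max_tx=4, stride=2)
```

The last window can be shorter than `max_tx`, so it would still need padding before going into the model. Also, a stride larger than `max_tx` would skip some time steps.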
Changing the max Tx and retraining the whole model should fix the problem, but isn’t that time-inefficient, since the trained weights are lost?