How will our parameters work?

In Course 5 (RNNs), I guess we try to learn one parameter set (or more than one, it does not matter). Assume that we use 100,000 hours of speech audio with transcripts, or a training set of 100,000 translation pairs. Our model then learns its parameters from this data. If there is a 10-second audio clip and we want to transcribe it, how will our parameter set work? I have understood the concepts, but I still cannot picture it in my mind.

Consider the example below.
Let’s say you have trained your RNN model on a 100k-hour dataset, with the maximum input length set to 15 seconds. Any audio file longer than 15 seconds is truncated to 15 seconds, and any audio file shorter than 15 seconds is padded (there is a padding token) up to 15 seconds. The audio pre-processing function does this (you need to implement this function) before the files are passed to the model for training.

After training, when you pass a 10-second audio file for transcription, the pre-processing function pads it to 15 seconds, and the model then transcribes the padded audio using the parameters it learned during training.
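
The pre-processing step described above can be sketched roughly as follows. This is only a minimal illustration, not the course's implementation: the function name `pad_or_truncate`, the 16 kHz sample rate, and the use of 0.0 (silence) as the padding value are all assumptions made for the example, and the audio is represented as a plain list of samples.

```python
from typing import List

SAMPLE_RATE = 16_000   # assumed number of samples per second
MAX_SECONDS = 15       # maximum input length from the example above
PAD_VALUE = 0.0        # assumed padding value (silence)


def pad_or_truncate(samples: List[float],
                    sample_rate: int = SAMPLE_RATE,
                    max_seconds: int = MAX_SECONDS,
                    pad_value: float = PAD_VALUE) -> List[float]:
    """Return a copy of `samples` that is exactly `max_seconds` long."""
    target_len = sample_rate * max_seconds
    if len(samples) >= target_len:
        # Longer (or equal) input: keep only the first 15 seconds.
        return list(samples[:target_len])
    # Shorter input: append padding until the clip is 15 seconds long.
    return list(samples) + [pad_value] * (target_len - len(samples))


# Hypothetical clips: constant-valued samples stand in for real audio.
ten_second_clip = [0.1] * (10 * SAMPLE_RATE)
twenty_second_clip = [0.1] * (20 * SAMPLE_RATE)

padded = pad_or_truncate(ten_second_clip)
truncated = pad_or_truncate(twenty_second_clip)

print(len(padded) / SAMPLE_RATE)     # 15.0
print(len(truncated) / SAMPLE_RATE)  # 15.0
```

The same function would be applied both during training and at transcription time, so the model always sees inputs of one fixed length. Many frameworks also let you pass the true length or a mask along with the padded input so the padded part can be ignored.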