I just completed the third assignment of week 1, but I am not sure what I did. The LSTM model outputs values from 1 to 90 (if I got that right) at each time step, but what do those values mean? And why do we loop over t up to Ty at inference time, but up to Tx in training? And why do we build another model for inference instead of using the same model that we trained?
Some other things are also unclear to me, but the ones above are the most important for now.
Thank you for reading. I hope to get convincing answers.
The values 1 to 90 are the notes in the music sequence.
There are separate models for training and prediction, because we want to use a prediction model that has some random variation in the outputs we select. That lets each run of the model create different music.
If we used the same model for training and prediction, we would always select the maximum prediction for each note, so the music would always be the same.
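To make the difference concrete, here is a minimal sketch in plain Python. It is not the assignment's code: the function names `pick_max` and `pick_weighted` and the 4-note probability list are made up for illustration, and it assumes the model's output at a time step can be read as one probability per note.

```python
import random

def pick_max(probs):
    """Deterministic choice: index of the largest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])

def pick_weighted(probs, rng):
    """Weighted random choice: index i is drawn with probability probs[i]."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical output distribution over 4 notes (the thread mentions 90).
probs = [0.1, 0.5, 0.3, 0.1]
rng = random.Random(0)  # seeded only so the sketch is reproducible

greedy = [pick_max(probs) for _ in range(10)]
sampled = [pick_weighted(probs, rng) for _ in range(10)]

print(greedy)   # the same index on every draw
print(sampled)  # indices vary, favoring the more probable notes
```

Picking the maximum returns the same note for the same probabilities every time, while the weighted draw can return any note with nonzero probability, which is where the run-to-run variation would come from.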
Thank you for replying.
For the first part, what I understood is that just like we have some number of characters (character level), in the same way, we have some number of notes which can be represented as Tx number of notes. Correct me if I am wrong.
For the second part, I still do not get it. Why would the model generate the same music if we used it for both training and inference? Would you be so kind as to explain in detail?
Because predictions made with the training model will always select the output that has the highest value. That would give us the same output every time.
We can only get weighted random outputs if we define a new model and use it only for predictions.