I appreciate the time you’ve both taken to respond thus far, but I’m still a little confused.
> …at no point we are creating a model, then creating another model while ignoring the supposedly previously created model.
Code cell 6 defines `djmodel()`. Code cell 7 calls the `djmodel()` function, which instantiates a model and assigns the reference to that model to the variable called `model`:

```python
model = djmodel(Tx=30, LSTM_cell=LSTM_cell, densor=densor, reshaper=reshaper)
```

Code cells 10 through 12 compile and train `model`.
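For reference, the compile/train step looks roughly like this. I’m paraphrasing the notebook from memory, so the exact optimizer settings and epoch count may differ; `X`, `Y`, `a0`, and `c0` are the training tensors the notebook prepares.

```python
from tensorflow.keras.optimizers import Adam

# Roughly what code cells 10 through 12 do: compile the training model and fit it
# on the 30-step training sequences, starting from zero hidden/cell states.
opt = Adam(learning_rate=0.01)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit([X, a0, c0], list(Y), epochs=100, verbose=0)
```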
Then in exercise two (code cell 14) we define `music_inference_model()`, a function. In code cell 15, we call that function, instantiating a [second] model, and storing the reference to that newly instantiated model in the variable called `inference_model`:

```python
inference_model = music_inference_model(LSTM_cell, densor, Ty=50)
```
By my count, we have now instantiated two models. First we instantiated and trained `model`, which was built and returned by the `djmodel()` function; then we instantiated `inference_model`, which we did not train but used for inference, and which was built and returned by the `music_inference_model()` function.
I’m asking why we bothered to train the weights in one model, `model`, and then took those globally shared layers and used them to instantiate a second, slightly different model, `inference_model`?
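My mental model of the layer sharing is something like this toy example (not the assignment code): two Keras models built from the same layer object share that layer’s weights, so training one updates the other.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

shared_dense = Dense(4)                       # one layer object, created once

inp_a = Input(shape=(3,))
model_a = Model(inp_a, shared_dense(inp_a))   # the "training" model

inp_b = Input(shape=(3,))
model_b = Model(inp_b, shared_dense(inp_b))   # the "inference" model, wired differently

model_a.compile(optimizer='adam', loss='mse')
model_a.fit(np.random.rand(8, 3), np.random.rand(8, 4), epochs=1, verbose=0)

# Both models report identical weights, because both wrap the same Dense layer.
assert all(np.array_equal(wa, wb)
           for wa, wb in zip(model_a.get_weights(), model_b.get_weights()))
```

So I understand that `inference_model` already contains the trained weights from `LSTM_cell` and `densor`. My question is about why the second wiring is needed at all.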
In my head, it seems like after training the initial model, we could have immediately done something like this (in pseudo-code):
```python
x = [initial_value]
ty = desired_length
for t in range(ty):
    predicted_next = model.predict(x)
    # process the prediction appropriately
    x.append(predicted_next)
```
We give some initial value, have the model predict the next output, append that to the list of values, and just loop until we have the number of musical values that we want. Where have I gone wrong? Is the ability to specify a new sequence length the only reason why we used `music_inference_model()` to get a second model? Is needing to pre-specify the length prior to instantiating the model a limiting factor that requires this longer workaround?