In the Jazz assignment, we train the LSTM and Dense layers and then reuse those layers to generate a sequence of values. The inference_model takes x0, a0, and c0 as inputs. When we finally use the inference model to predict a sequence, we initialize x0, a0, and c0 to zeros.
- Will the generated sequences always be the same? In the inference model we always take the argmax of the softmax output, and the inputs are always zeros.
- In the inference_model we pass an initial state a0 and c0 to the LSTM cell. Doesn't our LSTM_cell already learn all the activations and memory cells during training, so why do we need to supply them again?
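
To make the two questions concrete, here is a small NumPy sketch of the generation loop as I understand it. It is not the assignment's code: the sizes `n_values` and `n_a`, the random weights `W`, `b`, `Wy`, `by` (standing in for the trained LSTM_cell and Dense parameters), and the helper `generate` are all assumptions for illustration. It compares argmax decoding with sampling from the softmax output.

```python
import numpy as np

n_values, n_a = 8, 16  # hypothetical sizes, not the assignment's values

rng = np.random.default_rng(0)
# Fixed random weights stand in for the *trained* LSTM and Dense parameters.
W = rng.normal(scale=0.5, size=(4 * n_a, n_a + n_values))
b = np.zeros(4 * n_a)
Wy = rng.normal(scale=0.5, size=(n_values, n_a))
by = np.zeros(n_values)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def lstm_step(x, a, c):
    """One LSTM step: the weights are fixed, the state (a, c) changes."""
    z = W @ np.concatenate([a, x]) + b
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    a = sigmoid(o) * np.tanh(c)
    return a, c


def generate(Ty, x0, a0, c0, sample_rng=None):
    """Generate Ty indices; argmax if sample_rng is None, else sample."""
    x, a, c = x0, a0, c0
    out = []
    for _ in range(Ty):
        a, c = lstm_step(x, a, c)
        p = softmax(Wy @ a + by)
        if sample_rng is None:
            idx = int(np.argmax(p))  # greedy: no randomness anywhere
        else:
            idx = int(sample_rng.choice(n_values, p=p))  # stochastic
        out.append(idx)
        x = np.eye(n_values)[idx]  # feed the one-hot choice back in
    return out


x0, a0, c0 = np.zeros(n_values), np.zeros(n_a), np.zeros(n_a)

greedy_1 = generate(10, x0, a0, c0)
greedy_2 = generate(10, x0, a0, c0)
print(greedy_1 == greedy_2)  # True: same weights, same start, same argmax path

sampled = generate(10, x0, a0, c0, sample_rng=np.random.default_rng(1))
print(sampled)  # may differ from the greedy sequence and across seeds
```

In this sketch, argmax decoding from the same x0, a0, and c0 gives the same sequence every time, and variation only appears if the starting values change or the next index is sampled instead. It also separates the two things in the second question: `W`, `b`, `Wy`, and `by` are the quantities that training would learn, while `a` and `c` are recomputed for every sequence and therefore need a starting value.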