N_values = 90 or 78? (Improvise_a_Jazz_Solo_with_an_LSTM_Network_v4)

Hello,
In assignment “Improvise_a_Jazz_Solo_with_an_LSTM_Network_v4”, the code cell:

X, Y, n_values, indices_values, chords = load_music_utils('data/original_metheny.mid')

has the output shown below. Specifically, n_values is 90:

number of training examples: 60
Tx (length of sequence): 30
total # of unique values: 90
shape of X: (60, 30, 90)
Shape of Y: (30, 60, 90)
Number of chords 19

However, when I run the same assignment in Google Colab, I get a different value for n_values, i.e. 78:

number of training examples: 60
Tx (length of sequence): 30
total # of unique values: 78
shape of X: (60, 30, 78)
Shape of Y: (30, 60, 78)
Number of chords 19

I installed mido v1.2.9 and pydub v0.24.0 in my Colab notebook, to match the versions that are installed in the DLS notebook.

The functions that return the above metadata are load_music_utils(), get_musical_data(), get_corpus_data(), and __get_abstract_grammars(). I was browsing through that code and did not see anything obvious that could account for the discrepancy in total # of unique values.
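For context, my understanding is that n_values is just the count of distinct note/duration "values" extracted from the parsed corpus, so it depends entirely on how the library tokenizes the MIDI file. A toy illustration (the corpus list here is a hypothetical stand-in for what get_corpus_data() returns):

```python
# Toy stand-in for the real corpus returned by get_corpus_data()
corpus = ["C,0.25", "E,0.25", "G,0.5", "C,0.25", "A,0.5"]

# n_values is simply the number of distinct tokens in the corpus,
# so a different tokenization yields a different n_values (e.g. 78 vs 90)
values = sorted(set(corpus))
n_values = len(values)
indices_values = {i: v for i, v in enumerate(values)}
```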

Can you help me understand the reason behind this?

I can run most of my Colab notebook without any issues. The model compile and train code runs as expected.

The problem is the last step that generates new music:

out_stream = generate_music(inference_model, indices_values, chords)

The call to generate_music() throws this error:

ValueError: Input 0 is incompatible with layer model_1: expected shape=(None, 1, 78), found shape=(None, 1, 90)

Can you help me figure this out?
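For what it's worth, the error itself is a pure shape disagreement: the inference model was built for one-hot vectors of length 78 (Colab's n_values), while the hardcoded initializers have length 90. A minimal shapes-only sketch of the mismatch (no Keras needed):

```python
import numpy as np

model_input_dim = 78                  # n_values derived from the data on Colab
x_initializer = np.zeros((1, 1, 90))  # hardcoded shape from the notebook

# Keras rejects the feed because the last dimensions disagree, producing the
# "expected shape=(None, 1, 78), found shape=(None, 1, 90)" ValueError
compatible = x_initializer.shape[-1] == model_input_dim
```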

I finally fixed this error, but it took some ugly hacks to the code. This is a summary of what I did:

  1. Comment out all cells in the notebook that explicitly set n_values to 90.

  2. Ignore the global vars reshaper, LSTM_cell, and densor. One way to do this is to pass an extra "music_values" arg to function djmodel, init these inside the function body, and then return them along with the created model:

def djmodel(Tx, n_a, music_values):
    reshaper = Reshape((1, music_values))               # Used in Step 2.B of djmodel(), below
    LSTM_cell = LSTM(n_a, return_state=True)            # Used in Step 2.C
    densor = Dense(music_values, activation='softmax')  # Used in Step 2.D

    # ... build the model exactly as in the assignment ...

    return model, reshaper, LSTM_cell, densor

  3. In the next cell, change the call to djmodel() as shown below. This forces subsequent cells to use the reshaper, LSTM_cell, and densor objects created inside djmodel() rather than the global vars:

model, reshaper, LSTM_cell, densor = djmodel(Tx=30, n_a=n_a, music_values=78)

  4. Set n_values to 78 before using it in x_initializer, a_initializer, and c_initializer:

n_values = 78

x_initializer = np.zeros((1, 1, n_values))
a_initializer = np.zeros((1, n_a))
c_initializer = np.zeros((1, n_a))

  5. In data_utils.py, set x_initializer = np.zeros((1, 1, 78)). This is necessary because generate_music() calls predict_and_sample() with the initializer vars.
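A cleaner alternative to steps 4 and 5 (a sketch, with a zeros array standing in for the X returned by load_music_utils()) is to derive n_values from the loaded data instead of hardcoding 78 or 90, so every downstream shape stays consistent no matter what the library produces:

```python
import numpy as np

X = np.zeros((60, 30, 78))   # stand-in for the X returned by load_music_utils()
n_a = 64                     # hidden-state size used in the assignment

# Derive n_values from the data instead of hardcoding it
n_values = X.shape[2]        # 78 on Colab, 90 on Coursera

x_initializer = np.zeros((1, 1, n_values))
a_initializer = np.zeros((1, n_a))
c_initializer = np.zeros((1, n_a))
```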

That’s about it. With the above changes, I ran this assignment in Google Colab without any problem. The my_music.midi file sounds quite decent after I convert it to mp3 and play it.

Why n_values differs between the Coursera notebook and Google Colab is still a mystery.

It looks like explicitly setting n_values = 90 in the python notebook and data_utils.py is the problem.

I removed all notebook cells that set n_values = 90 and set x_initializer = np.zeros((1, 1, 78)) in data_utils.py. I also rolled back my modifications to djmodel(), i.e. its signature is now back to def djmodel(Tx, LSTM_cell, densor, reshaper) and it returns only the model object.

With these changes, my Colab notebook runs without error and the generated my_music.midi file sounds as I expect it to.

The explicit setting of x_initializer = np.zeros((1, 1, 78)) in data_utils.py shouldn’t be required if generate_music() is modified to accept x_initializer, a_initializer, and c_initializer as arguments. This would allow generate_music() to pass them along when it calls predict_and_sample().
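A sketch of that rework (the real signatures may differ; predict_and_sample here is replaced by a stub that just echoes the shapes so the forwarding can be checked):

```python
import numpy as np

def predict_and_sample(inference_model, x_init, a_init, c_init):
    # Stub: the real assignment function runs the inference model;
    # here we just return the shapes so the argument forwarding is visible.
    return x_init.shape, a_init.shape, c_init.shape

def generate_music(inference_model, indices_values, chords,
                   n_values=78, n_a=64,
                   x_initializer=None, a_initializer=None, c_initializer=None):
    # Default initializers are derived from n_values instead of being
    # hardcoded in data_utils.py; callers can still override them.
    if x_initializer is None:
        x_initializer = np.zeros((1, 1, n_values))
    if a_initializer is None:
        a_initializer = np.zeros((1, n_a))
    if c_initializer is None:
        c_initializer = np.zeros((1, n_a))
    return predict_and_sample(inference_model, x_initializer,
                              a_initializer, c_initializer)
```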

Can the mentors confirm whether these changes work in the Coursera notebook as well?


Just had this exact same issue. I solved it by ensuring the installed version of music21 matched the notebook’s, i.e. version 6.5.0.

My guess is that whatever magic this library uses to slice and dice the music into a fixed number of discrete ‘notes’ (i.e. a finite number of different pitches and durations) changes from version to version as the underlying algorithm changes. That would mean the number of notes extracted from original_metheny.mid varies with the installed version. It’s obviously a bit of a volatile library: the code breaks entirely if you use the current version, 7.1.0!
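A small guard one could add at the top of the Colab notebook to catch this early (6.5.0 is just the version reported to work in this thread; this is a sketch, not part of the assignment):

```python
def check_music21_version(expected="6.5.0"):
    """Return True if the installed music21 matches `expected`,
    otherwise a message telling the user how to fix it."""
    try:
        import music21
    except ImportError:
        return f"music21 not installed; run: pip install music21=={expected}"
    if music21.__version__ != expected:
        return (f"music21 {music21.__version__} found; "
                f"run: pip install music21=={expected}")
    return True
```

Calling `print(check_music21_version())` in the first cell makes a version drift visible before any shapes go wrong.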
