C5W1A3 - Exercise 3 - predict() result is the wrong size

sheff · January 21, 2024, 12:05am

In the predict_and_sample() function from Exercise 3, the instructions say to expect the following when calling inference_model.predict():

The output pred should be a list of length 𝑇𝑦 where each element is a numpy-array of shape (1, n_values).

When I run, pred is a list of length 2 where each element has shape (1, 64). I am expecting a list of length 50 where each element has a shape of (1, 90).

inference_model is created with Ty=50. The parameters to predict() seem straightforward. I have restarted the notebook and run each cell. All unit tests before this one pass.

I have spent hours trying to debug this with no luck. What is going wrong?

paulinpaloalto · January 21, 2024, 12:50am

If your previous code all passes the test cases, then whatever the problem is must be in predict_and_sample, of course. There aren’t really that many moving parts here: most of the action is happening in the model. How does 64 enter into anything in terms of results? That is the “hidden state” size, right? So we should see output which is (Ty, 90), since 90 is the number of possible output notes. Maybe you supplied the initializer arguments to the model in the wrong order or something like that?

I added print statements in my predict_and_sample code to see what is going on and here’s what I get:

len(pred) = 50
type(pred) = <class 'list'>
indices.shape = (50, 1)
results.shape = (50, 90)
np.argmax(results[12]) = 12
np.argmax(results[17]) = 18
list(indices[12:18]) = [array([12]), array([34]), array([55]), array([50]), array([13]), array([18])]
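
For readers following the shapes, here is a minimal sketch of how a correctly shaped pred leads to the indices and results shapes printed above. It uses random numbers as a stand-in for the model output, and the one-hot step via np.eye is an assumption chosen for illustration, not necessarily what the notebook uses:

```python
import numpy as np

# Sizes quoted in this thread: Ty = 50 time steps, 90 possible values.
Ty, n_values = 50, 90

# Hypothetical stand-in for the model output: a list of Ty arrays of shape (1, n_values).
rng = np.random.default_rng(0)
pred = [rng.random((1, n_values)) for _ in range(Ty)]

# Stack to (Ty, 1, n_values), then take the most likely value at each step.
indices = np.argmax(np.array(pred), axis=-1)   # shape (Ty, 1)

# One-hot encode the chosen indices (np.eye is an illustrative choice).
results = np.eye(n_values)[indices.ravel()]    # shape (Ty, n_values)

print("len(pred) =", len(pred))
print("indices.shape =", indices.shape)
print("results.shape =", results.shape)
```

With a pred of length 2, the same steps would give shapes (2, 1) and (2, 90), so the length of pred decides everything downstream.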

My first suggestion just based on general principles would be to take a few deep cleansing breaths and with a calm mind carefully read through the instructions for that function again and compare what they say to what your code actually does.

sheff · January 21, 2024, 2:35am

Thanks for the response paulinpaloalto. I agree that there aren’t many moving parts which is why the output I am getting baffles me. And yes, deep cleansing breaths help!

So essentially the problem is that the call to inference_model.predict() is producing incorrect results. The instructions for this call:

Use your inference model to predict an output given your set of inputs. The output pred should be a list of length 𝑇𝑦 where each element is a numpy-array of shape (1, n_values).
inference_model.predict([input_x_init, hidden_state_init, cell_state_init])
Choose the appropriate input arguments to predict from the input arguments of this predict_and_sample function.

predict() expects one argument: a list containing three elements. Per the instructions, these elements are to be selected from the input arguments of predict_and_sample(). This gives us four options:

inference_model
x_initializer
a_initializer
c_initializer

Let’s rule out inference_model as an argument to inference_model.predict(). That leaves three arguments. In the spirit of being comprehensive, I tried all six possible orderings of the _initializer arguments as input to predict() with the following results:

inference_model.predict([a_initializer, c_initializer, x_initializer]) ERROR, NO RESULTS PRODUCED
inference_model.predict([a_initializer, x_initializer, c_initializer]) ERROR, NO RESULTS PRODUCED
inference_model.predict([c_initializer, a_initializer, x_initializer]) ERROR, NO RESULTS PRODUCED
inference_model.predict([c_initializer, x_initializer, a_initializer]) ERROR, NO RESULTS PRODUCED
inference_model.predict([x_initializer, a_initializer, c_initializer]) UNEXPECTED DIMENSIONS FOR pred
inference_model.predict([x_initializer, c_initializer, a_initializer]) UNEXPECTED DIMENSIONS FOR pred

The error was similar to the following in all four failing cases:

WARNING:tensorflow:Model was constructed with shape (None, 1, 90) for input Tensor("input_2:0", shape=(None, 1, 90), dtype=float32), but it was called on an input with incompatible shape (None, 64).
WARNING:tensorflow:Model was constructed with shape (None, 64) for input Tensor("a0_1:0", shape=(None, 64), dtype=float32), but it was called on an input with incompatible shape (None, 1, 90).
<clip>
ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 64]
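
The pattern of four failures and two non-failures is what a pure shape comparison would predict. The sketch below assumes the initializers have shapes (1, 1, 90), (1, 64) and (1, 64), and that the third model input matches the second; the warnings above confirm only the first two input shapes, so treat the rest as an assumption:

```python
from itertools import permutations

import numpy as np

# Sizes quoted in this thread: 90 possible values, 64 hidden units.
n_values, n_a = 90, 64

# Hypothetical stand-ins for the notebook's initializers (assumed shapes).
inits = {
    "x_initializer": np.zeros((1, 1, n_values)),
    "a_initializer": np.zeros((1, n_a)),
    "c_initializer": np.zeros((1, n_a)),
}

# Per-sample input shapes the model expects, in order (batch axis dropped).
# The third entry is an assumption; the warnings only show the first two.
expected = [(1, n_values), (n_a,), (n_a,)]


def shapes_match(order):
    """True if every array's non-batch shape fits the model input at its position."""
    return all(inits[name].shape[1:] == exp for name, exp in zip(order, expected))


compatible = [order for order in permutations(inits) if shapes_match(order)]
for order in compatible:
    print(order)
```

Only the two orderings that put x_initializer first pass this check. Swapping a_initializer and c_initializer cannot be caught by shape alone, since both have the same shape.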

In the two cases where pred did not have the expected dimensions, the output was identical and looked like the following (I also added print statements similar to yours):

len(pred): 2 -- should be 50
type(pred): <class 'list'>
pred[0].shape: (1, 64) -- should be (1, 90)
type(pred[0]): <class 'numpy.ndarray'>
indices.shape: (2, 1) -- should be (50, 1)
results.shape: (2, 90) -- should be (50, 90)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-33-0bfb6bdfdaab> in <module>
      3 results, indices = predict_and_sample(inference_model, x_initializer, a_initializer, c_initializer)
      4 
----> 5 print("np.argmax(results[12]) =", np.argmax(results[12]))
      6 print("np.argmax(results[17]) =", np.argmax(results[17]))
      7 print("list(indices[12:18]) =", list(indices[12:18]))

IndexError: index 12 is out of bounds for axis 0 with size 2
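
A small guard on pred would surface this mismatch before the later indexing fails. The following is only a sketch; check_pred is a hypothetical helper written for this post, not part of the notebook:

```python
import numpy as np


def check_pred(pred, Ty, n_values):
    """Raise a descriptive error unless pred is a list of Ty arrays of shape (1, n_values)."""
    if not isinstance(pred, list):
        raise TypeError(f"pred has type {type(pred).__name__}, expected list")
    if len(pred) != Ty:
        raise ValueError(f"len(pred) = {len(pred)}, expected {Ty}")
    for t, p in enumerate(pred):
        if np.shape(p) != (1, n_values):
            raise ValueError(
                f"pred[{t}].shape = {np.shape(p)}, expected {(1, n_values)}"
            )


# A well-formed prediction passes silently.
check_pred([np.zeros((1, 90)) for _ in range(50)], Ty=50, n_values=90)

# The shape reported in this thread is rejected with a readable message.
try:
    check_pred([np.zeros((1, 64)), np.zeros((1, 64))], Ty=50, n_values=90)
except ValueError as err:
    message = str(err)
    print(message)
```

Failing on the length of pred points at the model definition rather than at the argmax or one-hot steps that come afterwards.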

So it appears that no possible combination of inputs specified by the instructions produces the expected values for pred.

I’m not sure what I’m missing but clearly I’m missing something. Again, I restarted the kernel, cleared all output, ran all cells, and all unit tests pass.

At least I can click on later cells and it actually generates music. It is kind of short and sad, though.

paulinpaloalto · January 21, 2024, 2:42am

Well, maybe it’s time to just look at your code. We aren’t supposed to do that in a public way, but there is a private method. I’ll send you a DM about how to proceed with that.

sheff · January 21, 2024, 3:04am

Thanks for your help, @paulinpaloalto. I found the source of the problem: an incorrect definition of the inference model that made it through the unit tests undetected.

paulinpaloalto · January 21, 2024, 3:10am

That’s great news. Nice work. I will take a look and hope that I can file a bug about how to enhance the unit tests to catch that mistake.

Onward!
