W1A3 jazz improvisation exercise 1 - implement djmodel()

In exercise 1 we are asked to implement the djmodel. I am getting the following error during the unit test.

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-36-2ad560a25ebc> in <module>
      3 # UNIT TEST
      4 output = summary(model)
----> 5 comparator(output, djmodel_out)

~/work/W1A3/test_utils.py in comparator(learner, instructor)
     18 def comparator(learner, instructor):
     19     if len(learner) != len(instructor):
---> 20         raise AssertionError("Error in test. The lists contain a different number of elements")
     21     for index, a in enumerate(instructor):
     22         b = learner[index]

AssertionError: Error in test. The lists contain a different number of elements

I believe I might be making a mistake when slicing X. I’ve been using x = X[t,:] but I believe I might not be doing it right or considering the true shape of the input.

That error is telling you that the number of layers in your model is incorrect, so it’s probably something more fundamental than the slicing of X.

The cell after the one that just “threw” prints out your model but the format is a bit ornate. Here’s a cell that I added to my notebook to print the “summary” of the expected model and my model with the layers numbered to make it a bit easier to see what is going on:

print("Generated model:")
for index, a in enumerate(summary(model)):
    print(f"layer {index}: {a}")
print("Expected model:")
for index, a in enumerate(djmodel_out):
    print(f"layer {index}: {a}")
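If the two printouts are long, it can help to list only the positions where they disagree. The following is a minimal sketch of that idea. The helper name `diff_summaries` is hypothetical, and the short lists are stand-ins for `summary(model)` and `djmodel_out` so that the snippet runs without the notebook.

```python
def diff_summaries(generated, expected):
    """Return (index, generated_entry, expected_entry) for every mismatch.

    A missing entry on either side is reported as None, so lists of
    different lengths are handled as well.
    """
    mismatches = []
    for index in range(max(len(generated), len(expected))):
        g = generated[index] if index < len(generated) else None
        e = expected[index] if index < len(expected) else None
        if g != e:
            mismatches.append((index, g, e))
    return mismatches


# Stand-in data (assumption): in the notebook these would be
# summary(model) and djmodel_out.
generated = [
    ["InputLayer", [(None, 30, 90)], 0],
    ["TensorFlowOpLayer", [(None, 90)], 0],
    ["TensorFlowOpLayer", [(None, 90)], 0],
]
expected = [
    ["InputLayer", [(None, 30, 90)], 0],
    ["TensorFlowOpLayer", [(None, 90)], 0],
    ["Reshape", (None, 1, 90), 0],
]

mismatches = diff_summaries(generated, expected)
for index, g, e in mismatches:
    print(f"layer {index}: generated {g} != expected {e}")
```

An empty result means the two summaries agree entry by entry.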

Thanks Paul, I’m seeing some of the expected layers in my generated model but in an incorrect order.

By looking at layer 1, I see that the shape of x is not what is expected. The notes indicate that the shape of the slice should be (n_values,), but I’m having difficulty producing that by just slicing the tensor when the shape of X is (None, Tx, n_values). I suspect we used other methods to rearrange matrices in other labs, but we’re working with tensors here, so I’m unsure whether slicing is the right way. [Edited: by slicing X[:,t,:] I can produce the expected shape.]
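The difference between the two slices can be seen with a plain-Python analogue. This is only a sketch: nested lists stand in for the tensor, the batch size `m = 4` is an arbitrary assumption replacing the `None` dimension, and the `shape` helper is hypothetical.

```python
def shape(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


m, Tx, n_values = 4, 30, 90  # m is an assumed batch size
X = [[[0.0] * n_values for _ in range(Tx)] for _ in range(m)]
t = 3

# Analogue of X[t, :]: indexes the FIRST axis, i.e. picks example t
# and keeps all of its timesteps.
first_axis_slice = X[t]

# Analogue of X[:, t, :]: keeps every example and picks timestep t.
timestep_slice = [example[t] for example in X]

print(shape(X))                 # (4, 30, 90)
print(shape(first_axis_slice))  # (30, 90)
print(shape(timestep_slice))    # (4, 90)
```

The last shape corresponds to the (None, 90) entries in the expected summary, with the batch dimension kept in front.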

The order of appearance of the layers is off and I am also unsure how to troubleshoot this.

Layer Generated Model Expected Model
0 [‘InputLayer’, [(None, 30, 90)], 0] [‘InputLayer’, [(None, 30, 90)], 0]
1 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
2 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘Reshape’, (None, 1, 90), 0]
3 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘InputLayer’, [(None, 64)], 0]
4 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘InputLayer’, [(None, 64)], 0]
5 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
6 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘LSTM’, [(None, 64), (None, 64), (None, 64)], 39680, [(None, 1, 90), (None, 64), (None, 64)], ‘tanh’]
7 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
8 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
9 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
10 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
11 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
12 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
13 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
14 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
15 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
16 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
17 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
18 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
19 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
20 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
21 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
22 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
23 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
24 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
25 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
26 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
27 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
28 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
29 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
30 [‘TensorFlowOpLayer’, [(None, 90)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
31 [‘Reshape’, (None, 1, 90), 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
32 [‘InputLayer’, [(None, 64)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
33 [‘InputLayer’, [(None, 64)], 0] [‘TensorFlowOpLayer’, [(None, 90)], 0]
34 [‘LSTM’, [(None, 64), (None, 64), (None, 64)], 39680, [(None, 1, 90), (None, 64), (None, 64)], ‘tanh’] [‘TensorFlowOpLayer’, [(None, 90)], 0]
35 [‘Dense’, (None, 90), 5850, ‘softmax’] [‘Dense’, (None, 90), 5850, ‘softmax’]

Yes, it looks like you have the correct slicing by using the timestep dimension. The problem with the order of the layers means that your compute graph is incorrect: the layer operations are connecting the inputs and outputs in the wrong order. I hope that you have not changed the fundamental structure of the “for” loop that they gave you in the template code. One thing to check is that you have the correct inputs and outputs on the LSTM invocation: at every timestep in the loop, the input is the state values from the previous timestep, not the initial state values, right?

I’m ruling out changes to the template after reloading the notebook. Therefore, I am looking at the LSTM call and realize that we need to use a and c instead of a0 and c0 because the a and c variables originally have the first assignment of a0 and c0 but will get updated. Thank you, Paul
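The effect of passing a0 and c0 at every step, rather than the updated a and c, can be illustrated with a toy recurrence. This is a sketch only: `toy_cell` is a hypothetical stand-in with made-up arithmetic, not the Keras LSTM, and it is not the assignment's solution code.

```python
def toy_cell(x, a_prev, c_prev):
    """Hypothetical one-step recurrence (NOT the real LSTM equations)."""
    c_next = c_prev + x
    a_next = 0.5 * a_prev + c_next
    return a_next, c_next


def run(xs, a0, c0, carry_state):
    """Apply toy_cell to each input, optionally threading the state."""
    a, c = a0, c0
    outputs = []
    for x in xs:
        if carry_state:
            # State from the previous timestep feeds the next one.
            a, c = toy_cell(x, a, c)
        else:
            # Bug pattern: every step restarts from the initial state.
            a, c = toy_cell(x, a0, c0)
        outputs.append(a)
    return outputs


xs = [1.0, 2.0, 3.0]
chained = run(xs, 0.0, 0.0, carry_state=True)
unchained = run(xs, 0.0, 0.0, carry_state=False)
print(chained)    # [1.0, 3.5, 7.75]
print(unchained)  # [1.0, 2.0, 3.0]
```

In the unchained run no step depends on any earlier step, which is consistent with the explanation above that the operations end up connected differently from the expected graph.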

It’s great to hear that you found the problem based on that suggestion. Onward! :nerd_face:

I’m getting an odd variation of this problem. While X[:, t, :] produces the correct dimensions in the various layers, instead of “TensorFlowOpLayer”, I’m getting layer type “SlicingOpLambda”. I’ve considered the possibility of using tf.slice instead, but the TensorFlow documentation says that the Python X[:, t, :] syntax is acceptable for this operation. It seems so simple that I’m wondering, “How can I miss?”

How are you “getting this layer type”? I don’t totally understand what you mean by that statement.

Can you give more details?

I’ve never seen that SlicingOpLambda layer type before. It is possible to implement that slicing using a Lambda function, but that will fail the unit tests. There should also be no need to use tf.slice. I wrote it using the plain vanilla X[:,t,:] python slicing syntax that you show and it worked for me. :nerd_face:

Thanks for the quick reply (on a weekend, no less)! Here’s the layer information, printed by a modification of Paul’s helpful “Generated model” / “Expected model” code above:

Generated model: Expected model:
layer 0: [‘InputLayer’, [(None, 30, 90)], 0] layer 0: [‘InputLayer’, [(None, 30, 90)], 0]
layer 1: [‘SlicingOpLambda’, (None, 90), 0] layer 1: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 2: [‘Reshape’, (None, 1, 90), 0] layer 2: [‘Reshape’, (None, 1, 90), 0]
layer 3: [‘InputLayer’, [(None, 64)], 0] layer 3: [‘InputLayer’, [(None, 64)], 0]
layer 4: [‘InputLayer’, [(None, 64)], 0] layer 4: [‘InputLayer’, [(None, 64)], 0]
layer 5: [‘SlicingOpLambda’, (None, 90), 0] layer 5: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 6: [‘LSTM’, [(None, 64), (None, 64), (None, 64)], 39680, [(None, 1, 90), (None, 64), (None, 64)], ‘tanh’] layer 6: [‘LSTM’, [(None, 64), (None, 64), (None, 64)], 39680, [(None, 1, 90), (None, 64), (None, 64)], ‘tanh’]
layer 7: [‘SlicingOpLambda’, (None, 90), 0] layer 7: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 8: [‘SlicingOpLambda’, (None, 90), 0] layer 8: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 9: [‘SlicingOpLambda’, (None, 90), 0] layer 9: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 10: [‘SlicingOpLambda’, (None, 90), 0] layer 10: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 11: [‘SlicingOpLambda’, (None, 90), 0] layer 11: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 12: [‘SlicingOpLambda’, (None, 90), 0] layer 12: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 13: [‘SlicingOpLambda’, (None, 90), 0] layer 13: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 14: [‘SlicingOpLambda’, (None, 90), 0] layer 14: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 15: [‘SlicingOpLambda’, (None, 90), 0] layer 15: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 16: [‘SlicingOpLambda’, (None, 90), 0] layer 16: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 17: [‘SlicingOpLambda’, (None, 90), 0] layer 17: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 18: [‘SlicingOpLambda’, (None, 90), 0] layer 18: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 19: [‘SlicingOpLambda’, (None, 90), 0] layer 19: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 20: [‘SlicingOpLambda’, (None, 90), 0] layer 20: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 21: [‘SlicingOpLambda’, (None, 90), 0] layer 21: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 22: [‘SlicingOpLambda’, (None, 90), 0] layer 22: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 23: [‘SlicingOpLambda’, (None, 90), 0] layer 23: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 24: [‘SlicingOpLambda’, (None, 90), 0] layer 24: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 25: [‘SlicingOpLambda’, (None, 90), 0] layer 25: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 26: [‘SlicingOpLambda’, (None, 90), 0] layer 26: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 27: [‘SlicingOpLambda’, (None, 90), 0] layer 27: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 28: [‘SlicingOpLambda’, (None, 90), 0] layer 28: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 29: [‘SlicingOpLambda’, (None, 90), 0] layer 29: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 30: [‘SlicingOpLambda’, (None, 90), 0] layer 30: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 31: [‘SlicingOpLambda’, (None, 90), 0] layer 31: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 32: [‘SlicingOpLambda’, (None, 90), 0] layer 32: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 33: [‘SlicingOpLambda’, (None, 90), 0] layer 33: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 34: [‘SlicingOpLambda’, (None, 90), 0] layer 34: [‘TensorFlowOpLayer’, [(None, 90)], 0]
layer 35: [‘Dense’, (None, 90), 5850, ‘softmax’] layer 35: [‘Dense’, (None, 90), 5850, ‘softmax’]

The model fails the unit test, which reports that the expected value:

[‘TensorFlowOpLayer’, [(None, 90)], 0]

does not match the input value:

[‘SlicingOpLambda’, (None, 90), 0]

Also, when run locally on my Mac, errors are thrown (probably 30 times):

2025-09-21 09:53:08.562142: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor ‘gradients/split_2_grad/concat/split_2/split_dim’ with dtype int32
[[{{node gradients/split_2_grad/concat/split_2/split_dim}}]]

And finally, the coup de grace: when I ran the same code again this morning in the Coursera ipynb, it now passes, whereas it was formerly displaying the [‘SlicingOpLambda’, (None, 90), 0] error above.

While I guess it’s possible that I didn’t really run the passing code on the ipynb before, I was pretty careful. I’m wondering if sometimes it’s good to restart the kernel when inexplicable errors occur.

Yes, that is a good strategy.

Especially in Course 5, because it uses a lot of global objects, and any of your code that modifies those objects can invalidate your previous test results.
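Here is a toy sketch of why leftover state in a shared object can change results between runs. Everything in it is hypothetical: `SharedLayer` is a stand-in class with a simple call counter, not a Keras layer, and creating a fresh instance only emulates what a kernel restart followed by rerunning the setup cells would do.

```python
class SharedLayer:
    """Hypothetical stand-in for an object created once at notebook level."""

    def __init__(self):
        self.call_count = 0

    def __call__(self, x):
        # Each call leaves a trace on the shared object.
        self.call_count += 1
        return x


def build_and_count(layer, steps):
    """Call the shared layer once per timestep and report its total calls."""
    for t in range(steps):
        layer(t)
    return layer.call_count


shared = SharedLayer()                   # created once, in an early cell
first_run = build_and_count(shared, 3)   # 3
second_run = build_and_count(shared, 3)  # 6: state left over from run 1

fresh = SharedLayer()                    # emulates restart + rerun of setup
after_restart = build_and_count(fresh, 3)  # 3 again

print(first_run, second_run, after_restart)
```

The same call gives a different answer on the second run only because the shared object remembers the first one, which is the kind of effect a kernel restart clears.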

Thanks!

Thanks for showing us the better version of the printed output. So it seems like we’ll need to look at your actual code to figure out why the layer you get is different. In my code, it’s a simple assignment statement with x on the LHS and the python slicing expression shown above on the RHS and I get the same layer type that is shown in the “expected” value.

We aren’t supposed to share code publicly, but please check your DMs for a message from me about how to proceed. Well, assuming that my hint above doesn’t just clear things up. But even if it does get you to a solution, I’m genuinely curious to see how you implemented it. Knowing that will help us when questions of this sort come up in the future.

One other important point to make is that it is not a useful exercise to import the code to your local environment and run it. There are no meaningful conclusions that can be drawn from that. All that happens is what you saw: it fails with some “versionitis” issue. The assignments use the versions of TF and all the various support packages that were au courant as of April 2021. A lot has changed in 4 years and backwards compatibility of APIs just isn’t “a thing” in this space, sad to say. In fact, I think we’ve already discussed that point on a different thread of yours a couple of weeks ago.