TypeError in Course 3 Week 2 Exercise 5

I passed the previous 4 tests, but I got an error in Exercise 5. The error seems to come from the model in Exercise 4: there is a shape issue that I have not been able to figure out. Help, please!


LayerError Traceback (most recent call last)
in
      4 model.init_from_file('model.pkl.gz')
      5 batch = next(data_generator(batch_size, max_length, lines, shuffle=False))
----> 6 preds = model(batch[0])
      7 log_ppx = test_model(preds, batch[1])
      8 print('The log perplexity and perplexity of your model are respectively', log_ppx, np.exp(log_ppx))

/opt/conda/lib/python3.7/site-packages/trax/layers/base.py in __call__(self, x, weights, state, rng)
    195     self.state = state  # Needed if the model wasn't fully initialized.
    196     state = self.state
--> 197     outputs, new_state = self.pure_fn(x, weights, state, rng)
    198     self.state = new_state
    199     return outputs

/opt/conda/lib/python3.7/site-packages/trax/layers/base.py in pure_fn(self, x, weights, state, rng, use_cache)
    604     name, trace = self._name, _short_traceback(skip=3)
    605     raise LayerError(name, 'pure_fn',
--> 606                      self._caller, signature(x), trace) from None
    607
    608   def output_signature(self, input_signature):

LayerError: Exception passing through layer Serial (in pure_fn):
layer created in file […]/, line 21
layer input shapes: ShapeDtype{shape:(32, 64), dtype:int32}

File […]/trax/layers/combinators.py, line 88, in forward
outputs, s = layer.pure_fn(inputs, w, s, rng, use_cache=True)

LayerError: Exception passing through layer Dense_256 (in pure_fn):
layer created in file […]/, line 20
layer input shapes: ShapeDtype{shape:(32, 64, 512), dtype:float32}

File […]/trax/layers/assert_shape.py, line 122, in forward_wrapper
y = forward(self, x, *args, **kwargs)

File […]/trax/layers/core.py, line 96, in forward
return jnp.dot(x, w) + b # Affine map.

File […]/_src/numpy/lax_numpy.py, line 4112, in dot
return lax.dot_general(a, b, (contract_dims, batch_dims), precision)

File […]/_src/lax/lax.py, line 702, in dot_general
preferred_element_type=preferred_element_type)

File […]/site-packages/jax/core.py, line 264, in bind
out = top_trace.process_primitive(self, tracers, params)

File […]/site-packages/jax/core.py, line 603, in process_primitive
return primitive.impl(*tracers, **params)

File […]/jax/interpreters/xla.py, line 248, in apply_primitive
compiled_fun = xla_primitive_callable(prim, *unsafe_map(arg_spec, args), **params)

File […]/jax/_src/util.py, line 186, in wrapper
return cached(config._trace_context(), *args, **kwargs)

File […]/jax/_src/util.py, line 179, in cached
return f(*args, **kwargs)

File […]/jax/interpreters/xla.py, line 272, in xla_primitive_callable
aval_out = prim.abstract_eval(*avals, **params)

File […]/_src/lax/lax.py, line 2125, in standard_abstract_eval
return ShapedArray(shape_rule(*avals, **kwargs), dtype_rule(*avals, **kwargs),

File […]/_src/lax/lax.py, line 3391, in _dot_general_shape_rule
raise TypeError(msg.format(lhs_contracting_shape, rhs_contracting_shape))

TypeError: dot_general requires contracting dimensions to have the same shape, got [512] and [1024].
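For reference, here is a minimal snippet (toy shapes copied from the traceback, not my actual layers) that raises the same TypeError: jnp.dot contracts the last axis of the activations against the first axis of the Dense kernel, so a kernel built for 1024-dim inputs cannot be applied to 512-dim activations.

import jax.numpy as jnp

x = jnp.zeros((32, 64, 512))   # activations: (batch, max_length, d_model), as in the traceback
w = jnp.zeros((1024, 256))     # a Dense kernel that expects 1024-dim inputs instead of 512
try:
    jnp.dot(x, w)              # contracting dims: 512 (last axis of x) vs 1024 (first axis of w)
except TypeError as e:
    print(e)                   # same "contracting dimensions" complaint as above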

I guess 512 is the depth of the embeddings, but I have no idea where 1024 comes from. Here is the code for my model:

[code removed - moderator]

Please click my name and message your notebook as an attachment.

@balaji.ambresh - Any hints on the above? I hit the same error.

@Sanjay_Govindan
Please click my name and message your notebook as an attachment.

@Sanjay_Govindan
[MyClass()] * 10 produces a list containing 10 references to a single MyClass object. This is not what we want: we want each entry to be a distinct object. For that, use [MyClass() for _ in range(10)].

Here’s an example of the first kind:

>>> l = [[1,2,3]] * 3
>>> l
[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
>>> l[0].pop()
3
>>> l
[[1, 2], [1, 2], [1, 2]]
>>> 

This is the right approach for this assignment:

>>> l = [[1,2,3] for _ in range(3)]
>>> l
[[1, 2, 3], [1, 2, 3], [1, 2, 3]]
>>> l[0].pop()
3
>>> l
[[1, 2], [1, 2, 3], [1, 2, 3]]
>>> 

With this hint, please fix the model construction with respect to GRU layers.
Don’t forget to use the mode parameter for the GRU layers.
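The same thing applies to trax layer objects. A tiny sketch, using tl.Dense as a stand-in so as not to spoil the GRU part of the exercise:

import trax.layers as tl

shared = [tl.Dense(4)] * 3                     # three references to ONE layer object
print(shared[0] is shared[2])                  # True -> the "stacked" layers would share weights
distinct = [tl.Dense(4) for _ in range(3)]     # three independent layer objects
print(distinct[0] is distinct[2])              # False -> each layer gets its own weights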


I really wish they would update the unit tests so these things could be caught earlier.


Hi @Mubsi, in Week 2 I am getting 2 small errors: in Exercise 4, I get an unexpected keyword argument 'eval_task', and in Exercise 5, 3 tests pass and 6 fail. No clue why. Please help with this; I would be really thankful.

My lab id is aoxrmxnwtnpy.

Hi @Riddhima_Sobti,

In your Ex 4, I believe you made a typo. The parameter is eval_tasks (plural), not eval_task. And you need to pass your eval task as a list.
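Roughly, the call should look like this (a sketch with placeholder variable names, assuming the usual model, train_task, eval_task and output_dir from the notebook):

from trax.supervised import training

training_loop = training.Loop(model,
                              train_task,
                              eval_tasks=[eval_task],  # note: plural keyword, and the value is a list
                              output_dir=output_dir)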

For your Ex 5, as you were already cautioned about using solution code that is not your own, here is your code for Ex 5:

def test_model(preds, target):
    """Function to test the model.

    Args:
        preds (jax.interpreters.xla.DeviceArray): Predictions of a list of batches of tensors corresponding to lines of text.
        target (jax.interpreters.xla.DeviceArray): Actual list of batches of tensors corresponding to lines of text.

    Returns:
        float: log_perplexity of the model.
    """
    ### START CODE HERE ###

    total_log_ppx = ### YOUR CODE # HINT: tl.one_hot() should replace one of the Nones

    non_pad = ### YOUR CODE          # You should check if the target equals 0
    ppx = ### YOUR CODE                       # Get rid of the padding

    log_ppx = ### YOUR CODE
    
    ### END CODE HERE ###
    
    return -log_ppx

And this is the skeleton code we have provided for this exercise:

def test_model(preds, target):
    """Function to test the model.

    Args:
        preds (jax.interpreters.xla.DeviceArray): Predictions of a list of batches of tensors corresponding to lines of text.
        target (jax.interpreters.xla.DeviceArray): Actual list of batches of tensors corresponding to lines of text.

    Returns:
        float: log_perplexity of the model.
    """
    ### START CODE HERE ###

    log_p = np.sum(None * None, axis= -1) # HINT: tl.one_hot() should replace one of the Nones

    non_pad = 1.0 - np.equal(None, None)          # You should check if the target equals 0
    log_p = None * None                             # Get rid of the padding    
    
    log_ppx = np.sum(None, None) / np.sum(None, None) # Remember to set the axis properly when summing up
    log_ppx = np.mean(None) # Compute the mean of the previous expression
    
    
    ### END CODE HERE ###
    
    return -log_ppx

Do you see the difference?

I'd suggest getting a fresh copy of the notebook by following the instructions and trying again; this time, pay close attention to all of the exercise instructions and hints provided in the notebook.
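One more pointer on the hint in the skeleton: tl.one_hot simply turns integer token ids into one-hot vectors. A tiny toy example (with n_categories standing in for the vocabulary size):

import numpy as np
import trax.layers as tl

targets = np.array([[2, 0, 1]])              # toy batch of token ids, shape (batch, length)
print(tl.one_hot(targets, n_categories=3))   # shape (1, 3, 3): one one-hot row per token id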

Best,
Mubsi

Yes, sure. Thank you. Will do so.