Problem with Course 3 week 1 exercise 6: Incompatible shapes

Hi,

I'm having trouble solving Exercise 6 due to incompatible shapes. All previous tests in the notebook pass and all example outputs match. Could you please help?

Here is the error stack:

---------------------------------------------------------------------------
LayerError                                Traceback (most recent call last)
<ipython-input-47-9216004d763d> in <module>
      2 # Take a look on how the eval_task is inside square brackets and
      3 # take that into account for you train_model implementation
----> 4 training_loop = train_model(model, train_task, [eval_task], 100, output_dir_expand)

<ipython-input-46-918bf0dfcb18> in train_model(classifier, train_task, eval_task, n_steps, output_dir)
     20                                 eval_tasks=eval_task, # The evaluation task
     21                                 output_dir=output_dir, # The output directory
---> 22                                 random_seed=31 # Do not modify this random seed in order to ensure reproducibility and for grading purposes.
     23     ) 
     24 

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in __init__(self, model, tasks, eval_model, eval_tasks, output_dir, checkpoint_at, checkpoint_low_metric, checkpoint_high_metric, permanent_checkpoint_at, eval_at, which_task, n_devices, random_seed, loss_chunk_size, use_memory_efficient_trainer, adasum, callbacks)
    305     self._rjust_len = max(map(len, loss_names + metric_names))
    306     self._evaluator_per_task = tuple(
--> 307         self._init_evaluator(eval_task) for eval_task in self._eval_tasks)
    308 
    309     if self._output_dir is None:

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in <genexpr>(.0)
    305     self._rjust_len = max(map(len, loss_names + metric_names))
    306     self._evaluator_per_task = tuple(
--> 307         self._init_evaluator(eval_task) for eval_task in self._eval_tasks)
    308 
    309     if self._output_dir is None:

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in _init_evaluator(self, eval_task)
    364     """Initializes the per-task evaluator."""
    365     model_with_metrics = _model_with_metrics(
--> 366         self._eval_model, eval_task)
    367     if self._use_memory_efficient_trainer:
    368       return _Evaluator(

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in _model_with_metrics(model, eval_task)
   1047   """
   1048   return _model_with_ends(
-> 1049       model, eval_task.metrics, shapes.signature(eval_task.sample_batch)
   1050   )
   1051 

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in _model_with_ends(model, end_layers, batch_signature)
   1028   # TODO(jonni): Redo this function as part of an initialization refactor?
   1029   metrics_layer = tl.Branch(*end_layers)
-> 1030   metrics_input_signature = model.output_signature(batch_signature)
   1031   _, _ = metrics_layer.init(metrics_input_signature)
   1032 

/opt/conda/lib/python3.7/site-packages/trax/layers/base.py in output_signature(self, input_signature)
    608   def output_signature(self, input_signature):
    609     """Returns output signature this layer would give for `input_signature`."""
--> 610     return self._forward_abstract(input_signature)[0]  # output only, not state
    611 
    612   def _forward_abstract(self, input_signature):

/opt/conda/lib/python3.7/site-packages/trax/layers/base.py in _forward_abstract(self, input_signature)
    640       name, trace = self._name, _short_traceback(skip=7)
    641       raise LayerError(name, '_forward_abstract', self._caller, input_signature,
--> 642                        trace) from None
    643 
    644   # pylint: disable=protected-access

LayerError: Exception passing through layer Serial (in _forward_abstract):
  layer created in file [...]/<ipython-input-31-18a755dca760>, line 29
  layer input shapes: (ShapeDtype{shape:(16, 13), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32})

  File [...]/jax/interpreters/partial_eval.py, line 411, in abstract_eval_fun
    lu.wrap_init(fun, params), avals, debug_info)

  File [...]/jax/interpreters/partial_eval.py, line 1252, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(fun, main, in_avals)

  File [...]/jax/interpreters/partial_eval.py, line 1262, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers)

  File [...]/site-packages/jax/linear_util.py, line 166, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))

  File [...]/site-packages/jax/linear_util.py, line 166, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))

LayerError: Exception passing through layer Serial (in pure_fn):
  layer created in file [...]/<ipython-input-31-18a755dca760>, line 29
  layer input shapes: (ShapeDtype{shape:(16, 13), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32})

  File [...]/trax/layers/combinators.py, line 88, in forward
    outputs, s = layer.pure_fn(inputs, w, s, rng, use_cache=True)

LayerError: Exception passing through layer Dense_2 (in pure_fn):
  layer created in file [...]/<ipython-input-31-18a755dca760>, line 17
  layer input shapes: ShapeDtype{shape:(16, 13), dtype:float32}

  File [...]/trax/layers/assert_shape.py, line 122, in forward_wrapper
    y = forward(self, x, *args, **kwargs)

  File [...]/trax/layers/core.py, line 96, in forward
    return jnp.dot(x, w) + b  # Affine map.

  File [...]/_src/numpy/lax_numpy.py, line 4105, in dot
    return lax.dot(a, b, precision=precision)

  File [...]/_src/lax/lax.py, line 664, in dot
    lhs.shape, rhs.shape))

TypeError: Incompatible shapes for dot: got (16, 13) and (15, 2).



Hi @maxhare

Have you specified the axis argument in tl.Mean() when creating the model in Ex. 5?


Hi @arvyzukai

No, I used the default, i.e., axis=-1. What should I use instead?

Ah, okay, it needs to be axis=1. What does axis=-1 average over here? Batches?

  • axis=-1 (rightmost axis): averages over the embedding dimension
  • axis=1 (middle axis): averages over the sequence
  • axis=0 (leftmost axis): averages over the batch dimension
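
The difference can be sketched with plain numpy (the shapes here are hypothetical, assuming a (batch, seq_len, embedding) tensor):

```python
import numpy as np

# Hypothetical batch: 16 examples, 13 tokens each, 256-dim embeddings.
x = np.zeros((16, 13, 256))

print(np.mean(x, axis=-1).shape)  # (16, 13)  - averages over embeddings
print(np.mean(x, axis=1).shape)   # (16, 256) - averages over the sequence
print(np.mean(x, axis=0).shape)   # (13, 256) - averages over the batch
```

With axis=-1 the embedding dimension is averaged away, so the downstream Dense layer receives a (batch, seq_len) tensor instead of a (batch, embedding) tensor, which is exactly the kind of mismatch the traceback reports.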

One thing to note is that understanding tensors (what they represent) and their dimensions is a crucial part of learning.

For example, if you have a tensor of shape (8, 32, 128), knowing what each dimension represents is very important; activation functions, optimizers, etc. do not matter if you don't know what happens to your information (the "thing" that represents the dataset for your model).
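
As a small illustration of how a shape mismatch surfaces, here is a numpy sketch using the hypothetical (8, 32, 128) tensor from above and a made-up Dense weight of shape (128, 2):

```python
import numpy as np

# Hypothetical tensor: batch of 8, sequence length 32, embedding size 128.
x = np.zeros((8, 32, 128))
w = np.zeros((128, 2))  # a Dense-style weight expecting 128 input features

# Averaging over the wrong axis removes the feature dimension:
wrong = x.mean(axis=-1)  # shape (8, 32)  - embeddings averaged away
right = x.mean(axis=1)   # shape (8, 128) - sequence averaged away

print((right @ w).shape)  # (8, 2): the matmul lines up
try:
    wrong @ w             # (8, 32) @ (128, 2): incompatible shapes
except ValueError as e:
    print(e)
```

This is the same failure mode as the TypeError in the traceback, just reproduced outside of Trax.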

As you continue the course, understanding the shapes will become even more important with more complicated models.

Cheers


Thank you for your fast reply.

I got the exact same error and found the answer here. Thanks a lot!

Thank you!
It would be nice if this were checked by the unit test for the code block it's part of.
