Problem with Course 3 week 1 exercise 6: Incompatible shapes

Hi,

I'm having trouble solving Exercise 6 due to incompatible shapes. All previous tests in the notebook pass and all example outputs match. Could you please help?

Here is the error stack:

---------------------------------------------------------------------------
LayerError                                Traceback (most recent call last)
<ipython-input-47-9216004d763d> in <module>
      2 # Take a look on how the eval_task is inside square brackets and
      3 # take that into account for you train_model implementation
----> 4 training_loop = train_model(model, train_task, [eval_task], 100, output_dir_expand)

<ipython-input-46-918bf0dfcb18> in train_model(classifier, train_task, eval_task, n_steps, output_dir)
     20                                 eval_tasks=eval_task, # The evaluation task
     21                                 output_dir=output_dir, # The output directory
---> 22                                 random_seed=31 # Do not modify this random seed in order to ensure reproducibility and for grading purposes.
     23     ) 
     24 

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in __init__(self, model, tasks, eval_model, eval_tasks, output_dir, checkpoint_at, checkpoint_low_metric, checkpoint_high_metric, permanent_checkpoint_at, eval_at, which_task, n_devices, random_seed, loss_chunk_size, use_memory_efficient_trainer, adasum, callbacks)
    305     self._rjust_len = max(map(len, loss_names + metric_names))
    306     self._evaluator_per_task = tuple(
--> 307         self._init_evaluator(eval_task) for eval_task in self._eval_tasks)
    308 
    309     if self._output_dir is None:

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in <genexpr>(.0)
    305     self._rjust_len = max(map(len, loss_names + metric_names))
    306     self._evaluator_per_task = tuple(
--> 307         self._init_evaluator(eval_task) for eval_task in self._eval_tasks)
    308 
    309     if self._output_dir is None:

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in _init_evaluator(self, eval_task)
    364     """Initializes the per-task evaluator."""
    365     model_with_metrics = _model_with_metrics(
--> 366         self._eval_model, eval_task)
    367     if self._use_memory_efficient_trainer:
    368       return _Evaluator(

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in _model_with_metrics(model, eval_task)
   1047   """
   1048   return _model_with_ends(
-> 1049       model, eval_task.metrics, shapes.signature(eval_task.sample_batch)
   1050   )
   1051 

/opt/conda/lib/python3.7/site-packages/trax/supervised/training.py in _model_with_ends(model, end_layers, batch_signature)
   1028   # TODO(jonni): Redo this function as part of an initialization refactor?
   1029   metrics_layer = tl.Branch(*end_layers)
-> 1030   metrics_input_signature = model.output_signature(batch_signature)
   1031   _, _ = metrics_layer.init(metrics_input_signature)
   1032 

/opt/conda/lib/python3.7/site-packages/trax/layers/base.py in output_signature(self, input_signature)
    608   def output_signature(self, input_signature):
    609     """Returns output signature this layer would give for `input_signature`."""
--> 610     return self._forward_abstract(input_signature)[0]  # output only, not state
    611 
    612   def _forward_abstract(self, input_signature):

/opt/conda/lib/python3.7/site-packages/trax/layers/base.py in _forward_abstract(self, input_signature)
    640       name, trace = self._name, _short_traceback(skip=7)
    641       raise LayerError(name, '_forward_abstract', self._caller, input_signature,
--> 642                        trace) from None
    643 
    644   # pylint: disable=protected-access

LayerError: Exception passing through layer Serial (in _forward_abstract):
  layer created in file [...]/<ipython-input-31-18a755dca760>, line 29
  layer input shapes: (ShapeDtype{shape:(16, 13), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32})

  File [...]/jax/interpreters/partial_eval.py, line 411, in abstract_eval_fun
    lu.wrap_init(fun, params), avals, debug_info)

  File [...]/jax/interpreters/partial_eval.py, line 1252, in trace_to_jaxpr_dynamic
    jaxpr, out_avals, consts = trace_to_subjaxpr_dynamic(fun, main, in_avals)

  File [...]/jax/interpreters/partial_eval.py, line 1262, in trace_to_subjaxpr_dynamic
    ans = fun.call_wrapped(*in_tracers)

  File [...]/site-packages/jax/linear_util.py, line 166, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))

  File [...]/site-packages/jax/linear_util.py, line 166, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))

LayerError: Exception passing through layer Serial (in pure_fn):
  layer created in file [...]/<ipython-input-31-18a755dca760>, line 29
  layer input shapes: (ShapeDtype{shape:(16, 13), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32})

  File [...]/trax/layers/combinators.py, line 88, in forward
    outputs, s = layer.pure_fn(inputs, w, s, rng, use_cache=True)

LayerError: Exception passing through layer Dense_2 (in pure_fn):
  layer created in file [...]/<ipython-input-31-18a755dca760>, line 17
  layer input shapes: ShapeDtype{shape:(16, 13), dtype:float32}

  File [...]/trax/layers/assert_shape.py, line 122, in forward_wrapper
    y = forward(self, x, *args, **kwargs)

  File [...]/trax/layers/core.py, line 96, in forward
    return jnp.dot(x, w) + b  # Affine map.

  File [...]/_src/numpy/lax_numpy.py, line 4105, in dot
    return lax.dot(a, b, precision=precision)

  File [...]/_src/lax/lax.py, line 664, in dot
    lhs.shape, rhs.shape))

TypeError: Incompatible shapes for dot: got (16, 13) and (15, 2).



Hi @maxhare

Have you specified the axis argument in tl.Mean() when creating the model in Ex. 5?


Hi @arvyzukai

No, I used the default, i.e., axis=-1. What should I use instead?

Ah, okay, it needs to be axis=1. What does axis=-1 average over here? Batches?

  • axis=-1 (rightmost axis): averages over the embedding dimension
  • axis=1 (middle axis): averages over the sequence
  • axis=0 (leftmost axis): averages over the batch dimension
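
The difference can be sketched with plain numpy (the shapes here are hypothetical, assuming a (batch, seq_len, embedding) tensor):

```python
import numpy as np

# Hypothetical batch: 16 examples, 13 tokens each, 256-dim embeddings.
x = np.zeros((16, 13, 256))

print(np.mean(x, axis=-1).shape)  # (16, 13)  - averages over embeddings
print(np.mean(x, axis=1).shape)   # (16, 256) - averages over the sequence
print(np.mean(x, axis=0).shape)   # (13, 256) - averages over the batch
```

With axis=-1 the embedding dimension is averaged away, so the downstream Dense layer receives a (batch, seq_len) tensor instead of a (batch, embedding) tensor, which is exactly the kind of mismatch the traceback reports.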

One thing to note is that understanding tensors (what they represent) and their dimensions is a crucial part of learning.

For example, if you have a tensor of shape (8, 32, 128), knowing what each dimension represents is very important; activation functions, optimizers, etc. do not matter if you don't know what happens to your information (the "thing" that represents the dataset for your model).
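
As a small illustration of how a shape mismatch surfaces, here is a numpy sketch using the hypothetical (8, 32, 128) tensor from above and a made-up Dense weight of shape (128, 2):

```python
import numpy as np

# Hypothetical tensor: batch of 8, sequence length 32, embedding size 128.
x = np.zeros((8, 32, 128))
w = np.zeros((128, 2))  # a Dense-style weight expecting 128 input features

# Averaging over the wrong axis removes the feature dimension:
wrong = x.mean(axis=-1)  # shape (8, 32)  - embeddings averaged away
right = x.mean(axis=1)   # shape (8, 128) - sequence averaged away

print((right @ w).shape)  # (8, 2): the matmul lines up
try:
    wrong @ w             # (8, 32) @ (128, 2): incompatible shapes
except ValueError as e:
    print(e)
```

This is the same failure mode as the TypeError in the traceback, just reproduced outside of Trax.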

As you continue the course, understanding the shapes will become even more important with more complicated models.

Cheers


Thank you for your fast reply.

I got the exact same error and found the answer here. Thanks a lot!

Thank you!
It would be nice if this were checked by the unit test for the code block it's part of.
