C3W1 assignment exercise 6: incompatible shapes for broadcasting: (8, 2), (40, 2)

I’m having issues with Exercise 6 (train_model) in the assignment.
I’m passing all the tests up to Exercise 6, but then I get the shape error below.
This is after implementing tl.Mean(axis=1). Has anyone figured this out?

LayerError: Exception passing through layer Serial (in pure_fn):
layer created in file […]/trax/supervised/training.py, line 1033
layer input shapes: (ShapeDtype{shape:(8, 13), dtype:int32}, ShapeDtype{shape:(40,), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32})

File […]/trax/layers/combinators.py, line 88, in forward
outputs, s = layer.pure_fn(inputs, w, s, rng, use_cache=True)

LayerError: Exception passing through layer CrossEntropyLoss (in pure_fn):
layer created in file […]/, line 12
layer input shapes: (ShapeDtype{shape:(8, 2), dtype:float32}, ShapeDtype{shape:(40,), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32})

File […]/trax/layers/combinators.py, line 88, in forward
outputs, s = layer.pure_fn(inputs, w, s, rng, use_cache=True)

LayerError: Exception passing through layer _CrossEntropy (in pure_fn):
layer created in file […]/, line 12
layer input shapes: (ShapeDtype{shape:(8, 2), dtype:float32}, ShapeDtype{shape:(40,), dtype:int32})

File […]/trax/layers/base.py, line 743, in forward
raw_output = self._forward_fn(inputs)

File […]/trax/layers/base.py, line 784, in _forward
return f(*xs)

File […]/trax/layers/metrics.py, line 581, in f
return -1.0 * jnp.sum(model_output * target_distribution, axis=-1)

File […]/site-packages/jax/core.py, line 506, in mul
def mul(self, other): return self.aval._mul(self, other)

File […]/_src/numpy/lax_numpy.py, line 5819, in deferring_binary_op
return binary_op(self, other)

File […]/_src/numpy/lax_numpy.py, line 431, in fn
return lax_fn(x1, x2) if x1.dtype != bool else bool_lax_fn(x1, x2)

File […]/_src/lax/lax.py, line 348, in mul
return mul_p.bind(x, y)

File […]/site-packages/jax/core.py, line 264, in bind
out = top_trace.process_primitive(self, tracers, params)

File […]/jax/interpreters/ad.py, line 274, in process_primitive
primal_out, tangent_out = jvp(primals_in, tangents_in, **params)

File […]/jax/interpreters/ad.py, line 449, in standard_jvp
val_out = primitive.bind(*primals, **params)

File […]/site-packages/jax/core.py, line 264, in bind
out = top_trace.process_primitive(self, tracers, params)

File […]/jax/interpreters/partial_eval.py, line 1059, in process_primitive
out_avals = primitive.abstract_eval(*avals, **params)

File […]/_src/lax/lax.py, line 2125, in standard_abstract_eval
return ShapedArray(shape_rule(*avals, **kwargs), dtype_rule(*avals, **kwargs),

File […]/_src/lax/lax.py, line 2221, in _broadcasting_shape_rule
raise TypeError(msg.format(name, ', '.join(map(str, map(tuple, shapes)))))

TypeError: mul got incompatible shapes for broadcasting: (8, 2), (40, 2).


Hi @sunnyjooey

Thanks for reaching out.

Can you share the code where you are getting this error?

Also, looking at the error: mul here is JAX’s element-wise multiplication, so the two operands have to be broadcast-compatible. (8, 2) against (40, 2) means the leading (batch) dimensions disagree: the model output comes from a batch of 8, while the one-hot targets come from a batch of 40. That usually points at the data generator producing inputs and targets of different lengths. As I haven’t seen the code yet, this is my first guess, and might not be true.
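
To see why, here is a minimal sketch that reproduces the failing line from your traceback (the shapes come from the trace; everything else is illustrative):

import jax
import jax.numpy as jnp

model_output = jnp.zeros((8, 2))                  # log-probs for a batch of 8
targets = jnp.zeros((40,), dtype=jnp.int32)       # labels for a batch of 40
target_distribution = jax.nn.one_hot(targets, 2)  # one-hot targets: (40, 2)

# The same expression as trax/layers/metrics.py in your trace; the
# element-wise multiply cannot broadcast (8, 2) against (40, 2):
loss = -1.0 * jnp.sum(model_output * target_distribution, axis=-1)
# TypeError: mul got incompatible shapes for broadcasting: (8, 2), (40, 2)

So transposing won’t fix it; the two batch sizes themselves have to agree.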

If the above doesn’t pin it down, please share your code so that I can help you better. :slight_smile:


This is my train_model:

# UNQ_C6 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: train_model
def train_model(classifier, train_task, eval_task, n_steps, output_dir):
    '''
    Input: 
        classifier - the model you are building
        train_task - Training task
        eval_task - Evaluation task. Received as a list
        n_steps - the number of training steps
        output_dir - folder to save your files
    Output:
        trainer -  trax trainer
    '''
    rnd.seed(31) # Do NOT modify this random seed. This makes the notebook easier to replicate
    
    ### START CODE HERE (Replace instances of 'None' with your code) ###          
    training_loop = training.Loop( 
                                classifier, # The learning model
                                train_task, # The training task
                                eval_tasks=eval_task, # The evaluation task
                                output_dir=output_dir, # The output directory
                                random_seed=31 # Do not modify this random seed in order to ensure reproducibility and for grading purposes.
    ) 

    training_loop.run(n_steps = n_steps)
    ### END CODE HERE ###
    
    # Return the training_loop, since it has the model.
    return training_loop

But I think the issue might be in the classifier:

# UNQ_C5 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: classifier

def classifier(vocab_size=9088, embedding_dim=256, output_dim=2, mode='train'):
    
    ### START CODE HERE (Replace instances of 'None' with your code) ###
        
    # create embedding layer
    embed_layer = tl.Embedding( 
        vocab_size=vocab_size, # Size of the vocabulary
        d_feature=embedding_dim # Embedding dimension
    ) 
    
    # Create a mean layer, to create an "average" word embedding
    mean_layer = tl.Mean(axis = 1)
    
    # Create a dense layer, one unit for each output
    dense_output_layer = tl.Dense(n_units = output_dim)
    
    # Create the log softmax layer (no parameters needed)
    log_softmax_layer = tl.LogSoftmax()
    
    # Use tl.Serial to combine all layers
    # and create the classifier
    # of type trax.layers.combinators.Serial
    model = tl.Serial( 
      embed_layer, # embedding layer
      mean_layer, # mean layer
      dense_output_layer, # dense output layer
      log_softmax_layer # log softmax layer
    ) 
    ### END CODE HERE ###
    
    # return the model of type trax.layers.combinators.Serial
    return model
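
One way to sanity-check the classifier on its own (assuming the usual trax init pattern; the dummy shapes below just mirror the traceback) is to run it on a fake batch:

import numpy as np
from trax import shapes

model = classifier()
fake_batch = np.zeros((8, 13), dtype=np.int32)  # (batch, seq_len), as in the trace
model.init(shapes.signature(fake_batch))
print(model(fake_batch).shape)                  # expect (8, 2): batch dim preserved

If that prints (8, 2), the model keeps the batch dimension, and the 8-vs-40 mismatch would be coming from the batches fed to CrossEntropyLoss rather than from these layers.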

Thank you!


I am also facing a similar issue with Exercise 6.


Any general advice on this issue? Mine is (mis)behaving similarly:

LayerError: Exception passing through layer Serial (in pure_fn):
  layer created in file [...]/trax/supervised/training.py, line 1033
  layer input shapes: (ShapeDtype{shape:(8, 13), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32}, ShapeDtype{shape:(16,), dtype:int32})
...


TypeError: mul got incompatible shapes for broadcasting: (8, 2), (16, 2).

The 16 seems to be derived from the batch size. I assume there is a Transpose missing somewhere, but I don’t see it, and the unit tests above all passed (for what that’s worth).

I’m a pretty experienced debugger, but I don’t see an obvious way forward here :frowning:


I don’t remember exactly, but I think it was something in one of the earlier functions that wasn’t quite producing the right shape. Use reshape with -1 whenever possible (see the sketch below), so the batch dimension is inferred from the data instead of hard-coded. Also, if you know you have the right answer but a particular test isn’t passing, try submitting anyway (I first did this out of pure frustration); sometimes the submission grader will pass you even though the individual notebook test failed.
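
A minimal illustration of the reshape(-1) habit (the array name here is hypothetical):

import numpy as np

targets = np.array([0, 1, 1, 0, 1, 0, 0, 1])  # a batch of 8 labels
# Hard-coding the batch size breaks as soon as the batch changes:
# targets.reshape(40)                         # ValueError for any batch != 40
# Letting numpy infer the dimension keeps the shape tied to the data:
targets = targets.reshape(-1)                 # shape (8,) for any batch size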


@sunnyjooey, appreciate the reply. I ended up fixing a couple of things in that file. One problem was that I had ignored this comment

        # Using the same batch list, start from neg_index and increment i up to n_to_take

and started the loop iteration from 0 instead of continuing from neg_index; the general pattern is sketched below.
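
For anyone hitting the same thing, the mistake looked roughly like this (the variable names are hypothetical stand-ins, not the notebook’s actual code):

neg_examples = list(range(100))  # hypothetical pool of negative examples
neg_index, n_to_take = 8, 4      # hypothetical generator state

# Buggy: restarting at 0 reuses the same examples every batch
buggy = [neg_examples[i] for i in range(0, n_to_take)]

# Intended, per the comment: start from neg_index and walk forward
fixed = [neg_examples[i] for i in range(neg_index, neg_index + n_to_take)]

print(buggy, fixed)  # [0, 1, 2, 3] [8, 9, 10, 11]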

The other, more insidious problem was that I had copied this call from an ungraded cell

compute_accuracy(preds=tmp_pred, y=tmp_targets, y_weights=tmp_example_weights)

into a graded cell where we were also asked to compute accuracy. That meant the notebook-global variables tmp_pred, tmp_targets, and tmp_example_weights were referenced inside my graded function. I hope you never personally experience this, but the grader is really not happy with that situation.
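
The fix was simply to use the graded function’s own arguments instead of the globals. A sketch (the signature and the accuracy helper below are my stand-ins, not the notebook’s exact code):

import numpy as np

def compute_accuracy(preds, y, y_weights):
    # stand-in for the notebook's accuracy helper
    return np.average(np.argmax(preds, axis=-1) == y, weights=y_weights)

def graded_step(preds, y, y_weights):
    # Buggy: referencing globals copied from the ungraded cell, e.g.
    #   compute_accuracy(preds=tmp_pred, y=tmp_targets, y_weights=tmp_example_weights)
    # Fixed: use the arguments that were actually passed in
    return compute_accuracy(preds=preds, y=y, y_weights=y_weights)

print(graded_step(np.array([[0.2, 0.8], [0.9, 0.1]]), np.array([1, 0]), np.array([1.0, 1.0])))  # 1.0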

Appreciate the help so long after you completed the exercise. Hope this thread helps others, too.

No problem! I can’t access my code anymore, so I can’t take a look (should’ve saved it somewhere). But yes, hope this helps others!

Same issue here! Any updates?

Where is the admin? This issue seems persistent! Can we get any updates?!
