C3W1 - Assignment: Assertion error (Wrong length...) for `create_batch_dataset` unit tests

Hi, I’m working on the assignment for Week 1 of the “NLP with Sequence Models” course.

While trying to pass the first “serious” unit tests for the function create_batch_dataset, I’m getting a strange error which I’m struggling to understand.

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[31], line 2
      1 # UNIT TEST
----> 2 w1_unittest.test_create_batch_dataset(create_batch_dataset)

File /tf/w1_unittest.py:55, in test_create_batch_dataset(target)
     53 exp_shape = (BATCH_SIZE, SEQ_LENGTH)
     54 outputs = dataset.take(1)
---> 55 assert len(outputs) > 0, f"Wrong length. First batch must have 1 element. Got {len(outputs)}"
     56 for in_line, out_line in dataset.take(1):
     57     assert tf.is_tensor(in_line), "Wrong type. in_line extected to be a Tensor"

AssertionError: Wrong length. First batch must have 1 element. Got 

My function seems to be “okish”, as the previous informal test cell worked as expected, giving reasonable results… but it’s obviously not completely correct.

Thank you for your help.

Hi @castarco

Refer to the comment below to resolve your issue. The problem first appears in the line_to_tensor graded cell: check whether you convert the characters to a tensor using the correct tf function.

Let me know if you still want any assistance.

Regards
DP

Hi @Deepti_Prasad . Thank you for your response.

Unfortunately, I don’t see where these instructions differ from what I already did :frowning: .

This error depends on whether you applied the buffer size and shuffling correctly. If you did, then the error tells you that the line was converted to a tensor incorrectly. For this, refer to section 1.3 Convert Line to tensor.

Check the comment below.

Regards
DP

Hi @Deepti_Prasad .

Thank you for your time.

In the end I found what it was, a stupid mistake on my side, but it was neither of those (I passed the same values to the first and second calls to the .batch method).

Best regards.

Hi @castarco, I’m running into the same issue. Did you pass seq_length+1 into both calls of the .batch method? Why should the two calls be given different values?

Best

@roses_r_red The two calls to .batch(*) represent two very different “intents”.

If I recall correctly, one was to create sentences of a certain length, and the other to create batches of samples (so all these samples can be treated as a big matrix or tensor, which makes it easier to parallelize/vectorize operations on them and to apply stochastic gradient descent).
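To make the two intents concrete, here is a minimal sketch in plain Python rather than tf.data. The `batch` helper, the constants, and the toy data are all hypothetical, and it assumes incomplete groups are dropped (which is what `drop_remainder=True` does in `tf.data.Dataset.batch`). It is not the assignment's solution, only an illustration of why the two calls need different values.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Hypothetical helper: group items into lists of exactly `size`,
    silently dropping an incomplete final group."""
    chunk: List[T] = []
    for item in items:
        chunk.append(item)
        if len(chunk) == size:
            yield chunk
            chunk = []


SEQ_LENGTH = 4  # hypothetical value, for illustration only
BATCH_SIZE = 2  # hypothetical value, for illustration only

# Stand-in for a stream of character IDs.
token_ids = list(range(20))

# First call: cut the stream into sequences of SEQ_LENGTH + 1 elements
# (the extra element leaves room to split each one into input and target).
sequences = list(batch(token_ids, SEQ_LENGTH + 1))

# Second call: group whole sequences into training batches.
batches = list(batch(sequences, BATCH_SIZE))

# The mistake: reusing SEQ_LENGTH + 1 for the second call. There are only
# 4 sequences, a group needs 5, so nothing is ever emitted.
wrong = list(batch(sequences, SEQ_LENGTH + 1))

print(len(sequences), len(batches), len(wrong))
```

With these toy numbers, the second call given the wrong size produces no batches at all, which is one way a check like `len(outputs) > 0` could fail even though each step looks reasonable in isolation.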

It helps to try to identify the steps you should follow to achieve the desired result before looking at the code, and to look at the code only after you have gone through that thinking process yourself, so it becomes easier to spot anything that is a bit “off”.

I fixed the issue! Thank you!
