Failing UNQ_C6 but passing UNQ_C7 - 7283 is token for 'Halllegacy'?

UNQ_C6 is giving:

Expected output:  [7283, -9.929085731506348]
 1  Tests passed
 1  Tests failed

I added some print statements and get:

input_tokens = [[17332   140   172   207     1]]
input = hello world!
symbol = 18477, log_probs[symbol] = -0.38603782653808594, log_probs[7283] = -15.73045825958252
input_tokens = [[17332   140   172   207     1]]
input = hello world!
symbol = 140, log_probs[symbol] = -0.0002193450927734375, log_probs[7283] = -25.069293975830078
Expected output:  [7283, -9.929085731506348]
 1  Tests passed
 1  Tests failed

UNQ_C7 passes so I tried digging a bit deeper:

>>> sampling_decode("hello world", NMTAttn=model, temperature=0.0, vocab_file=VOCAB_FILE, vocab_dir=VOCAB_DIR)
([18477, 140, 290, 1], -0.07449150085449219, 'Hallo Welt')

So then I substituted 7283 for 140…

>>> detokenize([18477, 7283, 290, 1], vocab_dir=VOCAB_DIR, vocab_file=VOCAB_FILE)
'Halllegacy Welt'

This looks like the token map is out of sync with the exercise - “Hallo Welt” is the correct translation.

Do I need to refresh something?

Thanks - Jim

Hi @sfjac

Have you solved the issue? If not, you can private-message me your Assignment notebook and I will take a look.

Hi arvyzukai,

I submitted the assignment and scored 90, so I'm not blocked, but the issue is obviously still happening. It may be in the data, though, and that's a bit big to attach.

Thanks - Jim

Hi @sfjac

You incorrectly implemented the padded_length calculation in UNQ_C6. I private-messaged you suggestions.

Please remove your Assignment notebook from your last post.

Thank you. Cheers.

Will try these out. How do I remove an upload? (Will removing it from the markdown delete it from being uploaded?)

That was a dumb error, and I'm somewhat surprised it didn't matter later, though it probably only made things less efficient since I wasn't padding all the way up to the next power of 2. For the test right after UNQ_C6 it wouldn't have mattered anyway: that test only calls next_symbol with 0 and 1 input tokens, and for those lengths my mistaken line gave 1 and 2, which are the correct padded lengths. I've since corrected the padding calculation and, not surprisingly, still get the same test failure:
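For reference, a common way to compute that padding is the smallest power of two that leaves room for token_length + 1 tokens. This is a sketch of that calculation only; the exact formula and names in the assignment may differ:

```python
import numpy as np

def padded_length(token_length):
    # Smallest power of 2 with room for token_length + 1 tokens
    # (the current output plus the next symbol to be generated).
    return 2 ** int(np.ceil(np.log2(token_length + 1)))

# For 0 and 1 current output tokens this gives 1 and 2, which is
# why a subtly wrong formula can still pass the shortest test cases.
print(padded_length(0))  # 1
print(padded_length(1))  # 2
print(padded_length(4))  # 8
```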

Expected output:  [7283, -9.929085731506348]
 1  Tests passed
 1  Tests failed

Hi @sfjac

You should be able to edit your post and delete the lines with your Assignment notebook.

Also, I private-messaged you the second mistake: you use the global variable model instead of the passed parameter NMTAttn.
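For anyone else hitting the same failure, the pitfall looks roughly like this (hypothetical names, not the actual assignment code): a notebook-global happens to exist, so the function works interactively but ignores whatever the grader passes in.

```python
# A global that happens to exist in the notebook namespace.
model = "globally-defined model"

def next_symbol_buggy(NMTAttn, tokens):
    # BUG: silently ignores the NMTAttn parameter and uses the global.
    return model, tokens

def next_symbol_fixed(NMTAttn, tokens):
    # Correct: use the parameter that was actually passed in.
    return NMTAttn, tokens

# The bug only surfaces when the caller passes a different model object:
test_model = "test-specific model"
print(next_symbol_buggy(test_model, [1])[0])  # globally-defined model
print(next_symbol_fixed(test_model, [1])[0])  # test-specific model
```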

Cheers