C4W1 - stuck in Ex6

Hi all, I’m stuck in Ex6. All the previous functions run fine, but this one fails, so I don’t think I’m carrying over an earlier error; it must come from this exercise.
I’ve tried to find the error but can’t see it. Could someone look at my notebook and help? Thanks!

This is the error I get:

Hi @Natalia_Tenorio_Maia

The error suggests that the test function expects a tuple of (int, float), which is what next_symbol() returns:

    ### END CODE HERE ###    
    return symbol, float(log_probs[symbol])

So the symbol should be of type int. Make sure that after calling tl.logsoftmax_sample(.., temperature=..), you wrap the result in int().
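As a rough illustration of why the int() wrap matters, here is a minimal sketch. It assumes the sampler hands back a NumPy integer; the value `sampled` below is a hypothetical stand-in, not the real output of tl.logsoftmax_sample.

```python
import numpy as np

# Hypothetical stand-in for a sampled token id. In the assignment this
# value would come from tl.logsoftmax_sample(...); here it is made up.
sampled = np.int64(5)

# A NumPy integer is not a Python int, so a strict type check can fail.
print(isinstance(sampled, int))

# Wrapping in int() converts it to a plain Python int.
symbol = int(sampled)
print(isinstance(symbol, int))
```

A type check such as isinstance(symbol, int) in a test would reject the unwrapped NumPy value even though it prints the same number.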

The problem could also lie in the trickiest part of the exercise, the log_probs. The hint says:

The log probabilities output will have the shape: (batch size, decoder length, vocab size). It will contain log probabilities for each token in the cur_output_tokens plus 1 for the start symbol introduced by the ShiftRight in the preattention decoder. For example, if cur_output_tokens is [1, 2, 5], the model will output an array of log probabilities each for tokens 0 (start symbol), 1, 2, and 5. To generate the next symbol, you just want to get the log probabilities associated with the last token (i.e. token 5 at index 3). You can slice the model output at [0, 3, :] to get this. It will be up to you to generalize this for any length of cur_output_tokens.

In other words, the output from the model has shape (batch size, decoder length, vocab size) and you need to work out the index for each of these dimensions. Hint: for the batch size it is 0, for the decoder length it should be your token_length (before padding), and for the vocab size you should take all the values (:).
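The indexing described in the hint can be sketched with plain NumPy. This is only an illustration on a fake array: `output`, `vocab_size`, and `padded_length` are made-up stand-ins, and the real values come from the model in the assignment.

```python
import numpy as np

# Illustrative sizes only; the real model has a much larger vocabulary.
batch_size, vocab_size = 1, 10
cur_output_tokens = [1, 2, 5]
token_length = len(cur_output_tokens)  # length before padding

# Fake model output with shape (batch size, decoder length, vocab size).
# Position 0 belongs to the start symbol added by ShiftRight,
# positions 1..3 belong to tokens 1, 2 and 5.
padded_length = 4
rng = np.random.default_rng(0)
output = rng.normal(size=(batch_size, padded_length, vocab_size))

# Batch index 0, decoder position token_length, all vocabulary entries.
log_probs = output[0, token_length, :]
print(log_probs.shape)
```

For cur_output_tokens of length 3 this picks index 3, matching the [0, 3, :] slice in the hint, and it generalizes because token_length tracks the current number of tokens.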

If both of the above are correct, you could private message me your Assignment notebook and I would take a look at it.

Cheers

Hello- I am having a somewhat similar problem to the poster above.

Here is my output from the test function, including some additional print statements at the end of my next_symbol function that show that the function is indeed producing a tuple, of length 2, with an int and a float, and whose values match the expected output from the test function (as seen in the original poster’s image):

Despite this, the test function is still saying that the output is not of the right type?

Any help with this would be greatly appreciated!

Hi @Guy_Cooper

Do not change anything outside:

    ### START CODE HERE ###
...
    ### END CODE HERE ###  

I can see from your screenshot that you changed what the function should return:

    ### END CODE HERE ###    
    return symbol, float(log_probs[symbol])

Note:
Before submitting your assignment to the AutoGrader, please make sure you are not doing the following:

  1. You have not added any extra print statement(s) in the assignment.
  2. You have not added any extra code cell(s) in the assignment.
  3. You have not changed any of the function parameters.
  4. You are not using any global variables inside your graded exercises. Unless specifically instructed to do so, please refrain from it and use the local variables instead.
  5. You are not changing the assignment code where it is not required, like creating extra variables.

Cheers

Hi @arvyzukai

Thanks so much for the quick reply - I am very grateful for the help!

Sorry, I changed that part of the code while trying to debug the function. I get the same error message without the changes:

Is your symbol variable of int type?

Hi @arvyzukai

Yes. The commented out print statements show that the output of the function is a tuple of length two, where the first element (symbol) is of class ‘int’ and the second element is a ‘float’. These print outputs can be seen in the image from my first post.

Thanks for the help!

Hi @arvyzukai

I was able to solve the issue by looking at examples on another thread related to this function. I think the problem arose from a funny way that I was shaping the padded_with_batch array.

Thank you very much for your time and help!

G

Hi @Victor_Luu

Some minor mistakes:

  • In # UNQ_C2 you forgot to pass the correct mode to tl.ShiftRight
  • In # UNQ_C3 mask preferably should use !=

Bigger mistakes: you use the global variable model instead of NMTAttn, and you also use a print statement (read the important points at the top of the assignment):

  • In # UNQ_C6 you defined padded_with_batch incorrectly. Hints: use np.array() to convert the padded variable from a list to a numpy array, then add an empty first dimension with [None, :]

Please remove your solution notebook from this forum, because it is against the rules.

Thanks, @arvyzukai . I will try these tips.

I’m not able to pass the tests for UNQ_C6. I have passed the earlier UNQ cells and followed the instructions for defining log_probs. However, I get a different output token and probability than expected. Please help. @arvyzukai

For future readers, please refer to this post if you’re stuck.

np.expand_dims(a, axis=0) works for me.
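For anyone comparing the two ways of adding the batch dimension mentioned in this thread, here is a small sketch. The list `padded` is a hypothetical example, not the assignment's actual data.

```python
import numpy as np

# Hypothetical padded token list; the real one comes from the exercise.
padded = [1, 2, 5, 0]

# Option 1: convert to an array, then index with None to add a leading axis.
a = np.array(padded)[None, :]

# Option 2: np.expand_dims with axis=0 does the same thing.
b = np.expand_dims(np.array(padded), axis=0)

print(a.shape, b.shape)
```

Both give an array of shape (1, 4), so either should work as long as the result has a single leading batch dimension.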