I hit a problem when I try to finish this block:
def next_symbol(NMTAttn, input_tokens, cur_output_tokens, temperature):
Here’s my code. I’m not sure where it went wrong, but it keeps saying "Output must be a tuple of size 2 containing a integer and a float number." I think it’s because of the model output line:
log2 will return a float value; you want to take np.ceil of that float.
{Moderator’s Edit: Solution Code Removed}
You want to fetch the log probabilities of the last token, hence -1 in the second index. The first index is for the batch; since there’s only one batch, we can pass 0 there.
Type casting to int in these two lines:
{Moderator’s Edit: Solution Code Removed}
Ideally these shouldn’t be a problem, but I was getting errors without this.
This is just my understanding of this week’s material; feel free to correct me if I went wrong somewhere. Hope this helps.
I’m having the same issue. The problem is that my code doesn’t even seem to be executed. I tried adding some prints inside, and even raising an exception, but I still see no output other than:
Output must be a tuple of size 2 containing a integer and a float number
Expected output: [7283, -9.929085731506348]
I struggled with this for nearly a couple of hours. Instead of calling model, try NMTAttn. I was so used to calling model(params); switching to NMTAttn fixed the issue for me.
Hi, I have implemented the next_symbol function as advised, and it still doesn’t work. Can you give me advice on how to solve it? Thank you very much. Charles.
Hi, I have finally found the problem. A very silly one: I had forgotten the tl. in front of logsoftmax_sample, and I put the keyword shape in front of (1, int(padded_length)), which generated another error.
This has to do with the notebook setup. The only way to run the function is by running the unit tests block. In the unit tests, the execution is wrapped in a try/except block. As a result, if your code throws any error (before your print statement), only the standard except-block output is printed, not the error thrown by your function.
As this makes debugging very difficult, I would suggest that the course organisers add a cell to run the code outside of a try/except block.
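In the meantime, a sketch like the one below can surface the real traceback. This is not assignment code: model_instance and my_input_tokens are just placeholders for whatever your notebook actually defines.

```python
# Rough sketch: call the function directly in a fresh cell, outside the unit
# test's try/except, so any exception shows its full traceback.
# `model_instance` and `my_input_tokens` are placeholders for your own variables.
token_id, log_prob = next_symbol(model_instance, my_input_tokens, [], 0.0)
print(token_id, log_prob)
```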
Thanks a lot to all of you for your contributions to this thread. As you all highlighted, and as I experienced myself, the test case for this function is implemented using try-except, which makes the function pretty difficult to debug, so I will be raising an issue for this right away. Additionally, since this thread contains solution code, which is against the community guidelines, I will be removing all the solution code from the thread, summarizing all the possible mistakes and suggestions to fix them, and closing the thread, so that future learners can refer to it while the necessary changes are made.
Hey Learners,
If you are facing an issue while debugging next_symbol, for instance your print statements are not working, this is because of the way test_next_symbol is implemented in the w1_unittest.py file. The implementation uses try-except clauses, which suppress (if there is an error) any print statements you add to the function while trying to debug it. We have created an issue for this, and the team is working on finding a work-around. Once it is fixed, we will make sure to update this thread. In the meantime, you can refer to the possible suggestions below, which summarize the discussion up to this point in the thread.
Possible Mistakes
Take special care with how you obtain padded_length from token_length. Create a new code cell and check whether your code produces the following results:
token_length = 8; padded_length = 16
token_length = 9; padded_length = 16
In short, it is the exponent that must be rounded off using the ceil function, not the final output. Additionally, make sure that padded_length is an integer for the next steps; if not, use int() to convert it into one.
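If you want a quick way to test just this step, a sketch like the one below can serve as that check cell. The helper name compute_padded_length is made up, and the formula inside it is only one expression that reproduces the two cases above, so swap in your own computation:

```python
import numpy as np

def compute_padded_length(token_length):
    # One formula consistent with the two cases listed above: round the exponent
    # up with np.ceil, then cast to int. Replace this body with your own code.
    return int(2 ** int(np.ceil(np.log2(token_length + 1))))

for token_length, expected in [(8, 16), (9, 16)]:
    assert compute_padded_length(token_length) == expected
print("padded-length check passed")
```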
Make sure that when you call the model, you don’t use model, since that name is only used in the markdown and doesn’t denote any parameter passed to the function. You must use NMTAttn, which is a parameter of the function.
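A generic illustration of the same pitfall, with nothing from the assignment in it:

```python
# Inside a function, call the object you were passed as a parameter,
# not a similarly named global mentioned elsewhere in the text.
def run(model_fn, x):
    return model_fn(x)   # uses the parameter `model_fn`, never a global `model`

print(run(lambda v: v * 2, 21))   # 42
```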
When you index output to get the log_probs, you need to get the log probabilities of the last token. Note that the last token is denoted by 3 only in the markdown; that is not true for every example. Think about how you can access the last element of any list in Python (hint: the length of the list is given to you).
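A tiny generic Python example of what the hint is pointing at:

```python
# The last element of a sequence can be reached either via its length
# or with index -1; both refer to the same element.
scores = [0.1, 0.4, 0.7, 0.9]
assert scores[len(scores) - 1] == scores[-1] == 0.9
```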
Note that logsoftmax_sample is not a custom function that is pre-loaded into the notebook. It is a function from trax.layers, so you must call it with the tl. prefix.
Additionally, when specifying the temperature for the logsoftmax_sample function, don’t hard-code it; pass temperature, which is a parameter of the function.
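For reference, a minimal standalone sketch of calling it, assuming trax is installed; the log_probs vector here is made up and has nothing to do with the assignment’s model output:

```python
import numpy as np
from trax import layers as tl

# Sketch only: sample a symbol id from a vector of log probabilities.
# The temperature is passed through as a variable, not hard-coded in the call.
log_probs = np.log(np.array([0.1, 0.2, 0.7]))
temperature = 0.0                      # stands in for the function's parameter
symbol = int(tl.logsoftmax_sample(log_probs, temperature))
print(symbol)                          # with temperature 0.0 this is the argmax, i.e. 2
```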