C4W3_Assignment Assignment3 Error

I am unable to get this simple function working:
Exercise 3 - Implement the question answering function.
Can anyone here help me with this last exercise of the course NLP with Attention Models?
(Notebook: NLP with Attention Models, week 3.)

Hi @m_hassan

Here are the steps of answer_question for the “Expected Output:” cell example, where “question” is:

question: When was the Chechen-Ingush Autonomous Soviet Socialist Republic transferred from the Georgian SSR? context: On January 9, 1957, Karachay Autonomous Oblast and Chechen-Ingush Autonomous Soviet Socialist Republic were restored by Khrushchev and they were transferred from the Georgian SSR back to the Russian SFSR.

And “answer” is:

answer: January 9, 1957

Steps you have to implement (and the intermediate values that you can check against):

QUESTION SETUP

Step 1:
# Tokenize the question
print(tokenized_question)
<tf.Tensor: shape=(79,), dtype=int32, numpy=
array([  822,    10,   366,    47,     8,  2556,  1559,    18,  1570,
         122,  8489,  2040,  3114,  1162, 12873,  2730,   343,  5750,
       10250,    45,     8,  5664,    29,   180,  6857,    58,  2625,
          10,   461,  1762,  9902, 24011,     6, 17422,  3441,    63,
        2040,  3114,  1162,   411, 21234,    11,  2556,  1559,    18,
        1570,   122,  8489,  2040,  3114,  1162, 12873,  2730,   343,
        5750,   130, 13216,    57, 13495, 17363, 13847,    11,    79,
         130, 10250,    45,     8,  5664,    29,   180,  6857,   223,
          12,     8,  4263,     3,  7016,  6857,     5], dtype=int32)>

Step 2:
# Add an extra dimension to the tensor
print(tokenized_question)
tf.Tensor(
[[  822    10   366    47     8  2556  1559    18  1570   122  8489  2040
   3114  1162 12873  2730   343  5750 10250    45     8  5664    29   180
   6857    58  2625    10   461  1762  9902 24011     6 17422  3441    63
   2040  3114  1162   411 21234    11  2556  1559    18  1570   122  8489
   2040  3114  1162 12873  2730   343  5750   130 13216    57 13495 17363
  13847    11    79   130 10250    45     8  5664    29   180  6857   223
     12     8  4263     3  7016  6857     5]], shape=(1, 79), dtype=int32)
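Step 2 can be illustrated in isolation. This is a sketch with a short stand-in tensor (the real ids come from the tokenizer in step 1); tf.expand_dims turns the rank-1 id vector into a batch of one:

```python
import tensorflow as tf

# Stand-in for the tokenized question from step 1
tokenized_question = tf.constant([822, 10, 366], dtype=tf.int32)   # shape (3,)

# Step 2: add a batch dimension at axis 0
tokenized_question = tf.expand_dims(tokenized_question, 0)         # shape (1, 3)
print(tokenized_question.shape)  # (1, 3)
```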

Step 3:
# Pad the question tensor
print(padded_question)
[[  822    10   366    47     8  2556  1559    18  1570   122  8489  2040
   3114  1162 12873  2730   343  5750 10250    45     8  5664    29   180
   6857    58  2625    10   461  1762  9902 24011     6 17422  3441    63
   2040  3114  1162   411 21234    11  2556  1559    18  1570   122  8489
   2040  3114  1162 12873  2730   343  5750   130 13216    57 13495 17363
  13847    11    79   130 10250    45     8  5664    29   180  6857   223
     12     8  4263     3  7016  6857     5     0     0     0     0     0
      0     0     0     0     0     0     0     0     0     0     0     0
      0     0     0     0     0     0     0     0     0     0     0     0
      0     0     0     0     0     0     0     0     0     0     0     0
      0     0     0     0     0     0     0     0     0     0     0     0
      0     0     0     0     0     0     0     0     0     0     0     0
      0     0     0     0     0     0]]
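Step 3 works the same way at any scale. Here is a sketch with a short stand-in sequence and a small encoder_maxlen (the notebook's real value is larger); zeros are appended up to the fixed encoder length:

```python
import tensorflow as tf

encoder_maxlen = 10  # illustrative stand-in for the notebook's constant
tokenized_question = [[822, 10, 366, 47, 8]]  # batched ids, as after step 2

# Step 3: zero-pad at the end up to encoder_maxlen
padded_question = tf.keras.preprocessing.sequence.pad_sequences(
    tokenized_question, maxlen=encoder_maxlen, padding='post')
print(padded_question)
# [[822  10 366  47   8   0   0   0   0   0]]
```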

ANSWER SETUP

Step 4:
# Tokenize the answer
# Hint: All answers begin with the string "answer: "
print(tokenized_answer)
tf.Tensor([1525   10], shape=(2,), dtype=int32)

Step 5:
# Add an extra dimension to the tensor
print(tokenized_answer)
tf.Tensor([[1525   10]], shape=(1, 2), dtype=int32)

Step 6:
# Get the id of the EOS token
print(eos)
tf.Tensor(1, shape=(), dtype=int32)
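For step 6, the EOS id is looked up from the tokenizer's vocabulary; with SentencePiece the end-of-sequence piece is "</s>". The lookup method named in the comment below is an assumption (check your notebook's tokenizer for the exact call):

```python
import tensorflow as tf

# Assumed lookup (method name may differ in your notebook):
# eos = tokenizer.string_to_id("</s>")
# For this assignment's vocabulary it resolves to id 1:
eos = tf.constant(1, dtype=tf.int32)
print(eos)  # tf.Tensor(1, shape=(), dtype=int32)
```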

Step 7:
# Loop for decoder_maxlen iterations

___First loop iteration (i=0):


Step 7.1:
    # Predict the next word using the model, the input document and the current state of output
print(next_word)
tf.Tensor([[1762]], shape=(1, 1), dtype=int32)

Step 7.2:
    # Concat the predicted next word to the output 
print(tokenized_answer)
tf.Tensor([[1525   10 1762]], shape=(1, 3), dtype=int32)

Step 7.3:
    # The text generation stops if the model predicts the EOS token

tf.Tensor([[False]], shape=(1, 1), dtype=bool)
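Steps 7.2 and 7.3 can be sketched with the values from this first iteration: append the predicted id to the running output, then compare it against the EOS id:

```python
import tensorflow as tf

eos = tf.constant(1, dtype=tf.int32)
tokenized_answer = tf.constant([[1525, 10]], dtype=tf.int32)
next_word = tf.constant([[1762]], dtype=tf.int32)

# Step 7.2: concatenate the predicted word onto the output
tokenized_answer = tf.concat([tokenized_answer, next_word], axis=-1)
print(tokenized_answer)  # tf.Tensor([[1525   10 1762]], shape=(1, 3), dtype=int32)

# Step 7.3: stop if the model predicted the EOS token
print(next_word == eos)  # tf.Tensor([[False]], shape=(1, 1), dtype=bool)
```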


___Second loop iteration (i=1):


Step 7.1:
    # Predict the next word using the model, the input document and the current state of output
print(next_word)
tf.Tensor([[9902]], shape=(1, 1), dtype=int32)

Step 7.2:
    # Concat the predicted next word to the output 
print(tokenized_answer)
tf.Tensor([[1525   10 1762 9902]], shape=(1, 4), dtype=int32)

Step 7.3:
    # The text generation stops if the model predicts the EOS token

tf.Tensor([[False]], shape=(1, 1), dtype=bool)


___ … other remaining loop iterations

(4 iterations in total; on the 4th the model predicts the EOS token and the loop stops with the final answer:)

print(tokenized_answer)
tf.Tensor([[ 1525    10  1762  9902 24011     1]], shape=(1, 6), dtype=int32)

Which stands for:

‘answer: January 9, 1957’
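The steps above can be tied together in one sketch. This is NOT the assignment's solution: `model` and `tokenizer` stand for the notebook's objects, `predict_next_word` is a hypothetical helper standing in for however the notebook obtains the next token (e.g. an argmax over the transformer's output), and `string_to_id("</s>")` is an assumed way to get the EOS id. Exact names and signatures may differ in your notebook:

```python
import tensorflow as tf

def answer_question_sketch(question, model, tokenizer,
                           encoder_maxlen=150, decoder_maxlen=50):
    # QUESTION SETUP (steps 1-3)
    q = tokenizer.tokenize(question)               # step 1: rank-1 id tensor
    q = tf.expand_dims(q, 0)                       # step 2: add batch dimension
    q = tf.keras.preprocessing.sequence.pad_sequences(
        q, maxlen=encoder_maxlen, padding='post')  # step 3: zero-pad

    # ANSWER SETUP (steps 4-6)
    ans = tokenizer.tokenize("answer: ")           # step 4
    ans = tf.expand_dims(ans, 0)                   # step 5
    eos = tokenizer.string_to_id("</s>")           # step 6 (assumed method)

    # DECODING LOOP (step 7)
    for _ in range(decoder_maxlen):
        next_word = model.predict_next_word(q, ans)  # step 7.1 (hypothetical)
        ans = tf.concat([ans, next_word], axis=-1)   # step 7.2
        if next_word == eos:                         # step 7.3
            break
    return tokenizer.detokenize(ans)
```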

Cheers

This is the error message I am encountering.

First question: are you using the updated version of the NLP Specialization? The updated version has a different comment on notebook line 14:

# Tokenize the question

and not:

# Tokenize question and context

Make sure you’re using the latest Course materials first.

Second, to address your error: the SentencePiece tokenizer does not have an encode method.

Look at the previous usages of the tokenizer in the notebook. You will see that we make use of tokenize, detokenize, and other methods, but never encode.

Cheers


I am working on the updated version of the NLP Specialization. @arvyzukai

I would advise refreshing the notebook (read the refresh instructions carefully and save your prior work first), because your code comments do not match the latest assignment (or have you changed the code comments yourself?)