Test vocab size mismatch for Exercise 1 Updated NER assignment

Hello @arvyzukai

My test vocab size does not match. For the get vocabulary step I used list(sentences); I also tried swapping in train_sentences and test_sentences, but I still didn’t get the desired output.

Sharing an image

Regards
DP

Hi @Deepti_Prasad

Were you following the instructions for:
# Define TextVectorization object with the appropriate standardize parameter

which are:

In this section, you will use tf.keras.layers.TextVectorization to transform the sentences into integers, so they can be fed into the model you will build later on.

Also note:

That said, you will use standardize = None so everything will just be split into single tokens and then mapped to a positive integer.

If you did, then to get the vocab size you can use a simple method

Let us know if that helps :slight_smile:

I have used both the instructions.

I had also already tried the vocab size link you sent; it gave me a different error: NameError: name 'get_vocabulary' is not defined. Are you by any chance telling me to use that token instruction in the get_vocabulary code?

Regards
DP

Ok, in the first step:
# Define TextVectorization object with the appropriate standardize parameter
you create a sentence_vectorizer (like in the instructions)

for the second step:
# Adapt the sentence vectorization object to the given sentences
you use the adapt method. Note that in TensorFlow it just “adapts” the object in place and does not return anything; in other words, you just replace the two None placeholders in None.adapt(None).

for the third step:
# Get the vocabulary
you use get_vocabulary, and this time it returns the value you need.
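
For anyone landing here later, the three steps can be sketched on toy data as below. This is only an illustration and not the assignment solution: the two example sentences are made up, and it assumes a TensorFlow 2.x install where tf.keras.layers.TextVectorization is available.

```python
import tensorflow as tf

# Toy stand-in for the assignment data (made up for illustration).
sentences = tf.constant(["Peter lives in Paris", "Paris is in France"])

# Step 1: define the layer. standardize=None keeps case and punctuation,
# so the text is only split on whitespace.
sentence_vectorizer = tf.keras.layers.TextVectorization(standardize=None)

# Step 2: adapt builds the vocabulary inside the layer; nothing is assigned from it.
sentence_vectorizer.adapt(sentences)

# Step 3: get_vocabulary is a method of the layer and returns a list of tokens.
vocab = sentence_vectorizer.get_vocabulary()
vocab_size = len(vocab)

print(vocab_size)
print(vocab[:2])
```

With the default integer output mode, the first two entries are the padding token '' and the out-of-vocabulary token '[UNK]', so the size here is two more than the six distinct words.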

Cheers

For the first step, the instructions say to use tf.keras.layers.TextVectorization with standardize set to None, so that is what I used.

Then in the second step I replaced only the two None’s with sentence_vectorizer and sentences.

For get vocabulary, the link you sent tells me to use get_vocabulary with the special_token, and when I did that I got the error I shared in my previous comment.

Regards
DP

For the third step you also use the sentence_vectorizer: call .get_vocabulary() on it, and that’s it. :slight_smile:
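
To illustrate why the NameError shows up, here is a small pure-Python sketch. ToyVectorizer is a hypothetical stand-in written for this example, not the TensorFlow class; it only shows the difference between calling a method on an object and calling the same name as a standalone function.

```python
class ToyVectorizer:
    """Hypothetical stand-in for a vectorization layer (not the TensorFlow class)."""

    def __init__(self):
        # Two reserved entries, mimicking padding and unknown-word tokens.
        self._vocab = ["", "[UNK]"]

    def adapt(self, sentences):
        # Updates internal state in place and implicitly returns None.
        for sentence in sentences:
            for token in sentence.split():
                if token not in self._vocab:
                    self._vocab.append(token)

    def get_vocabulary(self):
        return list(self._vocab)


vectorizer = ToyVectorizer()
adapt_result = vectorizer.adapt(["a b", "b c"])

# Calling the method name as if it were a standalone function fails,
# because no global called get_vocabulary exists.
try:
    get_vocabulary(vectorizer)
except NameError as err:
    error_message = str(err)

# Calling it on the object works.
vocab = vectorizer.get_vocabulary()
print(error_message)
print(vocab)
```

The same pattern applies to the real layer: the method lives on the object created in the first step, so it has to be called through that object.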

I am feeling like a fool :sob: :sob:

Thanks

Don’t be. Someone else will come across the same problem and will find the solution here.

Cheers

Ha ha, thank you for the solace :joy:

OK, now I am getting this message.

Is that message output normal?

Regards
DP

I’m no expert on TensorFlow :slight_smile:, but I think the:

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

at the start of the Assignment should have suppressed these messages…

Anyway, I’m not sure about this, but you’re probably OK with this warning (unless you used for loops somewhere in your code, or your train_sentences is not a Tensor or similar).
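
One thing that may be worth checking: as far as I know, TF_CPP_MIN_LOG_LEVEL only filters messages coming from TensorFlow's C++ backend, and it has to be set before TensorFlow is imported. Warnings raised from the Python side go through TensorFlow's Python logger instead. A minimal sketch, assuming the message in question comes from the Python logger (the screenshot isn't reproduced here, so that part is a guess):

```python
import logging
import os

# Must be set before TensorFlow is imported to take effect.
# '3' filters INFO, WARNING and ERROR messages from the C++ backend.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

import tensorflow as tf

# Python-side warnings use TensorFlow's Python logger, so its level
# is raised separately here.
tf.get_logger().setLevel(logging.ERROR)
```

Silencing warnings only hides them, so it is probably better to do this after confirming the warning is harmless for the exercise.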

You just love teasing me :rofl::joy:
