UNQ_C11 Problem in understanding start_with variable

I am having difficulty understanding the role of the start_with variable in the UNQ_C11 function. Can someone explain it, if possible with an example?


Hey @Rvi,
If you take a close look at the notebook, you will find that an example is given just after the code cell. Let me quote it here for your reference.

previous_tokens = ["i", "like"]
tmp_suggest1 = suggest_a_word(previous_tokens, unigram_counts, bigram_counts, unique_words, k=1.0)
print(f"The previous words are 'i like',\n\tand the suggested word is `{tmp_suggest1[0]}` with a probability of {tmp_suggest1[1]:.4f}")

# test your code when setting the starts_with
tmp_starts_with = 'c'
tmp_suggest2 = suggest_a_word(previous_tokens, unigram_counts, bigram_counts, unique_words, k=1.0, start_with=tmp_starts_with)
print(f"The previous words are 'i like', the suggestion must start with `{tmp_starts_with}`\n\tand the suggested word is `{tmp_suggest2[0]}` with a probability of {tmp_suggest2[1]:.4f}")

In both examples, previous_tokens is ["i", "like"]. In the first example, start_with = None, so the output is the most probable word overall, which is "a" in this case, with a probability of 0.2727. In the second example, start_with = 'c', so the function returns the most probable word that starts with "c", which is "cat" in this case, with a probability of 0.0909. Note that "cat" is not the most probable word overall; it is only the most probable among the words that start with "c". Let us know if this helps.
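If it helps to see the filtering step in isolation, here is a minimal sketch. It is not the notebook's solution: pick_most_probable and toy_probabilities are hypothetical names, and the dictionary stands in for the probabilities that suggest_a_word would compute from the counts. Only the values for "a" and "cat" come from the output above; the other entries are made up for illustration.

```python
def pick_most_probable(probabilities, start_with=None):
    """Return (word, probability) for the most probable candidate.

    If start_with is given, words that do not begin with it are skipped,
    so the maximum is taken only over the remaining words.
    """
    suggestion, max_prob = None, 0.0
    for word, prob in probabilities.items():
        # Skip candidates that do not match the requested prefix.
        if start_with is not None and not word.startswith(start_with):
            continue
        if prob > max_prob:
            suggestion, max_prob = word, prob
    return suggestion, max_prob


# Toy probabilities for the next word after "i like" (assumed values,
# except "a" and "cat", which match the notebook output quoted above).
toy_probabilities = {"a": 0.2727, "cat": 0.0909, "dog": 0.0800, "car": 0.0500}

print(pick_most_probable(toy_probabilities))                  # no prefix constraint
print(pick_most_probable(toy_probabilities, start_with="c"))  # only words starting with "c"
```

With no prefix, every word competes and "a" wins. With start_with="c", both "a" and "dog" are skipped before the comparison, so the winner is chosen between "cat" and "car" only.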

