C2_W3_lecture_nb_03_oov - smoothing

younglee · June 30, 2023, 3:24am

I want to suggest some changes in the code for “smoothing” part in the lab notebook as below. I would be happy to hear any thoughts.

First, I think there is an unintended typo in the function name: the ‘h’ is missing from “smoothing”. I suggest renaming the function:
add_k_smooting_probability → add_k_smoothing_probability

Second, in the code below from the lecture notebook

trigram_probabilities = {('i', 'am', 'happy') : 2}
bigram_probabilities = {( 'am', 'happy') : 10}

I think it makes sense to change the bigram to ('i', 'am'), which is the first two words of the trigram:

trigram_probabilities = {('i', 'am', 'happy') : 2}
bigram_probabilities = {( 'i', 'am') : 10}

Third, in the code that computes probability_unknown_trigram, I think passing bigram_probabilities[('i', 'am')] as the n_gram_prefix_count argument makes more sense:

probability_unknown_trigram = add_k_smoothing_probability(k, vocabulary_size,
    n_gram_count=0, n_gram_prefix_count=bigram_probabilities[('i', 'am')])
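To make the third suggestion concrete, here is a minimal sketch. The function signature follows the call above; the body is the textbook add-k formula and may differ in detail from the notebook. The dictionary names `trigram_counts` and `bigram_counts` and the values `k = 1` and `vocabulary_size = 5` are my own assumptions for illustration.

```python
def add_k_smoothing_probability(k, vocabulary_size, n_gram_count, n_gram_prefix_count):
    """Textbook add-k estimate: (count + k) / (prefix_count + k * V)."""
    numerator = n_gram_count + k
    denominator = n_gram_prefix_count + k * vocabulary_size
    return numerator / denominator

# Assumed toy values, not necessarily those used in the notebook.
trigram_counts = {('i', 'am', 'happy'): 2}
bigram_counts = {('i', 'am'): 10}  # prefix = first two words of the trigram
vocabulary_size = 5
k = 1

# Trigram that was seen in the corpus.
probability_known_trigram = add_k_smoothing_probability(
    k, vocabulary_size,
    n_gram_count=trigram_counts[('i', 'am', 'happy')],
    n_gram_prefix_count=bigram_counts[('i', 'am')])

# Trigram with the same prefix that was never seen.
probability_unknown_trigram = add_k_smoothing_probability(
    k, vocabulary_size,
    n_gram_count=0,
    n_gram_prefix_count=bigram_counts[('i', 'am')])

print(probability_known_trigram, probability_unknown_trigram)
```

With these assumed numbers the seen trigram gets (2 + 1) / (10 + 5) and the unseen one gets (0 + 1) / (10 + 5), so both share the same denominator because they share the same prefix.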

arvyzukai · July 3, 2023, 9:12am

Hi @younglee

Great job spotting the typo. I will submit it for fixing.

Regarding the bigram: no, on the contrary, we want to predict the next word. For example, if we have a sentence “i am happy ___to___ learn” and we do not have the ('i', 'am', 'happy') trigram in our table, then we should use the ('am', 'happy') bigram instead, not ('i', 'am'), since we are trying to predict the word “___to___” (or other words, for that matter).

Cheers

younglee · July 6, 2023, 4:31am

Hi @arvyzukai

Thank you for the explanation, but I am confused.

First, I want to clarify that my question is about the add-k smoothing method, not the back-off or interpolation methods.

My understanding is that the purpose of the add_k_smoothing_probability() function is to compute the probability of observing the trigram ('i', 'am', 'happy') conditioned on having observed the bigram ('i', 'am'). So here 'happy' is the next word of interest, and the function computes P(('i', 'am', 'happy') | ('i', 'am')), even when the trigram ('i', 'am', 'happy') does not exist in the training corpus.
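Written out, the quantity described above is the usual add-k estimate. The notation is my own assumption: C(·) is a count in the training corpus, k is the smoothing constant, and |V| is the vocabulary size.

```latex
P(\text{happy} \mid \text{i}, \text{am})
  = \frac{C(\text{i}, \text{am}, \text{happy}) + k}{C(\text{i}, \text{am}) + k\,|V|}
```

When the trigram is unseen, C(i, am, happy) = 0 and the estimate reduces to k / (C(i, am) + k|V|), which is still greater than zero.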

Do I correctly understand the purpose of the function add_k_smoothing_probability()?

arvyzukai · July 9, 2023, 9:50am

Hi @younglee

Oh, in that case I think you’re correct. But I cannot verify it right now, since I’m on vacation and checking code on a phone is awkward. I’ll be back in a week. In the meantime, maybe someone else can clarify the situation?

Elemento · July 12, 2023, 5:11am

Hey @younglee,

Yes, you are correct in this. Also, note that the function is used not only when a trigram is absent from the corpus, but also when it is present. If we apply add-k smoothing to estimate the probability of one trigram, we must apply it to all the other trigrams with the same bigram prefix as well. Only then can we compare two trigrams and say which one is more likely given the same bigram.
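A small sketch of that comparison, with toy counts that I made up for illustration (the vocabulary, the counts, and `k = 1` are all assumptions, and the function body is the textbook add-k formula rather than necessarily the notebook's code):

```python
def add_k_smoothing_probability(k, vocabulary_size, n_gram_count, n_gram_prefix_count):
    """Textbook add-k estimate: (count + k) / (prefix_count + k * V)."""
    return (n_gram_count + k) / (n_gram_prefix_count + k * vocabulary_size)

# Hypothetical toy corpus statistics.
vocabulary = ['i', 'am', 'happy', 'sad', 'learn']
trigram_counts = {('i', 'am', 'happy'): 2, ('i', 'am', 'sad'): 1}
prefix = ('i', 'am')
prefix_count = 3  # equals the sum of the trigram counts for this prefix
k = 1

# Smooth every candidate continuation, seen or unseen, in the same way.
probabilities = {
    word: add_k_smoothing_probability(
        k, len(vocabulary),
        n_gram_count=trigram_counts.get(prefix + (word,), 0),
        n_gram_prefix_count=prefix_count)
    for word in vocabulary
}

best_word = max(probabilities, key=probabilities.get)
print(probabilities)
print(best_word, sum(probabilities.values()))
```

Because every candidate is smoothed with the same k and the same prefix count, the values form a proper distribution over the vocabulary and can be compared directly.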

As for the suggested changes, yes, you are indeed correct. Thanks a lot for pointing out these discrepancies. Let me raise an issue to get them fixed.

Cheers,
Elemento

Elemento · July 12, 2023, 12:10pm

Hey @younglee,
The discrepancies have been fixed. Once again, thanks a lot for pointing them out.

Cheers,
Elemento

younglee · July 14, 2023, 2:09am

@Elemento Thank you!
