C1W1 - frequency extraction discrepancy between explanation and implementation

In C1W1, the lecture shows that a tweet's positive and negative frequencies are calculated from the positive and negative frequencies of the corresponding unique words in the sentence.
The following tweet is given:
‘I am sad, I am not learning NLP’
The positive frequency is calculated as 8, but ‘I’ and ‘am’ each appear twice, so if repeats were counted the frequency would be more than 8.
Uniqueness is only considered in this part of the lecture. In the coding session, uniqueness is not considered at all. This is also noticeable in the graded programming assignment, where we get different prediction results for the tweets ‘Great!’ and ‘Great, great’.
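
To make the discrepancy concrete, here is a minimal sketch of the two ways of summing. The frequency table, the `tokenize` helper, and the `positive_sum` function are all hypothetical: the counts are toy values chosen so that the unique-word sum comes out to 8, and none of this is the assignment's actual code.

```python
# Hypothetical toy frequency table keyed by (word, label):
# label 1 = positive class, label 0 = negative class.
freqs = {
    ("i", 1): 3, ("am", 1): 3, ("happy", 1): 2, ("because", 1): 1,
    ("learning", 1): 1, ("nlp", 1): 1,
    ("i", 0): 3, ("am", 0): 3, ("learning", 0): 1, ("nlp", 0): 1,
    ("sad", 0): 2, ("not", 0): 1,
}


def tokenize(tweet):
    """Very rough tokenizer (an assumption): lowercase, drop ',' and '!', split on spaces."""
    return tweet.lower().replace(",", " ").replace("!", " ").split()


def positive_sum(tweet, unique):
    """Sum the positive-class counts of the tweet's words.

    unique=True  -> each distinct word contributes once (as in the lecture slide).
    unique=False -> every occurrence contributes (repeats are counted again).
    """
    words = tokenize(tweet)
    if unique:
        words = set(words)
    # Words missing from the table contribute 0.
    return sum(freqs.get((word, 1), 0) for word in words)


tweet = "I am sad, I am not learning NLP"
unique_total = positive_sum(tweet, unique=True)
repeated_total = positive_sum(tweet, unique=False)
print(unique_total, repeated_total)
```

With these toy counts, the two totals differ only because ‘I’ and ‘am’ are each counted a second time in the non-unique version.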

Hi @Tahir4,

Could you please provide a link to the lecture video you are referring to?

Hi NLP Mentor,

I have a similar question about the word count for positive and negative. I don’t understand why “happy” and “because” are not counted for the positive tweets.

Course link:

Also, in the formula Xm = [1, sum(freqs(w, 1)), sum(freqs(w, 0))], could you explain what the parameters 1 and w mean? (Does the 1 represent one input string?)

A similar question also applies to another video in C1: how are the words chosen and counted?

Course link

Thank you so much,
Ada

It’s been a long time since I watched that lecture, so I may be missing some context here, but I think the point is that “happy” and “because” are in the vocabulary but are not in the tweet being processed there, which is “I am sad, I am not learning NLP”.
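
Here is a rough sketch of that idea, assuming a toy vocabulary and made-up counts (the `freqs` table, `vocabulary` list, and `extract_features` function are hypothetical and not the course's code). It builds a feature vector of the form [1, positive sum, negative sum] for one tweet, where the leading 1 is the bias term and w ranges over the unique words of that tweet only.

```python
# Hypothetical toy counts keyed by (word, label); 1 = positive, 0 = negative.
freqs = {
    ("i", 1): 3, ("am", 1): 3, ("happy", 1): 2, ("because", 1): 1,
    ("learning", 1): 1, ("nlp", 1): 1,
    ("i", 0): 3, ("am", 0): 3, ("learning", 0): 1, ("nlp", 0): 1,
    ("sad", 0): 2, ("not", 0): 1,
}
# Assumed vocabulary: every word seen anywhere in the (toy) corpus.
vocabulary = ["i", "am", "happy", "because", "learning", "nlp", "sad", "not"]


def extract_features(tweet):
    """Return [bias, positive sum, negative sum] over the tweet's unique words."""
    words = set(tweet.lower().replace(",", " ").split())
    pos = sum(freqs.get((w, 1), 0) for w in words)
    neg = sum(freqs.get((w, 0), 0) for w in words)
    return [1, pos, neg]  # the leading 1 is the bias term


tweet = "I am sad, I am not learning NLP"
tweet_words = set(tweet.lower().replace(",", " ").split())

features = extract_features(tweet)
# Vocabulary words that never enter the sums because the tweet does not contain them.
skipped = [w for w in vocabulary if w not in tweet_words]
print(features, skipped)
```

In this sketch “happy” and “because” have non-zero positive counts in the table, but they are skipped because the sums only run over words that appear in the tweet itself.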