My first question is: how do I use ‘freqs’ to compute N_pos and N_neg, when it seems counterintuitive to access the tuple item in a dictionary? Should I be using the y labels instead? The videos were amazing, but the freqs part of the assignment is throwing me off.
My second question is about V: it is supposed to be the count of every unique word, correct? I got lost after there was an indication that Younes said something wrong in the video.
I don’t see how using train_y would make it easier to compute N_neg, given that we have the freqs dictionary pre-computed. Let’s say we adopt the approach you mentioned. For each tweet, we first find the label. Then we have to process the tweet (which we have already done once to compute the freqs dictionary), iterate over its words, and increment each word’s associated +ve or -ve frequency by 1.
But since we already have the freqs dictionary, the task is just to iterate over its entries, which seems easier to me. Please remember that N_pos and N_neg are the counts of positive and negative words respectively, not the counts of positive and negative tweets. Let me know your opinion on this.
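To make the idea concrete, here is a minimal sketch of summing word counts straight from a freqs-style dictionary. It assumes the keys are (word, label) tuples with label 1 for positive and 0 for negative, and the toy data below is made up for illustration; it is not the assignment's actual dictionary or the official solution.

```python
# Toy freqs dictionary (illustrative data only).
# Assumed layout: key = (word, label), value = how often the word
# appeared in tweets with that label (1 = positive, 0 = negative).
freqs = {
    ("happy", 1): 3,
    ("sad", 0): 2,
    ("movie", 1): 1,
    ("movie", 0): 4,
}

N_pos = 0  # total count of words seen in positive tweets
N_neg = 0  # total count of words seen in negative tweets

# Unpack the tuple key directly in the loop header, so there is no
# need to index into the key by position.
for (word, label), count in freqs.items():
    if label > 0:
        N_pos += count
    else:
        N_neg += count

print(N_pos, N_neg)
```

Unpacking the key as (word, label) in the loop is what removes the awkwardness of accessing a tuple item: the label comes out as a plain variable, and train_y is never needed.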
Yes, you are correct. V represents the number of unique words in our vocabulary.
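As a rough illustration, V can be read off the same kind of dictionary by collecting the distinct words from the keys. The sketch below again assumes (word, label) tuple keys and uses made-up toy data, so treat it as one possible way to do it rather than the expected solution.

```python
# Toy freqs dictionary (illustrative data only), keyed by (word, label).
freqs = {
    ("happy", 1): 3,
    ("sad", 0): 2,
    ("movie", 1): 1,
    ("movie", 0): 4,
}

# A word can appear under both labels, so collect words in a set
# to count each one only once.
vocab = {word for (word, label) in freqs}
V = len(vocab)

print(V)
```

Here "movie" appears with both labels but contributes only once, which is why len(freqs) would overcount the vocabulary.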
Thank you. Apparently I completely missed the later lines while blindly looking for ways to work around it myself. I have now solved this with the help of your comment and a few other forums. Thank you so much!