What is the difference from the result of the first week?

So, do I understand correctly that we did nearly the same thing as in the first week, but with another approach, one based on probabilities rather than on logistic functions?

So, what is the main difference between them?

One uses Logistic Regression and the other is based on Bayes Theorem.

Neither of these is very commonly used in real NLP applications these days. This is just the beginning of the NLP series, and they are starting by showing you some of the classic techniques. They don’t work that well, but (especially in the case of Bayes Theorem) they are very inexpensive from a compute standpoint, so you might find some corner cases where you could use them. But Younes explains the weaknesses in the lectures: you can take the exact same words and create completely different meanings simply by putting them in a different order. So the Bayes method, which is based only on the learned sentiment values for individual words, is not very useful. You really need more powerful techniques, which you will learn about soon, so stay tuned for Sequence and Attention Models, which are the way people really do NLP these days. They work far better, but are also a lot more expensive to train and run.
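
To make the word-order point concrete, here is a toy sketch (my own illustration, not code from the course; the per-word scores are made-up log-likelihood ratios). A bag-of-words Naive Bayes score just sums per-word values, so word order cannot change the prediction:

```python
# Made-up per-word log ratios log(P(word|pos) / P(word|neg)), for illustration only.
word_lambda = {"i": 0.0, "am": 0.0, "happy": 1.2, "not": -0.4, "sad": -1.5}

def naive_bayes_score(tweet):
    # Sum the learned per-word scores; word order plays no role.
    return sum(word_lambda.get(w, 0.0) for w in tweet.lower().split())

print(naive_bayes_score("i am happy not sad"))  # same words ...
print(naive_bayes_score("i am sad not happy"))  # ... opposite meaning, same score
```

Both calls print the same score, even though the two sentences mean opposite things.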

Yes, I understand that there are better approaches for real apps; I was only asking about the differences between the approaches in the first and second weeks. Do I understand correctly that they are nearly the same? At least, do they give the same result?

They may have a similar result, but the actual methods are different. As I said before, one uses Logistic Regression, which is a “pattern recognition” method. The other uses Bayes Theorem, so it’s more of a “correlation”-based method.

This was all explained in the lectures. They explained how the two methods work.

Here’s one indicator that the two methods are fundamentally different: notice that in the Logistic Regression case, there is training involving Gradient Descent to minimize a cost function, right? But in the Naive Bayes case, the “training” is just an analytical computation not involving a cost function.

The fundamental data used in both cases is the same: the computed frequencies with which a given word appears in positive and negative tweets. But how that data is used to generate a prediction function is different.
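
Here is a minimal sketch of that contrast (my own toy numbers and variable names, assuming a `freqs` table of per-word counts like the one built in the assignments):

```python
import numpy as np

# Shared data: how often each word appears in positive (1) / negative (0) tweets.
# The counts are made up for illustration.
freqs = {("happy", 1): 40, ("happy", 0): 5,
         ("sad", 1): 3, ("sad", 0): 30}
vocab = ["happy", "sad"]

# Naive Bayes "training": one analytical pass, no cost function involved.
V = len(vocab)
n_pos = sum(c for (w, cls), c in freqs.items() if cls == 1)
n_neg = sum(c for (w, cls), c in freqs.items() if cls == 0)
loglikelihood = {}
for w in vocab:
    p_w_pos = (freqs.get((w, 1), 0) + 1) / (n_pos + V)  # Laplacian smoothing
    p_w_neg = (freqs.get((w, 0), 0) + 1) / (n_neg + V)
    loglikelihood[w] = np.log(p_w_pos / p_w_neg)

# Logistic Regression training: iterative Gradient Descent on a cost function.
# Each tweet becomes features x = [bias, sum of positive counts, sum of negative
# counts]; the two example rows below are made up.
X = np.array([[1.0, 40.0, 3.0],    # a positive-looking tweet
              [1.0, 5.0, 30.0]])   # a negative-looking tweet
y = np.array([[1.0], [0.0]])
theta = np.zeros((3, 1))
alpha = 1e-3
for _ in range(1000):
    h = 1 / (1 + np.exp(-X @ theta))   # sigmoid prediction
    grad = X.T @ (h - y) / len(y)      # gradient of the cross-entropy cost
    theta -= alpha * grad              # one descent step
```

The Naive Bayes step runs exactly once, while the Logistic Regression loop has to repeat until the cost stops decreasing; that is the difference between the two training procedures.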

Hi @someone555777

To complement @paulinpaloalto’s answer with visuals, here are the two different approaches (source):

  • logistic regression tries to find the line (boundary) that separates positive tweets from negative ones (hence a Discriminative model)
  • naive bayes tries to find the probability distributions where the positive and negative tweets are located (hence a Generative model, because you can simulate new points from the distributions).

They are the building blocks for later course material (we can think of logistic regression as a one-layer neural network, and naive bayes as a simple example of more complex generative models).
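
To make the discriminative vs. generative distinction concrete, here is a toy sketch with made-up 2-D data (my illustration, not from the course): the generative model estimates per-class distributions you can sample new points from, while logistic regression only learns a boundary:

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up 2-D features for two sentiment classes, just for illustration.
pos = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(100, 2))
neg = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(100, 2))

# Generative view (naive bayes style): model each class's distribution directly ...
mu_pos, sigma_pos = pos.mean(axis=0), pos.std(axis=0)
mu_neg, sigma_neg = neg.mean(axis=0), neg.std(axis=0)
# ... which lets you simulate brand-new "positive" points:
simulated = rng.normal(mu_pos, sigma_pos, size=(5, 2))

# Discriminative view (logistic regression): learn only the separating boundary.
X = np.hstack([np.ones((200, 1)), np.vstack([pos, neg])])  # bias column + features
y = np.concatenate([np.ones(100), np.zeros(100)]).reshape(-1, 1)
theta = np.zeros((3, 1))
for _ in range(2000):
    h = 1 / (1 + np.exp(-X @ theta))       # sigmoid
    theta -= 0.1 * X.T @ (h - y) / len(y)  # gradient descent step
# theta now defines the line theta0 + theta1*x1 + theta2*x2 = 0; it can say
# which side a point falls on, but it cannot generate new sample points.
```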

Cheers

So, is Bayes a bit more exact, because it does not rudly dedicate the data, but gives probabilities that are very similar to the weights of words?

Sorry, but I don’t understand what you mean. “rudly” is not a word in English, which is the only language I speak. Maybe you meant “rudely”, but that doesn’t really make sense in context. Or maybe you meant “really”, but what does “really dedicate the data” mean?

I don’t know which method is “better” in any general sense and my guess is that it depends on the situation. If you have a problem you are trying to solve and are considering the two methods, try them both and see which one gives better results.
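
For example, here is a minimal comparison sketch using scikit-learn (assuming you have your own `train_texts`, `train_labels`, `test_texts`, and `test_labels` lists; those names are placeholders, not from the course assignments):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB

# Turn the raw texts into bag-of-words count features.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Fit both models on the same features and compare held-out accuracy.
for name, model in [("Logistic Regression", LogisticRegression(max_iter=1000)),
                    ("Naive Bayes", MultinomialNB())]:
    model.fit(X_train, train_labels)
    preds = model.predict(X_test)
    print(name, accuracy_score(test_labels, preds))
```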

Yes, I meant rudely, sorry. And in this context I meant that this dedication is enough simple and not enough accurate as Bayes. At least because Bayes gives more information about clusters due to probability distributions.

Sorry, but that clause with the word fixed still doesn’t make any sense in English, to me at least. Have you tried writing what you want to say in your native language and then using something like Google Translate to translate it into English? Or does ChatGPT know how to do translation?

Oh, OK, maybe I meant “roughly”. Is that better?