What is the difference from the result of the first week?

So, do I understand correctly that we did nearly the same thing as in the first week, but with another approach, one based on probabilities rather than on logistic functions?

So, what is the main difference between them?

One uses Logistic Regression and the other is based on Bayes Theorem.

Neither of these is very commonly used in real NLP applications these days. This is just the beginning of the NLP series, and they are starting by showing you some of the classic techniques. They don’t work that well, but (especially in the case of Bayes Theorem) they are very inexpensive from a compute standpoint, so you might find some corner cases where you could use them. But Younes explains the weaknesses in the lectures: you can take the exact same words and create completely different meanings simply by putting them in a different order. So the Bayes method, which is based only on the learned sentiment values for individual words, is not very useful. You really need more powerful techniques, which you will learn about soon, so stay tuned for Sequence and Attention Models, which are the way people really do NLP these days. They work far better, but are also a lot more expensive to train and run.
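
To make the word-order point concrete, here is a toy sketch (my own illustration, not code from the course; the per-word scores are made-up log-likelihood ratios). A bag-of-words Naive Bayes score just sums per-word values, so word order cannot change the prediction:

```python
# Made-up per-word log ratios log(P(word|pos) / P(word|neg)), for illustration only.
word_lambda = {"i": 0.0, "am": 0.0, "happy": 1.2, "not": -0.4, "sad": -1.5}

def naive_bayes_score(tweet):
    # Sum the learned per-word scores; word order plays no role.
    return sum(word_lambda.get(w, 0.0) for w in tweet.lower().split())

print(naive_bayes_score("i am happy not sad"))  # same words ...
print(naive_bayes_score("i am sad not happy"))  # ... opposite meaning, same score
```

Both calls print the same score, even though the two sentences mean opposite things.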

Yes, I understand that there are better approaches for real apps; I was only asking about the differences between the approaches in the first and second weeks. Do I understand correctly that they are nearly the same? At least, do they give the same result?

They may have a similar result, but the actual methods are different. As I said before, one uses Logistic Regression, which is a “pattern recognition” method. The other uses Bayes Theorem, so it’s more of a “correlation”-based method.

This was all explained in the lectures. They explained how the two methods work.

Here’s one indicator that the two methods are fundamentally different: notice that in the Logistic Regression case, there is training involving Gradient Descent to minimize a cost function, right? But in the Naive Bayes case, the “training” is just an analytical computation not involving a cost function.

The fundamental data used in both cases is the same: the computed frequencies with which a given word appears in positive and negative tweets. But how that data is used to generate a prediction function is different.
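
Here is a minimal sketch of that contrast (my own toy numbers and variable names, assuming a `freqs` table of per-word counts like the one built in the assignments):

```python
import numpy as np

# Shared data: how often each word appears in positive (1) / negative (0) tweets.
# The counts are made up for illustration.
freqs = {("happy", 1): 40, ("happy", 0): 5,
         ("sad", 1): 3, ("sad", 0): 30}
vocab = ["happy", "sad"]

# Naive Bayes "training": one analytical pass, no cost function involved.
V = len(vocab)
n_pos = sum(c for (w, cls), c in freqs.items() if cls == 1)
n_neg = sum(c for (w, cls), c in freqs.items() if cls == 0)
loglikelihood = {}
for w in vocab:
    p_w_pos = (freqs.get((w, 1), 0) + 1) / (n_pos + V)  # Laplacian smoothing
    p_w_neg = (freqs.get((w, 0), 0) + 1) / (n_neg + V)
    loglikelihood[w] = np.log(p_w_pos / p_w_neg)

# Logistic Regression training: iterative Gradient Descent on a cost function.
# Each tweet becomes features x = [bias, sum of positive counts, sum of negative
# counts]; the two example rows below are made up.
X = np.array([[1.0, 40.0, 3.0],    # a positive-looking tweet
              [1.0, 5.0, 30.0]])   # a negative-looking tweet
y = np.array([[1.0], [0.0]])
theta = np.zeros((3, 1))
alpha = 1e-3
for _ in range(1000):
    h = 1 / (1 + np.exp(-X @ theta))   # sigmoid prediction
    grad = X.T @ (h - y) / len(y)      # gradient of the cross-entropy cost
    theta -= alpha * grad              # one descent step
```

The Naive Bayes step runs exactly once, while the Logistic Regression loop has to repeat until the cost stops decreasing; that is the difference between the two training procedures.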

Hi @someone555777

To complement @paulinpaloalto’s answer with visuals, here are the two different approaches (source):

  • logistic regression tries to find the line (boundary) that separates positive tweets from negative ones (hence a Discriminative model)
  • naive bayes tries to find the probability distributions where the positive and negative tweets are located (hence a Generative model, because you can simulate new points from the distributions).

They are the building blocks for later course material (we can think of logistic regression as a one-layer neural network, and naive bayes as a simple example of more complex generative models).
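
To make the discriminative vs. generative distinction concrete, here is a toy sketch with made-up 2-D data (my illustration, not from the course): the generative model estimates per-class distributions you can sample new points from, while logistic regression only learns a boundary:

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up 2-D features for two sentiment classes, just for illustration.
pos = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(100, 2))
neg = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(100, 2))

# Generative view (naive bayes style): model each class's distribution directly ...
mu_pos, sigma_pos = pos.mean(axis=0), pos.std(axis=0)
mu_neg, sigma_neg = neg.mean(axis=0), neg.std(axis=0)
# ... which lets you simulate brand-new "positive" points:
simulated = rng.normal(mu_pos, sigma_pos, size=(5, 2))

# Discriminative view (logistic regression): learn only the separating boundary.
X = np.hstack([np.ones((200, 1)), np.vstack([pos, neg])])  # bias column + features
y = np.concatenate([np.ones(100), np.zeros(100)]).reshape(-1, 1)
theta = np.zeros((3, 1))
for _ in range(2000):
    h = 1 / (1 + np.exp(-X @ theta))       # sigmoid
    theta -= 0.1 * X.T @ (h - y) / len(y)  # gradient descent step
# theta now defines the line theta0 + theta1*x1 + theta2*x2 = 0; it can say
# which side a point falls on, but it cannot generate new sample points.
```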

Cheers

So, is Bayes a bit more exact, because it does not rudly dedicate the data, but gives probabilities that are very similar to the weights of words?

Sorry, but I don’t understand what you mean. “rudly” is not a word in English, which is the only language I speak. Maybe you meant “rudely”, but that doesn’t really make sense in context. Or maybe you meant “really”, but what does “really dedicate the data” mean?

I don’t know which method is “better” in any general sense and my guess is that it depends on the situation. If you have a problem you are trying to solve and are considering the two methods, try them both and see which one gives better results.
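
For example, here is a minimal comparison sketch using scikit-learn (assuming you have your own `train_texts`, `train_labels`, `test_texts`, and `test_labels` lists; those names are placeholders, not from the course assignments):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB

# Turn the raw texts into bag-of-words count features.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Fit both models on the same features and compare held-out accuracy.
for name, model in [("Logistic Regression", LogisticRegression(max_iter=1000)),
                    ("Naive Bayes", MultinomialNB())]:
    model.fit(X_train, train_labels)
    preds = model.predict(X_test)
    print(name, accuracy_score(test_labels, preds))
```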

Yes, I meant rudely, sorry. And in this context I meant that this dedication is enough simple and not enough accurate as Bayes. At least because Bayes gives more information about clusters due to probability distributions.

Sorry, but that clause with the word fixed still doesn’t make any sense in English, to me at least. Have you tried writing what you want to say in your native language and then using something like Google Translate to translate it into English? Or does ChatGPT know how to do translation?

Oh, OK, maybe I meant “roughly”. Is that better?