With images it was pretty simple: we used ImageDataGenerator with parameters to rotate, shift, transform, and zoom images, and it helped a lot with overfitting. If I understand correctly, with text the only way to get rid of overfitting is to add more data, right? We have explored different layers like Conv1D, LSTM, and GRU, and we can't get validation accuracy of at least 95%, but with images we achieved it easily. So is adding more data the only way to raise validation accuracy?
Adding more data can help the model generalize, but it is not the only solution. Other factors, such as the model architecture and hyperparameter tuning, also affect validation accuracy.
In situations where we deal with a limited amount of data, it is possible to uncover new information and valuable relationships from the data we already have. For instance, we can apply transformation techniques, such as column rotation, to generate new perspectives or data representations. Furthermore, in various contexts, it is feasible to explore the interrelation among different data features, potentially leading to the creation of new derived features. This approach ultimately enriches the dataset and provides additional relevant information to the model.
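To make the "derived features" idea concrete for text, here is a minimal sketch of computing simple extra features from a raw sentence. The function name `extract_features` and the particular features chosen are illustrative, not something from the course:

```python
# Hypothetical sketch: deriving extra numeric features from raw text.
# The feature set here (counts and average word length) is purely illustrative.
def extract_features(sentence):
    words = sentence.split()
    return {
        "n_words": len(words),
        "n_chars": len(sentence),
        "avg_word_len": sum(len(w) for w in words) / len(words),
    }

features = extract_features("the cat sat on the mat")
print(features["n_words"])  # 6
```

Features like these can be fed to a model alongside the token sequence itself, giving it extra signal without collecting any new data.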
I don’t get it. What do you mean by “transformation techniques, such as column rotation” if we work with text?
@Atom27 I apologize for any confusion. When referring to “transformation techniques, such as column rotation,” I was drawing an analogy from image data processing to text data. In image processing, techniques like rotating image columns can generate new perspectives. Similarly, in text data, there are equivalent strategies that involve manipulating or reorganizing features to derive new insights. For instance, in the context of text data analysis, you can consider techniques like reordering words or tokens in a sentence to explore different syntactic structures or variations.
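As a concrete illustration of the token-reordering idea, here is a minimal sketch of one simple augmentation operation: randomly swapping two words to create a variant of a sentence. The function name `random_swap` is my own; this is a sketch of the technique, not code from the course:

```python
import random

# Hedged sketch: randomly swap two tokens to create a new text variant.
# Each call on the same sentence can yield a different augmented sample.
def random_swap(sentence, rng):
    words = sentence.split()
    if len(words) < 2:
        return sentence  # nothing to swap
    i, j = rng.sample(range(len(words)), 2)  # two distinct positions
    words[i], words[j] = words[j], words[i]
    return " ".join(words)

rng = random.Random(0)  # seeded for reproducibility
print(random_swap("the movie was really great", rng))
```

Applied to each training sentence a few times, this produces extra samples with the same label, much like rotating or shifting an image does.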
If there are equivalent strategies, why didn't we consider them in practice? I'm still confused :(
Hello @Atom27,
I’m not sure if I understood your question. Could you explain it in more detail? The concept I mentioned is indeed applied in practice. Could you provide more information so that I can better grasp what you’re looking for?
@bruno_ramos_martins you mentioned:
In image processing, techniques like rotating image columns can generate new perspectives. Similarly, in text data, there are equivalent strategies that involve manipulating or reorganizing features to derive new insights. For instance, in the context of text data analysis, you can consider techniques like reordering words or tokens in a sentence to explore different syntactic structures or variations.
Correct me if I am wrong, but we did not apply text transformations like reordering words. What I mean is that for images, overfitting was explained in more depth and we practiced how to avoid it and what we can use, but there was no practice with avoiding overfitting for text.
Hello @Atom27,
It’s possible that the addressed problem didn’t require this level of depth. When we began the discussion, we explored the strategies used to mitigate overfitting in Natural Language Processing. If this particular class didn’t cover this topic, it doesn’t mean it’s not relevant. For your education, it’s important to keep in mind that this is a valuable area of study. If you have plans to take other courses or delve into different specializations, this topic will be covered in greater detail.
Notice that the strategy employed with images is relatively simpler, since we have more freedom in image transformations, such as rotation, translation, and scaling. In the context of NLP, this becomes more constrained, as not every word order produces the same meaning, or even a meaningful sentence.
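Because of that constraint, text augmentation often favors operations that preserve meaning rather than arbitrary reordering. Here is a minimal sketch of one such operation, synonym replacement; the tiny `SYNONYMS` table is purely illustrative:

```python
import random

# Hedged sketch: replace selected words with synonyms so the augmented
# sentence keeps its meaning (and its label). The synonym table below
# is a toy example, not a real lexical resource.
SYNONYMS = {
    "great": ["excellent", "wonderful"],
    "bad": ["terrible", "awful"],
}

def synonym_replace(sentence, rng):
    out = []
    for word in sentence.split():
        if word in SYNONYMS:
            out.append(rng.choice(SYNONYMS[word]))
        else:
            out.append(word)
    return " ".join(out)

rng = random.Random(1)
print(synonym_replace("the movie was great", rng))
```

Unlike shuffling all the words, this keeps the sentence grammatical and label-preserving, which is why it is a safer default for text.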
Please, let me know if I can help you with anything else.