With images it was pretty simple: we used ImageDataGenerator with parameters to rotate, shift, transform, and zoom images, and it helped a lot with overfitting. If I understand correctly, with text the only way to get rid of overfitting is to add more data, right? We have explored different layers like Conv1D, LSTM, and GRU, and we can't get validation accuracy of at least 95%, but with images we achieved it easily. So is adding more data the only way to raise validation accuracy?
Adding more data can help the model generalize, but it is not the only solution. Other factors, such as the model architecture and hyperparameter tuning, also affect validation accuracy.
In situations where we deal with a limited amount of data, it is possible to uncover new information and valuable relationships from the data we already have. For instance, we can apply transformation techniques, such as column rotation, to generate new perspectives or data representations. Furthermore, in various contexts, it is feasible to explore the interrelation among different data features, potentially leading to the creation of new derived features. This approach ultimately enriches the dataset and provides additional relevant information to the model.
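To make the "derived features" idea concrete for text, here is a minimal sketch of computing simple extra features from a raw sentence. The function name `extract_features` and the particular features chosen are illustrative, not something from the course:

```python
# Hypothetical sketch: deriving extra numeric features from raw text.
# The feature set here (counts and average word length) is purely illustrative.
def extract_features(sentence):
    words = sentence.split()
    return {
        "n_words": len(words),
        "n_chars": len(sentence),
        "avg_word_len": sum(len(w) for w in words) / len(words),
    }

features = extract_features("the cat sat on the mat")
print(features["n_words"])  # 6
```

Features like these can be fed to a model alongside the token sequence itself, giving it extra signal without collecting any new data.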
I don’t get it. What do you mean by “transformation techniques, such as column rotation” if we work with text?
@Atom27 I apologize for any confusion. When referring to “transformation techniques, such as column rotation,” I was drawing an analogy from image data processing to text data. In image processing, techniques like rotating image columns can generate new perspectives. Similarly, in text data, there are equivalent strategies that involve manipulating or reorganizing features to derive new insights. For instance, in the context of text data analysis, you can consider techniques like reordering words or tokens in a sentence to explore different syntactic structures or variations.
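As a concrete illustration of the token-reordering idea, here is a minimal sketch of one simple augmentation operation: randomly swapping two words to create a variant of a sentence. The function name `random_swap` is my own; this is a sketch of the technique, not code from the course:

```python
import random

# Hedged sketch: randomly swap two tokens to create a new text variant.
# Each call on the same sentence can yield a different augmented sample.
def random_swap(sentence, rng):
    words = sentence.split()
    if len(words) < 2:
        return sentence  # nothing to swap
    i, j = rng.sample(range(len(words)), 2)  # two distinct positions
    words[i], words[j] = words[j], words[i]
    return " ".join(words)

rng = random.Random(0)  # seeded for reproducibility
print(random_swap("the movie was really great", rng))
```

Applied to each training sentence a few times, this produces extra samples with the same label, much like rotating or shifting an image does.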
If there are equivalent strategies, why didn't we consider them in practice? I'm still confused :(
Hello @Atom27,
I’m not sure if I understood your question. Could you explain it in more detail? The concept I mentioned is indeed applied in practice. Could you provide more information so that I can better grasp what you’re looking for?
@bruno_ramos_martins you mentioned:
In image processing, techniques like rotating image columns can generate new perspectives. Similarly, in text data, there are equivalent strategies that involve manipulating or reorganizing features to derive new insights. For instance, in the context of text data analysis, you can consider techniques like reordering words or tokens in a sentence to explore different syntactic structures or variations.
Correct me if I am wrong, but we did not apply text transformations like reordering words. What I mean is that for images, overfitting was explained in more depth and we practiced how to avoid it and what we can use, but there was no practice with avoiding overfitting for text.
Hello @Atom27,
It’s possible that the addressed problem didn’t require this level of depth. When we began the discussion, we explored the strategies used to mitigate overfitting in Natural Language Processing. If this particular class didn’t cover this topic, it doesn’t mean it’s not relevant. For your education, it’s important to keep in mind that this is a valuable area of study. If you have plans to take other courses or delve into different specializations, this topic will be covered in greater detail.
Notice that the strategy employed with images is relatively simpler, since we have more freedom in image transformations, such as rotation, translation, and scaling. In the context of NLP, this becomes more constrained, as not every word order produces the same meaning, or even a meaningful sentence.
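Because of that constraint, text augmentation often favors operations that preserve meaning rather than arbitrary reordering. Here is a minimal sketch of one such operation, synonym replacement; the tiny `SYNONYMS` table is purely illustrative:

```python
import random

# Hedged sketch: replace selected words with synonyms so the augmented
# sentence keeps its meaning (and its label). The synonym table below
# is a toy example, not a real lexical resource.
SYNONYMS = {
    "great": ["excellent", "wonderful"],
    "bad": ["terrible", "awful"],
}

def synonym_replace(sentence, rng):
    out = []
    for word in sentence.split():
        if word in SYNONYMS:
            out.append(rng.choice(SYNONYMS[word]))
        else:
            out.append(word)
    return " ".join(out)

rng = random.Random(1)
print(synonym_replace("the movie was great", rng))
```

Unlike shuffling all the words, this keeps the sentence grammatical and label-preserving, which is why it is a safer default for text.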
Please, let me know if I can help you with anything else.