Custom NER unable to recognize new entities

Mayank11 · May 11, 2023, 4:40pm

I am unable to get the NER model to recognize entities that were not seen in the training set. For example, I am trying to train the model on different types of bank products (savings account, credit card, line of credit, etc.) and included examples of these in the training set. However, when a similar example with “current account” was given instead, the model didn’t recognize it. There are quite a few such cases.

In fact, even in the out-of-the-box spaCy package, the NER is unable to recognize all names, e.g. Alan is recognized as a person but Trish is not.

Are NER models limited to the entities they have seen during training? If so, we might as well do a simple dictionary lookup.

What am I missing?

arvyzukai · May 11, 2023, 4:54pm

Could be:

tokenization,
dataset,
model size/architecture.

In general, NER models are not perfect and may miss some variations of entities, even ones that were present in the training data. But if implemented properly, they are superior to simple dictionary lookups, because they can use the surrounding context rather than exact string matches.

In this C3 W3 Assignment the tokens are words - more sophisticated tokenization would use subwords. The dataset is also meant for learning purposes, and the model is modest in size and architecture.
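To make the tokenization point concrete, here is a toy sketch (not the assignment's code; the sentences, the `<UNK>` token and the helper names are all made up for illustration). With a word-level vocabulary, every unseen word collapses to the same unknown id, so the model gets no signal from it. Character trigrams stand in here for a real subword tokenizer, just to show that pieces of an unseen word can still overlap with pieces seen in training.

```python
# Toy illustration only: word-level vocabulary vs. subword-like pieces.
train_sentences = [
    "i want to open a savings account",
    "apply for a credit card today",
]

# Word-level vocabulary: each training word gets an id, id 0 is reserved for unknowns.
vocab = {"<UNK>": 0}
for sentence in train_sentences:
    for word in sentence.split():
        vocab.setdefault(word, len(vocab))


def encode_words(text):
    """Map each word to its id; any word not seen in training becomes <UNK>."""
    return [vocab.get(word, vocab["<UNK>"]) for word in text.split()]


def char_trigrams(word):
    """Crude stand-in for subword pieces: character trigrams with boundary marks."""
    padded = f"<{word}>"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


# Collect every trigram seen anywhere in the training sentences.
seen_pieces = set()
for sentence in train_sentences:
    for word in sentence.split():
        seen_pieces |= char_trigrams(word)

word_ids = encode_words("open a current account")
shared = char_trigrams("current") & seen_pieces

print(word_ids)        # "current" is encoded as the <UNK> id
print(sorted(shared))  # trigrams of "current" that also occurred in training
```

The overlap for “current” is small in this toy data, so subwords are not a cure on their own; the dataset and the model matter as well.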

Mayank11 · May 11, 2023, 5:54pm

Thanks for the reply. I agree, and I am quite confused about how to tokenize multi-word entities. In my example I converted the entities of interest to single words, e.g. savings-account, current-account, etc., but still could not get the model to recognize them.

I guess there is still a long way to go to learn this. Will keep exploring!
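On the multi-word entity question: a common convention is BIO tagging, where the words stay separate tokens and the labels mark the beginning (B-) and inside (I-) of each entity, so there is no need to glue words together with hyphens. Below is a minimal sketch of producing such labels for training data. The `PRODUCT` label, the `bio_tags` helper and the phrase list are hypothetical, and whether a given dataset or library expects exactly this scheme needs to be checked against its documentation.

```python
def bio_tags(tokens, entity_phrases):
    """Label tokens with B-/I-/O tags by matching known multi-word phrases.

    entity_phrases maps a phrase (e.g. "line of credit") to a label.
    This is only a way to build training labels from a phrase list.
    """
    tags = ["O"] * len(tokens)
    lowered = [token.lower() for token in tokens]
    for phrase, label in entity_phrases.items():
        words = phrase.lower().split()
        n = len(words)
        for start in range(len(tokens) - n + 1):
            span_is_free = all(tag == "O" for tag in tags[start:start + n])
            if lowered[start:start + n] == words and span_is_free:
                tags[start] = f"B-{label}"          # first word of the entity
                for k in range(start + 1, start + n):
                    tags[k] = f"I-{label}"          # remaining words of the entity
    return tags


tokens = "I want to open a savings account and a line of credit".split()
phrases = {"savings account": "PRODUCT", "line of credit": "PRODUCT"}
tags = bio_tags(tokens, phrases)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A model trained on labels like these learns to tag each word in context, which is what gives it a chance on phrases such as “current account” that never appeared in the phrase list.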

arvyzukai · May 11, 2023, 6:44pm

Don’t get me wrong. Your flair suggests “Prompt Engineering Learner” and, given the formulation of your question, I felt like taking on the role of a chatbot.

To be clear - if you want to master NLP there is no corner cutting - and you’re right, there’s a lot to learn.

Prompt “Engineering” is becoming a specialization or profession on its own. In my personal opinion, it’s a skill which should become less relevant when models start to understand prompts better (by training on better prompts). (I even heard a joke that Prompt Engineering will make Prompt Engineers obsolete!)

So my (unsolicited) advice would be to pick your direction and stick to it. If you want to understand NLP at least up to a beginner level, there are few to no corners you can cut; completing the course (on your own) from the start is the least you can do. (I assume you did not, since some parts of your question were covered earlier in the specialization, namely “why not just use dictionary lookup in translating sentences” or something along these lines.) I could also have gotten the wrong impression, so don’t judge me too hard.
