Is the C5 W4 Named-Entity Recognition assignment working as intended?

The second ungraded NLP/Transformers (Course 5 Week 4) assignment, which demonstrates using a Transformer for named-entity recognition, only achieves about 70% accuracy, and its prediction on the example new sentence

Manisha Bharti. 3.5 years of professional IT experience in Banking and Finance domain

is not producing useful output (in my case: only clusters of EMPTY and NAME, even where the input is padded). Are results this bad expected? Is the idea that we play around with the hyperparameters, such as the number of epochs, to improve it? Or is there an issue with the assignment?

Hey @vorpalsnark,
On the training dataset, I can see that it achieves up to 77% accuracy, as shown in the image below, and I believe that's the only accuracy the notebook reports. But even if you are achieving only 70% accuracy, that's a good start, considering we only trained the model for 10 epochs on just 220 examples.

[Screenshot: training log showing the model reaching ~77% accuracy]

Yes, you can definitely play with the hyper-parameters and see if you can improve the model's performance.

You see, the major reason behind this is that the "Empty" tag appears very often in the dataset, while the other tags appear far less often. So if the model just predicts every token as "Empty", it can still get quite high accuracy. This is evident from the classification report presented towards the end of the notebook.

[Screenshot: classification report from the end of the notebook]

For tags like "Years of Experience", "Degree", "Location", "College Name", etc., despite having around 100 instances each, the model gives a precision and recall of 0, i.e., it doesn't predict any token with these tags. Note that even in this scenario, our model gets an accuracy of 77% and a weighted-avg F1-score of 0.94. Things only look good until we observe the macro-avg F1-score, which is just 0.23. You can read more about micro avg, macro avg and weighted avg here.
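To see concretely how a skewed tag distribution inflates accuracy and weighted-avg F1 while macro-avg F1 collapses, here is a minimal, self-contained sketch (the 90/10 split is made up for illustration, not taken from the notebook's data):

```python
# Toy illustration (not the notebook's data) of why a model that predicts
# the dominant "Empty" tag everywhere still scores high accuracy and
# weighted-avg F1, while macro-avg F1 exposes the failure.
from sklearn.metrics import classification_report

# Hypothetical skewed tag distribution: 90 "Empty" tokens, 10 "Degree" tokens.
y_true = ["Empty"] * 90 + ["Degree"] * 10
# A degenerate model that tags every token as "Empty":
y_pred = ["Empty"] * 100

print(classification_report(y_true, y_pred, zero_division=0))
# accuracy = 0.90 and weighted-avg F1 ≈ 0.85, yet macro-avg F1 ≈ 0.47,
# because "Degree" gets precision = recall = 0.
```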

To conclude, try playing with the hyper-parameters to see if you can make any major stride in the macro-avg F1-score. Since the dataset is highly skewed, metrics like accuracy and weighted-avg F1-score aren't apt for judging your model. And lastly, even at 77% accuracy, this level of performance is about what we can expect from the model.
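If you want something more targeted than the number of epochs, one common trick for skewed tag sets is to down-weight the dominant class in the loss. Here is a hedged sketch; the names tag2id and train_labels, and the -100 padding convention, are assumptions about the notebook, not its actual code:

```python
# Hedged sketch: down-weight "Empty" tokens so the loss stops rewarding a
# model that predicts "Empty" everywhere. Names here (tag2id, train_labels)
# are assumptions, not the notebook's actual variables.
import numpy as np

def make_sample_weights(label_ids, empty_id, empty_weight=0.1):
    """Per-token weights: small for 'Empty', zero for padding (-100), 1 otherwise."""
    weights = np.ones_like(label_ids, dtype="float32")
    weights[label_ids == empty_id] = empty_weight  # let rare tags dominate the loss
    weights[label_ids == -100] = 0.0               # ignore padded/special tokens
    return weights

# Hypothetical usage with a Keras token-classification model:
# sw = make_sample_weights(train_labels, empty_id=tag2id["Empty"])
# model.fit(train_inputs, train_labels, sample_weight=sw, epochs=20)
```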

Let us know if this helps you out.

Cheers,
Elemento

It’s not the DLS C5 W4 assignment, right?

Hey @saifkhanengr,
It’s DLS C5 W4 Second Ungraded Lab.

P.S. - It was actually this thread that prompted me to raise the concern about the confusion :joy:

Cheers,
Elemento

DLS C5 W4 does have an optional lab on Named Entity Recognition, right after Assignment 1.

The title of this sub-forum mentions Natural Language Processing, which doesn’t actually appear in DLS C5.

The title of this forum area should perhaps be “Sequence Models - Transformer Network”.

Oh, OK. I got it.

Thanks for the detailed feedback @Elemento - I agree it’s a good start, and I can get over 80% accuracy on the test set just by increasing the number of epochs, but I’m still not getting a useful named-entity classification on a novel input. I’ll keep playing around with it, though - it’s a good application of the model.

I don’t think so. I believe there are some errors in mapping tokens to words and then to labels, @Elemento. To simplify the explanation, let us consider the first resume. The function clean_dataset processes the first resume, identifies 227 words by splitting on ' ', and sets up a mapping with 227 tags. Meanwhile, in the function tokenize_and_align_labels, the tokenizer processes the raw text of the first resume directly, and word_ids shows 318 words. Clearly, the tokenizer's logic for identifying words differs from splitting on ' '. So the words are identified inconsistently, yet the same tag matrix is used for both. This must be an error that needs to be fixed.
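For anyone who wants to see the mismatch directly, here is a minimal sketch comparing the two word counts (I'm using a generic distilbert-base-cased checkpoint; the notebook's tokenizer may differ), along with the usual fix of pre-splitting the text and passing is_split_into_words=True:

```python
# Sketch of the inconsistency described above: the tokenizer's pre-tokenizer
# splits on punctuation as well as whitespace, so it finds more "words" than
# text.split(' ') does, and the space-split tag list no longer lines up.
from transformers import AutoTokenizer

# Assumption: any fast tokenizer shows this; the notebook's checkpoint may differ.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")

text = "Manisha Bharti. 3.5 years of professional IT experience"
words = text.split(" ")  # how clean_dataset counts words (and builds tags)

# Raw text: the tokenizer decides word boundaries itself.
raw_word_ids = tokenizer(text).word_ids()
n_raw = len({i for i in raw_word_ids if i is not None})

# Pre-split input: one word index per space-split token, so tags stay aligned.
split_word_ids = tokenizer(words, is_split_into_words=True).word_ids()
n_split = len({i for i in split_word_ids if i is not None})

print(len(words), n_raw, n_split)
# The raw-text count exceeds len(words) (e.g. "Bharti." and "3.5" each split
# into several words), while the pre-split count matches the tag list.
```

If the mismatch is confirmed, the fix in tokenize_and_align_labels would presumably be to tokenize the same space-split word list that clean_dataset tagged, rather than the raw text.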