Error running NER merge_tokens code in Colab

Simak · October 26, 2023, 4:36am

I am following along with the course and running NLP exercise #2, Named Entity Recognition.

The merge_tokens function is as follows:

def merge_tokens(tokens):
    merged_tokens = []
    for token in tokens:
        if merged_tokens and token['entity'].startswith('I-') and merged_tokens[-1]['entity'].endswith(token['entity'][2:]):
            last_token = merged_tokens[-1]
            last_token['word'] += token['word'].replace('##', '')
            last_token['end'] = token['end']
            last_token['score'] = (last_token['score'] + token['score']) / 2
        else:
            merged_tokens.append(token)
    return merged_tokens

The error message I am getting is:

in merge_tokens(tokens)
      2     merged_tokens = []
      3     for token in tokens:
----> 4         if merged_tokens and token['entity_group'].startswith('I-') and merged_tokens[-1]['entity_group'].endswith(token['entity_group'][2:]):
      5             last_token = merged_tokens[-1]
      6             last_token['word'] += token['word'].replace('##', '')

TypeError: string indices must be integers

Can anyone point out what I could be doing wrong? I have been staring at it for quite a while.
Thanks in advance!
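
For reference, here is a minimal sketch of one way this TypeError can arise. It assumes (the post does not confirm this) that `merge_tokens` was called with something other than a list of token dicts, for example a single dict, whose iteration yields its string keys. The sample tokens below are hypothetical and not taken from the exercise.

```python
def merge_tokens(tokens):
    """Merge I- continuation tokens into the preceding token of the same entity type."""
    merged_tokens = []
    for token in tokens:
        if (merged_tokens and token['entity'].startswith('I-')
                and merged_tokens[-1]['entity'].endswith(token['entity'][2:])):
            last_token = merged_tokens[-1]
            last_token['word'] += token['word'].replace('##', '')
            last_token['end'] = token['end']
            last_token['score'] = (last_token['score'] + token['score']) / 2
        else:
            merged_tokens.append(token)
    return merged_tokens


# Hypothetical sample data shaped like token-classification output.
sample_tokens = [
    {'entity': 'B-PER', 'word': 'Sim', 'start': 0, 'end': 3, 'score': 0.99},
    {'entity': 'I-PER', 'word': '##ak', 'start': 3, 'end': 5, 'score': 0.97},
]

# A list of dicts works: each `token` is a dict, so token['entity'] is valid.
merged = merge_tokens(sample_tokens)

# A single dict does not: iterating over it yields its keys (strings),
# and indexing a string with 'entity' raises the TypeError from the post.
single_token = {'entity': 'B-LOC', 'word': 'Paris', 'start': 0, 'end': 5, 'score': 0.98}
try:
    merge_tokens(single_token)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(merged)
print(error_message)
```

If this is the cause, printing `type(tokens)` and `type(tokens[0])` just before the call would show whether the input is a list of dicts.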
