In this lesson, the code under the section "Adding a helper function to merge tokens" is missing the step that writes the updated last_token
variable back to the merged_tokens
list (in effect, reassigning the last element of the list).
I believe that part of the code should be:
last_token = merged_tokens[-1]
last_token['word'] += token['word'].replace('##', '')
last_token['end'] = token['end']
last_token['score'] = (last_token['score'] + token['score']) / 2
# Missing line: write the updated token back to the list
merged_tokens[-1] = last_token
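I have also changed the condition to get better merge results. Besides '##' sub-word pieces, it now merges a token tagged I-<type> into the previous token when that token's entity ends with the same <type>: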
if (merged_tokens and token['word'].startswith('##')) or (merged_tokens and token['entity'].startswith('I-') and merged_tokens[-1]['entity'].endswith(token['entity'][2:])):
    last_token = merged_tokens[-1]
    last_token['word'] += token['word'].replace('##', '')
    last_token['end'] = token['end']
    last_token['score'] = (last_token['score'] + token['score']) / 2
    merged_tokens[-1] = last_token
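For context, here is roughly how the change above could sit inside a complete helper. This is only a sketch, not the lesson's actual code: the function name merge_tokens, the else branch that appends a copy of the token, and the sample tokens are my assumptions. The token keys ('word', 'entity', 'end', 'score') are the ones used in the snippet above.

```python
def merge_tokens(tokens):
    """Merge '##' sub-word pieces and I-<type> continuation tokens
    into the previous token. Hypothetical helper, for illustration only."""
    merged_tokens = []
    for token in tokens:
        is_subword = token['word'].startswith('##')
        # True when the token is tagged I-<type> and the previous merged
        # token's entity ends with the same <type>.
        continues_entity = (
            bool(merged_tokens)
            and token['entity'].startswith('I-')
            and merged_tokens[-1]['entity'].endswith(token['entity'][2:])
        )
        if merged_tokens and (is_subword or continues_entity):
            last_token = merged_tokens[-1]
            last_token['word'] += token['word'].replace('##', '')
            last_token['end'] = token['end']
            last_token['score'] = (last_token['score'] + token['score']) / 2
            # Explicit write-back, as proposed above. If last_token is the
            # same dict object that is stored in the list, this line is
            # redundant but harmless.
            merged_tokens[-1] = last_token
        else:
            # Assumption: any other token starts a new entry. Copy it so
            # the caller's input list is not modified.
            merged_tokens.append(dict(token))
    return merged_tokens


# Made-up sample tokens (scores chosen so the averages are exact).
sample = [
    {'entity': 'B-LOC', 'word': 'San', 'start': 0, 'end': 3, 'score': 1.0},
    {'entity': 'I-LOC', 'word': 'Fran', 'start': 4, 'end': 8, 'score': 0.5},
    {'entity': 'I-LOC', 'word': '##cisco', 'start': 8, 'end': 13, 'score': 0.75},
    {'entity': 'B-PER', 'word': 'Ann', 'start': 20, 'end': 23, 'score': 1.0},
]
merged = merge_tokens(sample)
print([t['word'] for t in merged])
```

Note that, exactly as in the snippet above, a whole-word I- token is joined to the previous word without a space, and the score is a running pairwise average rather than a true mean over all merged pieces.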
Hope the above helps.