Tokenization in post-training slide

Hi team,

I have a question regarding the following slide:

I'm confused by the statement "train tokenizer" in the context of post-training. Is that really a thing? Wouldn't that alter existing token IDs and thus make the existing embeddings unusable? I know it's always possible to manually add new tokens to a tokenizer / LLM (see the sketch below), but the term "training" confuses me. If it refers to a specific technique, I would appreciate it if you could provide a brief reference. Thank you!
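For context, here is roughly what I mean by "manually adding new tokens" — a minimal sketch assuming the Hugging Face transformers API, with a placeholder checkpoint and made-up token names:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# New tokens are appended to the END of the vocabulary, so existing
# token IDs (and their embedding rows) stay untouched.
num_added = tokenizer.add_tokens(["<|tool_call|>", "<|tool_result|>"])

# Grow the embedding matrix to match; only the newly added rows are
# randomly initialized and would need to be learned in post-training.
model.resize_token_embeddings(len(tokenizer))
```

This is the only mechanism I'm aware of that doesn't disturb the pretrained embeddings, which is why "training" the tokenizer itself at this stage surprised me.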