Hallucination in summarization

I am trying to fine-tune a pre-trained GPT model for summarization. I manually prepared the training data and made sure that in every example the summary is drawn entirely from the input text. However, after training, the model hallucinates at inference time: it presents ideas that cannot be inferred from the input and sometimes even introduces entities that are not present in the input at all.

Question 1: Why could this be happening, and how can I correct it?

Question 2: Is there a way to automatically evaluate whether the generated summary contains content that cannot be inferred from the input?
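On the evaluation question, one crude but fully automatic baseline is to flag summary words that never occur in the source at all. Real factual-consistency metrics (NLI-based scorers, QA-based checks) are stronger, but this pure-Python sketch catches the blatant cases, such as named entities absent from the input. The function names here are my own, not from any library.

```python
import re


def novel_tokens(source: str, summary: str) -> set[str]:
    """Return summary word types that do not occur anywhere in the source."""
    tokenize = lambda text: set(re.findall(r"[A-Za-z0-9']+", text.lower()))
    return tokenize(summary) - tokenize(source)


def novelty_rate(source: str, summary: str) -> float:
    """Fraction of summary word types unsupported by the source (0.0 = fully grounded)."""
    summary_vocab = set(re.findall(r"[A-Za-z0-9']+", summary.lower()))
    if not summary_vocab:
        return 0.0
    return len(novel_tokens(source, summary)) / len(summary_vocab)


src = "Acme Corp reported record profits in March."
print(novel_tokens(src, "Acme Corp reported profits."))          # set() -> grounded
print(novel_tokens(src, "Acme Corp opened a Berlin office."))    # contains 'berlin'
```

A nonzero novelty rate is only a signal, not proof of hallucination (paraphrases and inflected forms also score as "novel"), so in practice you would inspect the flagged examples or combine this with an entailment model.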

Hallucinations happen because LLMs are, at bottom, statistical models of the words and sentences they were trained on. They simply generate text in what the model estimates to be a probable sequence; they have no concept of ideas, only of sequences of words.

Preventing this behavior is an active area of current LLM research.
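The "probable sequence" point can be made concrete with a toy model. The sketch below (illustrative only, nothing like a real transformer) trains bigram counts and continues a prompt with whatever followed most often in training, regardless of what the current input actually says; that is the mechanism behind a fine-tuned model emitting entities from its training distribution rather than from your document.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus: str) -> dict:
    """Count, for each token, which tokens followed it in the training text."""
    counts = defaultdict(Counter)
    tokens = corpus.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def greedy_continue(model: dict, start: str, steps: int) -> list:
    """Always pick the most frequent follower: pure 'probable sequence' decoding."""
    out = [start]
    for _ in range(steps):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(followers.most_common(1)[0][0])
    return out


corpus = ("the ceo announced profits . the ceo announced profits . "
          "the ceo announced layoffs .")
model = train_bigrams(corpus)
# 'announced' was followed by 'profits' most often in training, so the model
# emits it even if the document being summarized is actually about layoffs.
print(greedy_continue(model, "ceo", 2))  # ['ceo', 'announced', 'profits']
```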

My use case is quite narrow: all I want is for the output to be restricted to the input text provided. I have worked with out-of-the-box Pegasus models for summarization and saw far fewer hallucinations there. Then there is ChatGPT - I have not seen it hallucinate even once on my summarization use case.

Is it the dataset size, the diversity of examples, or something else? I am not sure at this point. But for a narrow use case like mine, I would expect there to be a known solution that significantly minimizes (if not eliminates) hallucinations.
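One known mitigation for a narrow, extractive-leaning use case like this is post-hoc filtering: drop or flag generated sentences that mention entities absent from the input. The sketch below uses capitalized words as a crude proxy for named entities; a real pipeline would use a proper NER model (e.g. spaCy) and/or an NLI factual-consistency scorer instead.

```python
import re


def capitalized_entities(sentence: str) -> set[str]:
    """Capitalized words other than the first token (a crude entity proxy)."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    # Skip the first token: it is capitalized regardless of being an entity.
    return {t for t in tokens[1:] if t[0].isupper()}


def filter_unsupported(source: str, summary: str) -> str:
    """Keep only summary sentences whose entity-like words all occur in the source."""
    source_tokens = set(re.findall(r"[A-Za-z]+", source))
    kept = []
    for sent in re.split(r"(?<=[.!?])\s+", summary.strip()):
        if capitalized_entities(sent) <= source_tokens:
            kept.append(sent)
    return " ".join(kept)


src = "Acme reported strong quarterly results, its CEO said."
summ = "Acme reported strong results. Analysts at Zenith praised the move."
print(filter_unsupported(src, summ))  # drops the sentence mentioning 'Zenith'
```

This only removes entity-level hallucinations after the fact; it does nothing about unsupported paraphrased claims, so it complements rather than replaces better training data or constrained decoding.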