What happens if a generative model starts hallucinating in the data for CoT

Paramdeep · July 13, 2023, 3:41pm

One of the advantages of using Chain of Thought is that it helps avoid hallucinations.

In the example of whether a gold ring would sink in water, the chain of thought uses the density of a pear to explain that an object floats if its density is less than the density of water.

The LLM uses the same logic to decide whether the gold ring would float or not. But is it possible that it hallucinates about the density of gold and still arrives at the wrong result?

Juan_Olano · July 13, 2023, 3:55pm

I will answer “Yes, it is possible that hallucinations may still happen”. The nature of CoT reduces the probability, but we are still talking about a probabilistic model: every next token is sampled from a probability distribution.
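To make the "probabilistic model" point concrete, here is a minimal sketch of sampling a next token. The candidate tokens and their probabilities are made up for illustration (only 19.3 g/cm³ is the real density of gold); a real model's distribution would differ.

```python
import random

# Hypothetical next-token distribution for the prompt
# "The density of gold is ___ g/cm^3". The probabilities are invented
# for illustration and do not come from any real model.
next_token_probs = {"19.3": 0.90, "1.93": 0.06, "0.93": 0.04}


def sample_token(probs, rng):
    """Draw one token according to its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def wrong_answer_rate(probs, correct, trials=10_000, seed=0):
    """Fraction of sampled tokens that differ from the correct one."""
    rng = random.Random(seed)
    wrong = sum(sample_token(probs, rng) != correct for _ in range(trials))
    return wrong / trials


rate = wrong_answer_rate(next_token_probs, correct="19.3")
print(f"wrong density sampled in {rate:.1%} of trials")
```

Even when the correct value is by far the most likely token, a wrong one is still drawn some of the time, and every later step of the chain would then build on it.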

Paramdeep · July 15, 2023, 9:27am

@Juan_Olano Thanks for your comments. Do we understand internally (from the network perspective) why CoT would perform better than simple prompts?

Juan_Olano · July 15, 2023, 1:05pm

As far as I know, how and why this works is not fully understood yet. There is ongoing research into why these models behave as they do.

Having said that, why might CoT bring better results than a single and sometimes complex prompt? The model cannot think for itself. It predicts the next most probable word, so even though it has a very deep understanding of language, splitting a task into steps is not something it does very well, unlike humans: we do it very well and very fast.

So when we give the model a single, complex prompt, we are hoping that its understanding of language will find the best answer to a big task.

When we give it steps and chain them (do 1, then take the output of 1 and do 2, and so on), we are splitting the task into chunks that we have somehow pre-determined to each be of high quality, and we are steering the model towards a higher-quality final answer.

Put another way: with a single complex prompt or with CoT, what the model does is the same. The difference is that with CoT we maximize quality by steering the model’s answers more efficiently.
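The "do 1, then take the output of 1 and do 2" idea can be sketched roughly as below. This is only an illustration: `run_chain`, `fake_model`, and the step wording are hypothetical names made up for this example, and `fake_model` stands in for whatever real LLM call you would use.

```python
from typing import Callable, List


def run_chain(model: Callable[[str], str], steps: List[str], question: str) -> List[str]:
    """Run each step in order, feeding the previous output into the next prompt."""
    outputs = []
    context = question  # the first step sees the original question
    for step in steps:
        prompt = f"{step}\n\nInput:\n{context}"
        context = model(prompt)  # this output becomes the next step's input
        outputs.append(context)
    return outputs


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call (an assumption for this sketch):
    # it just echoes the instruction line of the prompt.
    return f"[answer to: {prompt.splitlines()[0]}]"


steps = [
    "Step 1: State the density of the object and of water.",
    "Step 2: Compare the two densities.",
    "Step 3: Conclude whether the object floats or sinks.",
]
outputs = run_chain(fake_model, steps, "Would a gold ring float in water?")
for out in outputs:
    print(out)
```

Each step is small enough to check on its own, which is where the steering comes from. It also shows why a hallucinated density in step 1 would carry through to the conclusion.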

I hope this makes sense
