Hello
I have gone through the research papers on CoT, PAL, and ReAct, and these are the questions I have:
- In CoT, do we fine-tune a pretrained LLM with CoT prompts? (I mean, do we give a question followed by a chain of thought and then let the model generate the answer?)
- At inference time we don't provide any CoT, yet the model is still able to generate good answers on reasoning tasks. How does that work?
- Can you explain how exactly we train/fine-tune using CoT, PAL, or ReAct? Specifically, what are the inputs and outputs during fine-tuning: are the inputs context + CoT prompts and the outputs just the answers, or are the inputs just the context and the outputs CoT + answers?
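To make that last question concrete, here is a minimal sketch in plain Python of the two supervision formats I'm asking about. The example strings are made up for illustration and are not taken from any of the papers:

```python
# Two candidate ways to turn a (question, chain-of-thought, answer) triple
# into a supervised fine-tuning example. All text below is a made-up
# illustration, not an example from the CoT/PAL/ReAct papers.

question = "Roger has 5 balls. He buys 2 cans of 3 balls each. How many balls does he have now?"
cot = "Roger started with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11."
answer = "11"

# Format A: the CoT is part of the *input* (prompt); the training target is
# only the final answer.
format_a = {
    "input": question + "\n" + cot,
    "output": answer,
}

# Format B: the input is only the question; the *target* is the CoT followed
# by the answer, so the model is trained to generate the reasoning itself.
format_b = {
    "input": question,
    "output": cot + "\nThe answer is " + answer + ".",
}

print(format_a["output"])  # just the answer
print(format_b["output"])  # reasoning chain followed by the answer
```

My question is essentially which of these two shapes the training examples take (my current guess is Format B, since that would explain why the model can produce reasoning at inference time without being shown a CoT).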