Would it be a good strategy to train on "summarization" for creating a chat bot?

KotaMori · October 18, 2023, 4:03pm

In this week’s example of creating a chatbot, the lecture and the lab fine-tuned the model to make a good summary for a dialog data. I wonder if this is know as a good strategy for making a chatbot. We could also train the model to generate a response given the conversation before. Is there some reason why we should not do so?

My question is if “training on summarization” is known as a better approach than “training to generate response” (and if so, why?), or it is just for one approach that has been chosen for the demonstration.

leonardo.pabon · October 19, 2023, 2:21am

LLMs don’t remember previous interactions by themselves. If you want to keep the context of previous messages, you must use some memory. One example is the OpenAI API, where you must send all previous conversations with the new prompt to GPT to consider the whole conversation to generate an answer.

One limit to memory is the maximum number of tokens each LLM accepts. You could summarize previous conversations and use the summary instead of all previous messages to implement a more efficient memory.

KotaMori · October 19, 2023, 2:25am

@leonardo.pabon Thank you. In that scenario, does the summarization of conversation works as an auxiliary role to generate the summarized context passed to the prompt?

leonardo.pabon · October 19, 2023, 2:28am

Yes. If you use Langchain to implement your chatbot, it provides a few options for memory. Some of them use summarization. It will only work if the model in use can create a good summary of the previous conversation.

KotaMori · October 19, 2023, 2:44am

Thanks a lot.

Topic		Replies	Views
What ml architecture is needed for a story making model? AI Discussions ai-discussions	2	75	February 8, 2024
Gen AI implementation - Summarizer bot AI Discussions week-1 , ai-discussions	0	74	August 7, 2024
Week 2: Intuition check for Step 2.1 in "Perform Full Fine-Tuning" Generative AI with Large Language Models week-2	3	426	July 24, 2023
Why only Text Summarisation? Generative AI with Large Language Models feedback , week-1 , week-2 , week-3	2	308	February 27, 2024
Theory into practice: Generative AI lifecycle GenAI with LLMs Resources	8	674	July 21, 2023

Would it be a good strategy to train on "summarization" for creating a chat bot?

Related topics