In this week’s example of creating a chatbot, the lecture and the lab fine-tuned the model to produce good summaries of dialog data. I wonder whether this is known to be a good strategy for building a chatbot. We could also train the model to generate a response given the preceding conversation. Is there a reason not to do so?
My question is whether “training on summarization” is known to be a better approach than “training to generate responses” (and if so, why), or whether it is simply the approach that was chosen for the demonstration.
LLMs do not remember previous interactions on their own. If you want to keep the context of previous messages, you must implement some form of memory. The OpenAI API is one example: you must send the entire previous conversation along with the new prompt so that GPT can consider the whole exchange when generating an answer.
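A minimal sketch of what that looks like in practice: because the API is stateless, every request carries the full prior history plus the new prompt. The model name and the commented-out call are illustrative only (running it would require the `openai` package and an API key); the history-building helper is the part shown here.

```python
# Stateless chat: the full prior conversation must accompany every request,
# because the model itself retains nothing between calls.

def build_request_messages(history, new_user_prompt):
    """Append the new prompt to the running history and return the full
    message list that must be sent with the next request."""
    return history + [{"role": "user", "content": new_user_prompt}]

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is fine-tuning?"},
    {"role": "assistant", "content": "Adapting a pretrained model to a task."},
]

messages = build_request_messages(history, "Can you give an example?")

# The actual request would look something like (illustrative model name):
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
```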
One limit on memory is the maximum number of tokens each LLM accepts. To implement a more efficient memory, you could summarize the previous conversation and send the summary instead of all the previous messages.
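A toy sketch of that idea, with crude stand-ins for the real pieces: `count_tokens` uses a word count instead of a real tokenizer, and `summarize` just truncates and joins turns, where a real chatbot would ask the LLM itself for the summary. Only the control flow (compress when over budget, keep the most recent turns verbatim) is the point.

```python
def count_tokens(messages):
    # Rough proxy for a tokenizer: whitespace-split word count.
    return sum(len(m["content"].split()) for m in messages)

def summarize(messages):
    # Stand-in summarizer; a real system would prompt the LLM for this.
    return "Summary of earlier turns: " + "; ".join(
        m["content"] for m in messages
    )

def compress_history(messages, max_tokens, keep_recent=2):
    """If the history exceeds the token budget, replace all but the most
    recent turns with a single summary message."""
    if count_tokens(messages) <= max_tokens:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary_msg = {"role": "system", "content": summarize(old)}
    return [summary_msg] + recent
```

With a budget large enough, the history passes through untouched; once it grows past the budget, older turns collapse into one summary message while the latest turns stay verbatim.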
@leonardo.pabon Thank you. In that scenario, does summarization of the conversation play an auxiliary role, generating the summarized context that is passed into the prompt?
Yes. If you use LangChain to implement your chatbot, it provides several options for memory, some of which use summarization. This only works well if the model in use can produce a good summary of the previous conversation.
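A toy version of the pattern LangChain's summary-based memory follows, with a stub in place of the LLM: after each exchange, the running summary is updated by folding the new turns into the previous summary, and that summary is what gets prepended to the next prompt instead of the full history. The class and stub names here are made up for illustration; they are not LangChain's API.

```python
def stub_llm_summarize(previous_summary, new_turns):
    # Stand-in for an LLM call that would produce an updated
    # natural-language summary; here it simply concatenates.
    joined = " ".join(f"{role}: {text}" for role, text in new_turns)
    return (previous_summary + " " + joined).strip()

class SummaryMemory:
    """Keeps a rolling summary of the conversation instead of full turns."""

    def __init__(self, summarizer=stub_llm_summarize):
        self.summary = ""
        self.summarizer = summarizer

    def save_turn(self, user_text, assistant_text):
        # Fold the latest exchange into the running summary.
        self.summary = self.summarizer(
            self.summary, [("user", user_text), ("assistant", assistant_text)]
        )

    def load_context(self):
        # What would be sent to the model in place of the full history.
        return self.summary
```

As the answer above notes, the whole scheme stands or falls on the summarizer: with a weak summarizing model, the memory silently loses details that later turns depend on.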