Why do we specify a maximum number of tokens in ConversationSummaryBufferMemory()?
Is it so that, when the past conversation is summarized, the maximum determines how much of the text gets summarized, or is it for another reason?
Your guess is correct. See this notebook
I'm not sure; I think I got confused, so let me check whether what I understood from the code is right.
At the very beginning, the conversation was so short that it didn't reach the specified maximum number of tokens, so the model didn't have to do any summarization. However, as the chat proceeds, the past conversation starts to exceed the maximum, so the model summarizes messages starting from the oldest until the number of tokens drops back to <= the maximum. For example, if we have 5 past messages and summarizing the first 2 brings the token count back within the limit, the model won't summarize the remaining 3 and will return them as they are. Is this right?
Right again. Instead of blindly discarding messages from the past, a summary is extracted to ensure that we retain the important information about the messages to be discarded. See this as well.
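To make that concrete, here is a minimal sketch of the pruning idea in plain Python. It is not LangChain's actual code: count_tokens, fake_summarize and prune are hypothetical helpers, tokens are approximated by whitespace-separated words, and the sketch assumes only the raw messages (not the summary) count toward the limit.

```python
from typing import Callable, List, Tuple


def count_tokens(text: str) -> int:
    # Hypothetical stand-in: a real setup would use the model's tokenizer.
    return len(text.split())


def fake_summarize(existing_summary: str, messages: List[str]) -> str:
    # Hypothetical stand-in for the LLM call that folds old messages
    # into the running summary.
    return f"{existing_summary} [summary of {len(messages)} message(s)]".strip()


def prune(
    buffer: List[str],
    summary: str,
    max_token_limit: int,
    summarize: Callable[[str, List[str]], str] = fake_summarize,
) -> Tuple[List[str], str]:
    kept = list(buffer)  # work on a copy so the caller's list is untouched
    pruned: List[str] = []
    # Move the oldest messages out until the rest fits within the limit.
    while kept and sum(count_tokens(m) for m in kept) > max_token_limit:
        pruned.append(kept.pop(0))
    # Only the removed messages are summarized; the rest stay verbatim.
    if pruned:
        summary = summarize(summary, pruned)
    return kept, summary


# Five messages of 4 "tokens" each (20 in total) against a limit of 12.
messages = [f"message number {i} here" for i in range(1, 6)]
kept, summary = prune(messages, "", max_token_limit=12)
```

With these numbers the first 2 messages are folded into the summary and the last 3 are kept as they are, which matches the example in the question.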
See ConversationBufferWindowMemory to understand why ConversationSummaryBufferMemory was introduced.
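For contrast, a rough sketch of what a window-style memory does, assuming it simply keeps the most recent k entries and drops everything older with no summary. keep_last_k is a made-up helper, and counting individual messages (rather than whatever unit the real class uses) is a simplification.

```python
from typing import List


def keep_last_k(buffer: List[str], k: int) -> List[str]:
    # Hypothetical helper: older messages are discarded outright,
    # so whatever they contained is lost.
    return buffer[-k:] if k > 0 else []


history = ["m1", "m2", "m3", "m4", "m5"]
recent = keep_last_k(history, 3)
```

Here m1 and m2 are simply gone, whereas the summary-buffer approach would have kept a condensed version of them.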
Thank you so much