Stop generation in between (LLM response)

vsharma19 · September 11, 2024, 2:07pm

Team, while using RAG, I would like to stop processing while a response is being generated and free up any memory the LLM is using. Basically, “Stop Generation”. Is it possible to stop the generation itself, and not just disconnect the call or thread? If so, could you advise how?

Rorisang · September 11, 2024, 4:44pm

Hi @vsharma19. I have not encountered such a case before. Which LLM are you using? Are you running notebooks or scripts?

vsharma19 · September 12, 2024, 10:07am

Hi,
Well, it does not matter. Simply put: while a RAG request is in progress, there is a STOP parameter that stops the process, but I want to stop based on user preference. You can use gpt4-32k or 4o.
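
To make the request concrete, here is a minimal sketch of a user-triggered stop, assuming the response is consumed as a token stream. The names `fake_token_stream` and `generate_until_stopped` are hypothetical and not part of any library; `fake_token_stream` only stands in for a real streaming LLM response. With a hosted model such as gpt4-32k or 4o, the model's memory lives on the provider's side, so the client can only stop reading and close the stream; whether the provider then halts generation is an assumption to verify against its documentation.

```python
import threading
from typing import Callable, Iterable, Iterator, Optional


def fake_token_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM response (one word per token)."""
    for word in text.split():
        yield word + " "


def generate_until_stopped(
    stream: Iterable[str],
    stop_event: threading.Event,
    on_token: Optional[Callable[[str], None]] = None,
) -> str:
    """Consume tokens until the stream ends or stop_event is set.

    Returns the partial text collected so far.
    """
    parts = []
    try:
        for token in stream:
            # Check the user's stop request before accepting each token.
            if stop_event.is_set():
                break
            parts.append(token)
            if on_token is not None:
                on_token(token)
    finally:
        # Close the stream if it supports it, so the producer can release
        # its resources (for a generator, this runs its cleanup code).
        close = getattr(stream, "close", None)
        if callable(close):
            close()
    return "".join(parts)


if __name__ == "__main__":
    stop = threading.Event()
    seen = []

    def on_token(token: str) -> None:
        seen.append(token)
        if len(seen) == 3:
            stop.set()  # simulates the user pressing "Stop"

    partial = generate_until_stopped(
        fake_token_stream("one two three four five"), stop, on_token
    )
    print(repr(partial))
```

In a real application the `stop_event` would be set from the UI thread or a request handler (for example a "Stop" button), while the loop runs in the thread that reads the model's stream.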
