How to get chat history to work as an API on a server (IIS)

We created and shared our implementation of chat history here:

It works locally.

But now we need to deploy it to a Windows Server running IIS.

That means it is multi-threaded, the Python server process can be restarted at any time, and so on.

I have started creating a UUID after the first question, sending this UUID with the first answer to the UI (another server), and then passing this UUID back to the API on follow-up questions.
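
Roughly, the flow looks like the sketch below. This is not our real code: `SESSIONS`, `handle_question` and `answer_fn` are made-up names for illustration, and the in-memory dict stands in for wherever the per-conversation state is kept inside the Python process.

```python
import uuid

# Hypothetical in-memory store: session ID -> list of (question, answer) tuples.
# It lives only as long as the Python process does.
SESSIONS = {}

def handle_question(question, session_id=None, answer_fn=None):
    """Return (session_id, answer); issue a new UUID if the session is unknown."""
    if session_id is None or session_id not in SESSIONS:
        # First question, or the process was restarted and forgot the session.
        session_id = str(uuid.uuid4())
        SESSIONS[session_id] = []
    history = SESSIONS[session_id]
    # answer_fn is a placeholder for the real chain call (assumption).
    if answer_fn is not None:
        answer = answer_fn(question, history)
    else:
        answer = f"echo: {question}"
    history.append((question, answer))
    return session_id, answer
```

The UI keeps the returned `session_id` and sends it along with every follow-up question.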

This works, but only as long as the Python server process has not been restarted in the meantime.

So I tried to serialize (using Python Cache) the “qa” variable (the object created by ConversationalRetrievalChain.from_llm).

This does not work, because “qa” contains the “retriever” with its active Chroma DB connection, and active DB connections can’t be serialized.

There must be a way to work around this limitation.
Something like storing/serializing everything else and then plugging the retriever back in with a “new/updated” DB connection handle?
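
To make the idea concrete, here is a minimal sketch of what I have in mind: persist only the plain (question, answer) pairs per UUID and never the chain itself. It uses SQLite from the standard library. The file location, the table layout and the function names (`load_history`, `append_turn`) are my own assumptions, not anything taken from LangChain.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical location; on the server this would be a folder the IIS worker can write to.
DB_PATH = Path(tempfile.gettempdir()) / "chat_history_demo.sqlite3"

def _connect():
    """Open a fresh connection per call so no connection is shared between threads."""
    conn = sqlite3.connect(DB_PATH)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS turns ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, "
        "question TEXT NOT NULL, "
        "answer TEXT NOT NULL)"
    )
    return conn

def load_history(session_id):
    """Return the stored turns for one session as a list of (question, answer) tuples."""
    conn = _connect()
    try:
        rows = conn.execute(
            "SELECT question, answer FROM turns WHERE session_id = ? ORDER BY id",
            (session_id,),
        ).fetchall()
    finally:
        conn.close()
    return [(q, a) for q, a in rows]

def append_turn(session_id, question, answer):
    """Store one question/answer pair; survives a restart of the Python process."""
    conn = _connect()
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO turns (session_id, question, answer) VALUES (?, ?, ?)",
                (session_id, question, answer),
            )
    finally:
        conn.close()
```

On each request the chain would then be rebuilt with a fresh retriever, and the stored history passed in explicitly, something like `qa({"question": question, "chat_history": load_history(session_id)})`, followed by `append_turn(...)` with the new answer. That assumes the chain is created without an attached memory object, so that it takes `chat_history` as an input; I have not verified this on the server yet.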

Are there any examples on GitHub or elsewhere showing how to make the chat_history persistent (saving it to a database or, better, to a cache on disk)?

Thank you for any hint about this,