LLM Post-processing with Streaming

I plan to use an OpenAI LLM for retrieval-augmented generation and want to ensure responses contain no bias or incorrect information. The 'LLMOps' course discusses implementing a post-processing filter, but waiting for the complete response before applying it adds noticeable latency. Streaming the output, as ChatGPT does, is an option, but it makes post-processing harder because the filter never sees the full response at once. I'm seeking guidance from the community on how to handle this situation.
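One common compromise is to post-process incrementally: buffer the streamed chunks and release text to the user one complete sentence at a time, running the filter on each sentence as it completes. Below is a minimal sketch of that idea. The `passes_filter` check and the `BLOCKLIST` are hypothetical stand-ins for a real filter, and the simulated `chunks` list stands in for an actual streaming API response (e.g. an OpenAI chat completion called with `stream=True`).

```python
import re
from typing import Iterable, Iterator

BLOCKLIST = {"guaranteed profit"}  # hypothetical filter rule for illustration

def passes_filter(sentence: str) -> bool:
    # Hypothetical post-processing check; replace with your real filter logic.
    return not any(term in sentence.lower() for term in BLOCKLIST)

def filtered_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Buffer streamed chunks and release text one sentence at a time,
    applying the post-processing filter to each complete sentence."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Split off any complete sentences (terminator followed by whitespace).
        while True:
            match = re.search(r"[.!?]\s+", buffer)
            if not match:
                break
            sentence, buffer = buffer[:match.end()], buffer[match.end():]
            if passes_filter(sentence):
                yield sentence
            else:
                yield "[redacted] "
    # Flush whatever remains once the stream ends.
    if buffer and passes_filter(buffer):
        yield buffer

# Simulated token stream; in practice these would come from the API response.
chunks = ["This is ", "safe. This promises a guaranteed", " profit. The end."]
print("".join(filtered_stream(chunks)))
# → This is safe. [redacted] The end.
```

The trade-off is that the user sees text with a delay of at most one sentence rather than waiting for the whole response, while the filter still operates on coherent units. The sentence-boundary heuristic here is deliberately crude; a production version would need more careful segmentation.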