In this simple one that we just run cell by cell. We have the following code:
updated = optimizer.invoke(
{"trajectories": conversations, "prompts": prompts}
)
In a real application, should we run update, every time user sends a new message? or we should at first separate the messages that are actually feedbacks and use them to update the prompts?