The ChatGPT responses are now sometimes different from those presented in the videos. For example:
- It marked the student’s answer as correct, whereas in the video it was marked incorrect
- In the final example it gave the Evaluation a ‘D’, whereas in the video it was given an ‘A’
I assume that OpenAI has made changes to the model since the videos were recorded.
However, the videos seem to assume that once you have iterated a prompt and made it work well, it will continue to work well into the future. The above seems to show that the performance of existing prompts can change quite markedly, even for the same user input.
How is this problem dealt with in industry?
Hi @pezza ,
We should not forget that the models are probabilistic. Every time you run the model, even with the same prompt, you may get a different result, because the model generates each next token based on probabilities.
In Python, we can set a ‘seed’ for the random number generator, and this guarantees that the generated sequence is always the same.
In LLMs, a prompt cannot be treated as a ‘seed’. We can expect very similar responses from the very same prompt, but we can also get different ones, as you are experiencing.
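To make the seed point concrete, here is a minimal stdlib-only sketch. The `sample_tokens` function and its tiny vocabulary are made up for illustration; the point is that a seeded pseudo-random generator replays the exact same sequence, which is the guarantee a prompt alone does *not* give you with an LLM:

```python
import random

def sample_tokens(seed, vocab=("cat", "dog", "fish"), n=5):
    """Draw n 'tokens' from a seeded generator; same seed -> same sequence."""
    rng = random.Random(seed)  # local generator, independent of global state
    return [rng.choice(vocab) for _ in range(n)]

run_a = sample_tokens(42)
run_b = sample_tokens(42)
print(run_a == run_b)  # True: identical output on every run with the same seed
```

Sending the same prompt to a hosted model has no comparable guarantee: the prompt fixes the input, not the sampling process on the provider's side.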
How is this dealt with in industry? LLMs are generally not used to produce definitive/final answers from prompts alone. A robust application will make use of other resources, and will use the LLM for what it is good at, but not as a knowledge source in and of itself.
Everything in the LLM industry is in a state of constant change. Any published material becomes obsolete before the pixels are dry.
But in theory, if you use temperature=0, as set in the helper function get_completion, it should not be probabilistic: many runs of the same prompt should give the same result.
(Translated from Spanish:) I’ll put this in Spanish because, as you can see, my English is not very good. In theory, the course’s helper function get_completion calls the client.chat.completions.create endpoint and passes the parameter temperature=0. With temperature=0 the result should be deterministic, i.e. different runs give the same, predictable result. If you raise the parameter to higher values, the result can indeed be probabilistic. At least that is my understanding from the courses and the documentation.
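A pure-Python sketch of why temperature=0 is deterministic *in theory*: the logits are divided by the temperature before the softmax, and as temperature goes to 0 decoding collapses to argmax. The logits here are invented for illustration; note that real providers can still show slight run-to-run variation at temperature=0 due to floating-point and batching effects on their side:

```python
import math
import random

def sample_next_token(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature); temperature=0 means greedy."""
    if temperature == 0:
        # Greedy decoding: always pick the highest-logit token, fully deterministic.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs)[0]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
rng = random.Random(0)

greedy = [sample_next_token(logits, 0, rng) for _ in range(5)]
print(greedy)  # [0, 0, 0, 0, 0] -- argmax every time

sampled = [sample_next_token(logits, 1.0, rng) for _ in range(5)]  # can vary run to run
```

So temperature=0 removes the sampling randomness, which is why repeated runs *should* agree; any remaining drift comes from the serving stack or from model updates, not from the decoding rule itself.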