The ChatGPT responses are now sometimes different from those presented in the videos. For example:
- It marked the student’s answer as correct, whereas in the video it was marked incorrect
- In the final example it gave the Evaluation a ‘D’, whereas in the video it was given an ‘A’
I assume that OpenAI has made changes to the model since the videos were recorded.
However, the videos seem to assume that once you have iterated a prompt and made it work well, it will continue to work well into the future. The above seems to show that the performance of existing prompts can change quite markedly, even for the same user input.
How is this problem dealt with in industry?
Hi @pezza ,
We should not forget that the models are probabilistic. Every time you run the model, even with the same prompt, you may get a different result, because the model generates each next token based on probabilities.
In Python, we can set a ‘seed’ for the random number generator, and this guarantees that the generated sequence is always the same.
In LLMs, a prompt cannot be treated as a ‘seed’. We can expect very similar responses from the very same prompt, but we can also get different ones, as you are experiencing.
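To make the seed point concrete, here is a minimal stdlib-only sketch. The `sample_tokens` function and its tiny vocabulary are made up for illustration; the point is that a seeded pseudo-random generator replays the exact same sequence, which is the guarantee a prompt alone does *not* give you with an LLM:

```python
import random

def sample_tokens(seed, vocab=("cat", "dog", "fish"), n=5):
    """Draw n 'tokens' from a seeded generator; same seed -> same sequence."""
    rng = random.Random(seed)  # local generator, independent of global state
    return [rng.choice(vocab) for _ in range(n)]

run_a = sample_tokens(42)
run_b = sample_tokens(42)
print(run_a == run_b)  # True: identical output on every run with the same seed
```

Sending the same prompt to a hosted model has no comparable guarantee: the prompt fixes the input, not the sampling process on the provider's side.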
How is this dealt with in industry? LLMs are generally not used to produce definitive/final answers from prompts alone. A robust application will make use of other resources, and will use the LLM for what it is good at, but not as a knowledge source in and of itself.
Everything in the LLM industry is in a state of constant change. Any published material becomes obsolete before the pixels are dry.
But in theory, if you use temperature=0, as set in the helper function get_completion, it should not be probabilistic: many runs of the same prompt should give the same result.
(Translated from Spanish:) I’ll put this in Spanish because, as you can see, my English is not very good. In theory, the course’s helper function get_completion calls the client.chat.completions.create endpoint and passes the parameter temperature=0. With temperature=0 the result should be deterministic, i.e. different runs give the same, predictable result. If you raise the parameter to higher values, the result can indeed be probabilistic. At least that is my understanding from the courses and the documentation.
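A pure-Python sketch of why temperature=0 is deterministic *in theory*: the logits are divided by the temperature before the softmax, and as temperature goes to 0 decoding collapses to argmax. The logits here are invented for illustration; note that real providers can still show slight run-to-run variation at temperature=0 due to floating-point and batching effects on their side:

```python
import math
import random

def sample_next_token(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature); temperature=0 means greedy."""
    if temperature == 0:
        # Greedy decoding: always pick the highest-logit token, fully deterministic.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs)[0]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
rng = random.Random(0)

greedy = [sample_next_token(logits, 0, rng) for _ in range(5)]
print(greedy)  # [0, 0, 0, 0, 0] -- argmax every time

sampled = [sample_next_token(logits, 1.0, rng) for _ in range(5)]  # can vary run to run
```

So temperature=0 removes the sampling randomness, which is why repeated runs *should* agree; any remaining drift comes from the serving stack or from model updates, not from the decoding rule itself.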