I’m running into two issues in the l4-summarizing notebook:
- ChatGPT doesn’t always follow the prompt instructions. The prompt for the last summarization example includes the instruction “Summarize the review below, delimited by triple backticks in at most 20 words.” The summarizations of the first three reviews (the panda, the standing lamp, and the electric toothbrush) are always under 20 words. However, the summarization of the blender review is not always under 20 words; sometimes it is ~70 words long:
17-word summarization:
Mixed review of a blender system with price gouging and decreased quality, but helpful tips for use.
67-word summarization:
The product was on sale for $49 in November, but the price increased to $70-$89 in December. The base doesn’t look as good as previous editions, but the reviewer plans to be gentle with it. A special tip for making smoothies is to freeze the fruits and vegetables beforehand. The motor made a funny noise after a year, and the warranty had expired. Overall quality has decreased.
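For reference, this is roughly the code producing those summaries; `get_completion` is the helper defined at the top of the course notebook (pre-1.0 openai SDK), and `blender_review` is a stand-in for the full blender review text, which I’ve omitted here:

```python
import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Helper along the lines of the one defined at the top of the course notebook
# (pre-1.0 openai SDK); temperature is fixed at 0.
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # supposedly deterministic
    )
    return response.choices[0].message["content"]

# Stand-in for the full blender review text from the notebook (omitted here).
blender_review = "..."

# Prompt trimmed to the relevant instruction from the notebook.
prompt = f"""
Summarize the review below, delimited by triple backticks \
in at most 20 words.

Review: ```{blender_review}```
"""

summary = get_completion(prompt)
print(len(summary.split()), "words:", summary)  # sometimes ~17 words, sometimes ~67
```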
Does anyone have any insight into why ChatGPT sometimes follows the prompt instructions and sometimes doesn’t? I’ve run into the same thing with response formatting: some responses follow the format the prompt asks for, e.g. JSON, and some don’t. Not being able to rely on consistent output formatting makes it very difficult to use ChatGPT in production, because the inconsistency breaks downstream logic, e.g. loading a response into a Python dictionary when the response isn’t valid JSON (a sketch of that failing step follows). This brings me to my second issue:
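This is roughly the downstream step I mean; `json_prompt` here is a hypothetical prompt that asks for the answer as a JSON object:

```python
import json

# Hypothetical prompt asking the model to reply with a JSON object only.
json_prompt = "..."

response = get_completion(json_prompt)

try:
    data = json.loads(response)  # only works if the reply is bare JSON
except json.JSONDecodeError:
    # This is what breaks: the model sometimes wraps the JSON in prose or
    # ``` fences, or ignores the format instruction entirely.
    data = None
    print("Response was not valid JSON:\n", response)
```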
- Variability in the response even when temperature is set to 0. I thought responses should be deterministic when the temperature is 0, i.e. the same input gives the exact same output every time. However, as the example above shows, I’m getting two very different responses for the exact same prompt even though the temperature is set to 0. I ran the same cell over and over, and ChatGPT randomly switched between the short answer and the long answer. Does anyone know why this is happening?
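To make sure it wasn’t something in my own code, I re-ran the identical call several times with something like the sketch below (reusing the `get_completion` helper and `prompt` from above). At temperature 0 I expected a single distinct output, but I get more than one:

```python
# Call the API repeatedly with the identical prompt at temperature=0
# and count how many distinct outputs come back.
outputs = [get_completion(prompt) for _ in range(10)]

distinct = set(outputs)
print(f"{len(distinct)} distinct output(s) out of {len(outputs)} calls")
for text in distinct:
    print(len(text.split()), "words:", text[:80])
```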