Thanks for adding these to the conversation. I notice that one layer down from the llama code you linked above, sample_top_p calls torch.multinomial, which is consistent with the older GPT-2-style code I have. The effect of low T on softmax you derived also matches my anecdotal evidence, graphed in my linked thread about temperature: you can clearly see the shift toward a single top candidate well before T approaches 0.
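For anyone following along, here's a rough sketch of what that layer looks like: a llama-style sample_top_p helper that ends in torch.multinomial, plus a quick demo of how low temperature concentrates the softmax. This is my paraphrase of the pattern, not the verbatim library code, so treat names and details as illustrative.

```python
import torch

def sample_top_p(probs: torch.Tensor, p: float) -> torch.Tensor:
    """Nucleus (top-p) sampling sketch, llama-style.

    probs: (batch, vocab) softmax probabilities; p: cumulative mass cutoff.
    """
    # Sort probabilities descending and accumulate their mass.
    probs_sort, probs_idx = torch.sort(probs, dim=-1, descending=True)
    probs_sum = torch.cumsum(probs_sort, dim=-1)
    # Zero out tokens once the mass *before* them already exceeds p,
    # then renormalize the surviving nucleus.
    mask = probs_sum - probs_sort > p
    probs_sort[mask] = 0.0
    probs_sort.div_(probs_sort.sum(dim=-1, keepdim=True))
    # The final draw is torch.multinomial, same as the GPT-2-era samplers.
    next_token = torch.multinomial(probs_sort, num_samples=1)
    # Map sorted positions back to vocabulary ids.
    return torch.gather(probs_idx, -1, next_token)

# Low T sharpens the softmax toward the top logit well before T -> 0:
logits = torch.tensor([2.0, 1.0, 0.5])
for T in (1.0, 0.5, 0.1):
    print(T, torch.softmax(logits / T, dim=-1))
```

At T = 0.1 the top candidate already carries essentially all of the probability mass, which is the single-candidate shift visible in the temperature graphs.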
I started learning GPT-2 with a TensorFlow/Keras implementation I found on the web from François Chollet, but switched to PyTorch when I ran into environment incompatibilities I couldn't resolve in a reasonable time. Now that I have a working PyTorch environment, I should probably pull down the llama code and tinker with it. Thanks for the impetus.