Hi @Ryan_Llamas
That is a good question, but I think you’re mixing up two different things.
Good-Turing estimation is a statistical technique for estimating the probability of rare (and unseen) events. In classic n-gram language models it was used as a smoothing technique for rare n-grams, typically in combination with Katz backoff.
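If it helps to see the mechanics, here is a minimal sketch of the core Good-Turing idea, the adjusted count r* = (r + 1) · N_{r+1} / N_r, where N_r is the number of n-grams seen exactly r times. The function name and the toy counts are just for illustration; real implementations (e.g. Simple Good-Turing) additionally smooth the N_r values:

```python
from collections import Counter

def good_turing_adjusted_counts(ngram_counts):
    """Good-Turing adjusted counts: r* = (r + 1) * N_{r+1} / N_r."""
    # "Counts of counts": N_r = number of distinct n-grams seen exactly r times.
    freq_of_freq = Counter(ngram_counts.values())
    adjusted = {}
    for ngram, r in ngram_counts.items():
        n_r = freq_of_freq[r]
        n_r_next = freq_of_freq.get(r + 1, 0)
        if n_r_next > 0:
            adjusted[ngram] = (r + 1) * n_r_next / n_r
        else:
            # N_{r+1} is often 0 for large r; real implementations smooth the
            # N_r values (Simple Good-Turing) instead of falling back like this.
            adjusted[ngram] = float(r)
    return adjusted

# Toy bigram counts: the singleton bigrams get discounted, and the freed-up
# probability mass (roughly N_1 / N) is what gets reserved for unseen bigrams.
counts = {("the", "cat"): 1, ("the", "dog"): 1, ("a", "cat"): 1, ("a", "dog"): 2}
print(good_turing_adjusted_counts(counts))
```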
So Good-Turing is a “smoothing” technique, and it is concerned with rare events: rare n-grams get their counts discounted so that some probability mass is reserved for unseen ones. The temperature parameter, on the other hand, is about “sampling” and is concerned with the most probable events: if temperature is 0, we take the most probable word every time (greedy decoding), and if temperature is, say, 0.7, we sample among the more probable words according to their relative probabilities.
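Roughly, temperature just rescales the model’s logits before the softmax, so it reshapes the distribution you sample from rather than estimating anything. A minimal sketch, with the function name and toy logits made up for illustration:

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Sample a token index from raw logits.

    temperature == 0 degenerates to greedy decoding (argmax); values below 1
    sharpen the distribution toward the top tokens, values above 1 flatten it."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)
    if temperature == 0:
        return int(np.argmax(logits))   # always the single most probable token
    scaled = logits / temperature       # divide logits by T before the softmax
    scaled -= scaled.max()              # for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))

# Toy logits for a 4-token vocabulary: at temperature 0 the first token is
# picked every time; at 0.7 it is sampled most of the time, but not always.
logits = [2.0, 1.0, 0.5, -1.0]
print(sample_with_temperature(logits, temperature=0.0))
print(sample_with_temperature(logits, temperature=0.7))
```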