Hey Guys,
Towards the end of the Week 2 assignment, the Gumbel distribution is introduced and used to predict the next characters. However, our model already outputs a probability distribution over the entire vocabulary.
So, instead of using the function gumbel_sample, why don’t we simply pick the most likely character? And if we want to generate different sequences for the same prefix, we can sample from the distribution the model produces, i.e., each character is picked with a probability equal to its predicted probability.
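To make the comparison concrete, here is a rough sketch of the three options as I understand them. The function names are my own, and the Gumbel version is only my guess at what gumbel_sample does (add temperature-scaled Gumbel noise to the log-probabilities, then take the argmax). It is not the assignment's actual code.

```python
import numpy as np


def greedy_pick(probs):
    """Always return the index of the most likely character."""
    return int(np.argmax(probs))


def categorical_pick(probs, rng):
    """Return an index drawn with probability equal to the model's output."""
    return int(rng.choice(len(probs), p=probs))


def gumbel_max_pick(probs, rng, temperature=1.0):
    """Assumed behaviour of gumbel_sample: argmax of log-probs plus Gumbel noise."""
    log_probs = np.log(probs)
    # Keep u strictly inside (0, 1) so the double log below stays finite.
    u = rng.uniform(low=1e-6, high=1.0 - 1e-6, size=log_probs.shape)
    gumbel_noise = -np.log(-np.log(u))
    return int(np.argmax(log_probs + temperature * gumbel_noise))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probs = np.array([0.7, 0.2, 0.1])  # toy distribution over a 3-character vocabulary
    print(greedy_pick(probs))
    print(categorical_pick(probs, rng))
    print(gumbel_max_pick(probs, rng, temperature=1.0))
```

If that guess is right, then at temperature 1 the Gumbel version should pick each character with the same frequency as ordinary sampling (this is the Gumbel-max trick), and at temperature 0 it reduces to the greedy pick.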
I don’t understand the use case for gumbel_sample here. Does it produce better results in some way, and if so, how?
Cheers,
Elemento