The `log_prob` variable returned by the `sampling_decode` function is the log probability of the last symbol. Why is it considered to be the log probability of the whole sentence?

That is a very good question, and I think you found a mistake. I will report it for fixing.

Out of curiosity, would the sentence’s probability be the product of the symbols’ probabilities?

Yes, sentence probability would be the product of the symbols’ probabilities, but since we have log probabilities in that assignment, the sentence log probability should be the sum of symbols’ log probabilities.
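To make the product-versus-sum point concrete, here is a minimal sketch in plain Python. The per-symbol probabilities and the helper name `sentence_log_prob` are made up for illustration; this is not the assignment's actual `sampling_decode` code.

```python
import math

def sentence_log_prob(symbol_log_probs):
    """Sum per-symbol log probabilities to get the sentence log probability."""
    return sum(symbol_log_probs)

# Hypothetical per-symbol probabilities for a three-symbol sentence.
symbol_probs = [0.5, 0.25, 0.8]
symbol_log_probs = [math.log(p) for p in symbol_probs]

total_log_prob = sentence_log_prob(symbol_log_probs)

# Exponentiating the summed log probabilities recovers the product.
product = math.prod(symbol_probs)
print(math.isclose(math.exp(total_log_prob), product))
```

Inside a decoding loop, this would presumably mean accumulating each step's log probability into a running total rather than overwriting it at every step.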

As you know from the lectures, this is only part of the picture: shorter sentences tend to get higher probabilities, and people have come up with ways to account for that.
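One common way to reduce the bias toward short sentences is length normalization, i.e. dividing the summed log probability by the sentence length (optionally raised to a power). The sketch below assumes that approach; it may not be the one the lectures had in mind, and the function name, the `alpha` parameter, and the numbers are all hypothetical.

```python
import math

def length_normalized_score(symbol_log_probs, alpha=1.0):
    """Summed log probability divided by length**alpha (alpha=0 disables normalization)."""
    if not symbol_log_probs:
        raise ValueError("sentence must contain at least one symbol")
    return sum(symbol_log_probs) / (len(symbol_log_probs) ** alpha)

# Hypothetical sentences: a short one with less likely symbols,
# and a longer one whose individual symbols are more likely.
short_sentence = [math.log(0.5)] * 2
long_sentence = [math.log(0.6)] * 4

# Raw sums favor the short sentence ...
raw_short = sum(short_sentence)
raw_long = sum(long_sentence)
print(raw_short > raw_long)

# ... while the per-symbol average favors the longer one.
norm_short = length_normalized_score(short_sentence)
norm_long = length_normalized_score(long_sentence)
print(norm_long > norm_short)
```

Whether to normalize, and with what exponent, is a design choice that depends on how the scores are used afterwards.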

So how you handle that is a design question, but `sampling_decode` should certainly not output only the last symbol’s probability.