Top-p and top-k both nonzero: how does the model choose?

Hi,
I understand top-k and top-p sampling when each is applied on its own. But when both are nonzero, how does the model select the output?

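Then we should only be sampling from the tokens that satisfy both the top-k and the top-p criteria: a token has to be among the k most probable tokens and also inside the smallest set of top-ranked tokens whose cumulative probability reaches p.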
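To make that concrete, here is a minimal sketch in plain Python of one way the two filters could be combined. The function name `top_k_top_p_candidates` and the toy probabilities are made up for illustration, and this is not taken from any particular library. It assumes the top-p mass is measured on the original distribution, so the result is exactly the set of tokens that pass both filters. Some implementations instead renormalize after top-k before applying top-p, which can keep fewer tokens.

```python
import random


def top_k_top_p_candidates(probs, top_k, top_p):
    """Return {token_id: renormalized probability} for tokens passing both filters.

    Hypothetical helper for illustration; probs is a list of probabilities
    indexed by token id.
    """
    # Rank token ids from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep the k most probable tokens (0 disables this filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the shortest prefix whose cumulative probability reaches
    # top_p (values outside (0, 1) disable this filter). The mass is measured
    # on the original distribution, which is an assumption of this sketch.
    if 0 < top_p < 1:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalize the survivors so their probabilities sum to 1.
    total = sum(probs[i] for i in ranked)
    return {i: probs[i] / total for i in ranked}


def sample(probs, top_k, top_p, rng=random):
    """Draw one token id from the filtered, renormalized distribution."""
    candidates = top_k_top_p_candidates(probs, top_k, top_p)
    ids = list(candidates)
    weights = [candidates[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


# Toy distribution over four tokens.
probs = [0.5, 0.25, 0.15, 0.1]

# top_k=3 alone would keep tokens 0, 1, 2; top_p=0.7 alone would keep 0, 1.
# With both set, only tokens 0 and 1 remain.
print(sorted(top_k_top_p_candidates(probs, top_k=3, top_p=0.7)))  # [0, 1]
```

In this sketch whichever filter is stricter for the current distribution decides the candidate set: with `top_k=1` and `top_p=0.9`, only token 0 survives, even though top-p alone would have kept three tokens.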