Why can’t we just use Best_Probs from the forward pass alone, and skip Best_Paths and the backtracing?
Let me use the diagram given in the notebook to clarify my point:
Suppose we use argmax on column 0. The highest log probability is -14.32, at row index 20, whose corresponding tag is NN. This tag is stored.
Moving on to column 1, the highest log probability is -25.13, at row index 40, so we get VBZ, which is stored right after the previous tag.
Finally, in column 2 we get -34.99 as the highest log probability. The row index is 28, whose corresponding tag is RB, which is stored as well.
Thus we have stored the sequence NN-VBZ-RB, which is exactly the same sequence that we would get if we used backtracing.
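To make the procedure concrete, here is a minimal sketch of what I mean. The function name `greedy_tags_from_best_probs` and the toy 3x3 matrix are my own assumptions for illustration; only the three column maxima (-14.32, -25.13, -34.99) come from the notebook diagram, and the other values are made up. The real Best_Probs matrix has one row per tag, so the row indices there are 20, 40 and 28 rather than 0, 2 and 1.

```python
import numpy as np


def greedy_tags_from_best_probs(best_probs, tags):
    """Pick, for each column (word position), the tag whose row holds
    the highest log probability. No Best_Paths, no backward pass."""
    best_rows = np.argmax(best_probs, axis=0)  # one row index per column
    return [tags[row] for row in best_rows]


# Hypothetical toy data: rows are tags, columns are word positions.
tags = ["NN", "RB", "VBZ"]
best_probs = np.array([
    [-14.32, -27.80, -40.10],  # NN
    [-19.05, -30.41, -34.99],  # RB
    [-21.77, -25.13, -38.26],  # VBZ
])

print("-".join(greedy_tags_from_best_probs(best_probs, tags)))  # NN-VBZ-RB
```

Note that this picks each column's maximum independently of the other columns, which is the part I am unsure about.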
So, can you please explain why we need backtracing at all, if we get the same path without it?
Is there any disadvantage in the procedure I described above?