Why can’t we just use best_probs with only the forward pass, and drop best_paths and the backtracing?
Let me use the diagram given in the notebook to clarify my point:
Suppose we use argmax on column 0. The highest probability is -14.32, so we obtain row index 20, whose corresponding tag is NN. This tag is stored.
Moving on to column 1, the highest probability is -25.13 at row index 40, so we get VBZ, which is stored right after the last tag.
Finally, in column 2 the highest probability is -34.99 at index 28, whose corresponding tag is RB, which is also stored.
Thus we have stored the sequence NN-VBZ-RB, which is exactly the same sequence we would get with backtracing.
So, can you please explain why we also need backtracing, if we get the same path even without it?
Is there any disadvantage in the procedure I described above?
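In code, the readout I’m describing would look roughly like this (using a tiny hypothetical best_probs with one row per tag, just to make the idea concrete; only the three highlighted log probabilities come from the diagram, the rest are made up, and the real matrix in the notebook is much bigger):

```python
import numpy as np

# Hypothetical stand-in for the notebook's best_probs (rows = tags, columns = words),
# built around the three highlighted log probabilities from the diagram.
states = ["NN", "VBZ", "RB"]
best_probs = np.array([
    [-14.32, -30.00, -40.00],   # NN
    [-20.00, -25.13, -41.00],   # VBZ
    [-22.00, -28.00, -34.99],   # RB
])

# Take the argmax of each column independently and read off the tags.
greedy_tags = [states[int(np.argmax(best_probs[:, i]))] for i in range(best_probs.shape[1])]
print(greedy_tags)   # ['NN', 'VBZ', 'RB']
```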
Hi @Doron_Modan,
The diagram given in the notebook shows only parts of the best_probs and best_paths matrices, where the values with the highest probabilities happen to point to previous values with similarly high probabilities. However, this is not generally the case. When we populate these matrices using viterbi_forward(), for the i^{th} word in the corpus and the current POS tag j we compute
\displaystyle \mathrm{best\_probs}_{j, i} = \max_k \left( \mathrm{best\_probs}_{k, i-1} + \log \mathbf{A}_{k, j} + \log \mathbf{B}_{j,\, vocab(corpus_i)} \right),
\displaystyle \mathrm{best\_paths}_{j, i} = \mathop{\mathrm{argmax}}_k \left( \mathrm{best\_probs}_{k, i-1} + \log \mathbf{A}_{k, j} + \log \mathbf{B}_{j,\, vocab(corpus_i)} \right).
This process doesn’t guarantee that high-probability values will consistently point to other high-probability values in the previous step.
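To see this concretely, here is a minimal sketch (assuming NumPy and a toy 2-tag HMM with made-up probabilities, not the notebook’s actual matrices) that runs the forward pass once and then reads the result out both ways: with the proper backtrace through best_paths, and with the per-column argmax shortcut from your question. For these numbers the two disagree, because the tag that wins column 0 on its own is not the tag that the best complete path passes through:

```python
import numpy as np

def viterbi_forward(log_pi, log_A, log_B, obs):
    """Forward pass: fill best_probs (scores) and best_paths (back-pointers)."""
    n_tags, T = log_A.shape[0], len(obs)
    best_probs = np.full((n_tags, T), -np.inf)
    best_paths = np.zeros((n_tags, T), dtype=int)
    best_probs[:, 0] = log_pi + log_B[:, obs[0]]          # initialization
    for i in range(1, T):
        for j in range(n_tags):
            scores = best_probs[:, i - 1] + log_A[:, j] + log_B[j, obs[i]]
            best_paths[j, i] = np.argmax(scores)          # which previous tag produced this cell
            best_probs[j, i] = np.max(scores)
    return best_probs, best_paths

def backtrace(best_probs, best_paths):
    """Proper Viterbi readout: start at the best final tag, then follow back-pointers."""
    T = best_probs.shape[1]
    path = [int(np.argmax(best_probs[:, -1]))]
    for i in range(T - 1, 0, -1):
        path.append(int(best_paths[path[-1], i]))
    return path[::-1]

def column_argmax(best_probs):
    """The shortcut from the question: per-column argmax, ignoring best_paths."""
    return [int(np.argmax(col)) for col in best_probs.T]

# Toy 2-tag, 2-word HMM with made-up probabilities (rows/columns: tag 0, tag 1).
log_pi = np.log([0.45, 0.55])                 # initial tag probabilities
log_A  = np.log([[0.8, 0.2], [0.3, 0.7]])     # transition probabilities
log_B  = np.log([[0.5, 0.9], [0.5, 0.2]])     # emission probabilities
obs    = [0, 1]                               # observed word indices

best_probs, best_paths = viterbi_forward(log_pi, log_A, log_B, obs)
print("backtrace:     ", backtrace(best_probs, best_paths))   # [0, 0]
print("column argmax: ", column_argmax(best_probs))           # [1, 0]
```

Here the column-0 argmax picks tag 1, but the back-pointer stored at (tag 0, step 1) says the best complete path came from tag 0, so the backtrace returns [0, 0] while the per-column shortcut returns [1, 0], which is not the most probable sequence.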