Viterbi: Backward Pass

Hi @Sagir_Mehmood

Here is a similar question.

I think the confusion is between what the two matrices store. The C matrix holds the best probability of being in a given state (t_i) at a given word (w_i), in other words the maximum over all paths that end there. The D matrix records how you arrived at that probability: for each (word, tag) cell it stores the previous tag on that best path.

So the tag for w_5 is read from the C matrix, while the tags for all earlier words are read from the D matrix:

  • word w_5 best tag is t_1 because the biggest value in the w_5 column of C is 0.01 (for t_1)
  • word w_4 best tag is t_3 because D matrix at (w_5, t_1) indicates 3
  • word w_3 best tag is t_1 because D matrix at (w_4, t_3) indicates 1
  • word w_2 best tag is t_3 because D matrix at (w_3, t_1) indicates 3
  • word w_1 best tag is t_2 because D matrix at (w_2, t_3) indicates 2
  • the start token <s> is reached because D matrix at (w_1, t_2) indicates 0, so the walk stops here
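
The steps above can be sketched in Python. This is only a rough illustration, not the course's implementation: the function name `viterbi_backward`, the 1-based tag numbering, and the layout (rows are tags, columns are words) are my assumptions. Only the D entries named in the list and the 0.01 in C come from the example; every other number is a made-up placeholder.

```python
def viterbi_backward(c_last, D):
    """Recover the best tag sequence from the last column of C and the matrix D.

    c_last: best probabilities for each tag at the final word (last column of C).
    D:      D[i][j] is the previous tag (1-based) on the best path that ends
            in tag i+1 at word j+1; 0 means the start token <s>.
    """
    n_words = len(D[0])
    path = [0] * n_words

    # Step 1: the last word's tag comes from C (argmax of its last column).
    path[-1] = max(range(len(c_last)), key=lambda i: c_last[i]) + 1

    # Step 2: every earlier tag comes from D, walking right to left.
    for j in range(n_words - 1, 0, -1):
        path[j - 1] = D[path[j] - 1][j]

    return path


# Last column of C: 0.01 for t_1 is from the example, the rest are placeholders.
c_last = [0.01, 0.002, 0.004]

# Columns are w_1..w_5, rows are t_1..t_3. Entries not mentioned in the
# list above are hypothetical fillers and are never visited by the walk.
D = [
    [0, 1, 3, 2, 3],  # t_1
    [0, 2, 1, 3, 1],  # t_2
    [0, 2, 2, 1, 2],  # t_3
]

best_path = viterbi_backward(c_last, D)
print(best_path)  # tags for w_1..w_5
```

With these values the result is t_2, t_3, t_1, t_3, t_1 for w_1 through w_5, which is the same sequence as the list above read from bottom to top.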

Cheers
