What do the numbers in the D matrix represent?

I do not quite understand where the numbers in the D matrix come from.

Can I get some help? I don't remember seeing in the lectures how these numbers are generated. :thinking:

Hi @Fei_Li

It is explained in this course video.

In essence, it's the path of best probabilities. See also: Similar question

The numbers in the D matrix are tag indexes: each entry records which previous tag (t_{i-1}) gave the best probability for this tag at this word.

For example, the third column (w_3) indicates that the best way to arrive at t_1 for word w_3 is through t_3 at word w_2.
In other words, the probability of t_1 at word w_3 is highest if the previous word (w_2) had tag t_3.
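
To make this concrete, here is a minimal sketch of how the two matrices could be filled. It is not the assignment code: the transition matrix A, the emission matrix B, the initial probabilities pi and all the numbers are made up for illustration, indexes start at 0, and it multiplies plain probabilities on Python lists instead of using NumPy.

```python
# Toy Viterbi forward pass: 2 tags, 3 words. All numbers are made up.
A = [[0.7, 0.3],        # A[i][j]: probability of moving from tag i to tag j
     [0.4, 0.6]]
B = [[0.5, 0.1, 0.4],   # B[j][w]: probability that tag j emits word w
     [0.1, 0.6, 0.3]]
pi = [0.6, 0.4]         # probability of starting in each tag

num_tags, num_words = 2, 3
C = [[0.0] * num_words for _ in range(num_tags)]  # best probability per (tag, word)
D = [[0] * num_words for _ in range(num_tags)]    # best previous tag per (tag, word)

# First word: there is no previous tag, so column 0 of D stays 0.
for j in range(num_tags):
    C[j][0] = pi[j] * B[j][0]

for w in range(1, num_words):
    for j in range(num_tags):
        # Probability of tag j at word w, coming from each possible previous tag i.
        candidates = [C[i][w - 1] * A[i][j] * B[j][w] for i in range(num_tags)]
        best_prev = candidates.index(max(candidates))
        C[j][w] = candidates[best_prev]  # the best probability goes into C
        D[j][w] = best_prev              # the tag it came from goes into D

print(D)
```

So C keeps the winning probability, and D keeps the index of the previous tag that produced it.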


Hi @arvyzukai ,

Thank you for your advice last time. I have been thinking about this for a while, and maybe you could help with one more thing: I need the ID of the highest-probability tag in the last column (the last word). If I use np.argmax on the last column, it gives me the index of the highest probability. Is this index the same as the corresponding tag's ID?

Hi @Fei_Li

I'm not sure I understand the question fully, so I'll just explain it my way :slight_smile:

The last column of the C matrix gives you, for each tag, the best probability of the last word having that tag. In your image above, the word w_5 has its best probability (0.01) for tag t_1.

The last column of the D matrix tells you "the best tag index" for word w_4. In your image it basically says that word w_5 reaches its best probability (0.01) only when word w_4 had tag t_3.
(To extend the example and illustrate other possibilities for w_5: the third best (note: not second) tag for w_5 is t_4, with probability 0.0003 (three ten-thousandths), but to achieve this probability the word w_4 must have had tag t_1.)

So in some sense you could consider the D matrix as "lagging" one word behind the C matrix.
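
This "lag" is what lets you walk backwards through the sentence. A small sketch with made-up matrices (rows are tags, columns are words, indexes start at 0; the name path is just for illustration and plain lists are used instead of NumPy arrays):

```python
# Made-up matrices for 2 tags and 3 words.
C = [[0.30, 0.021, 0.00864],
     [0.04, 0.054, 0.00972]]
D = [[0, 0, 1],
     [0, 0, 1]]
num_words = len(C[0])

path = [0] * num_words

# Start from the best tag for the last word (argmax of the last column of C).
last_column = [row[-1] for row in C]
path[-1] = last_column.index(max(last_column))

# D[tag][w] stores the best tag for word w-1, so step back one word at a time.
for w in range(num_words - 1, 0, -1):
    path[w - 1] = D[path[w]][w]

print(path)
```

Each lookup in D answers the question "given the tag I chose for this word, which tag should the previous word have had?".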

Hi @arvyzukai ,
Thank you very much for your detailed explanation. I now understand the flow of the two tables.
I am still confused about one thing: I use np.argmax on w5's column and get 1, which is the first tag, t1. But how do I get t1's label, (6)? (I wrote some tags and states on the graph in red to illustrate my confusion more directly.)

Hi @Fei_Li

When you use argmax you get the index of the row. To get the actual tag name, you use that index on the list of tag names. For example, if argmax returns 2, then you index the list ["tag_v", "tag_n", "tag_a", "tag_b", "tag_j"] with it and the result is "tag_a" (since indexing starts at 0).
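
As a quick sketch of that lookup (the probabilities are made up, and plain Python is used so it runs without NumPy; on a NumPy array, np.argmax would return the same index):

```python
tags = ["tag_v", "tag_n", "tag_a", "tag_b", "tag_j"]
last_column = [0.001, 0.0002, 0.01, 0.0003, 0.0005]  # made-up probabilities, one per tag

best_index = last_column.index(max(last_column))  # row index of the highest probability
best_tag = tags[best_index]                       # turn the index into a tag name

print(best_index, best_tag)
```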

In the assignment, the index of the last tag is k, which you store in z[m-1] (the last value of list z).
So, first you:

    # Go through each POS tag for the last word (last column of best_probs)
    # in order to find the row (POS tag integer ID) 
    # with highest probability for the last word
    # your solution

* By the way, this could easily have been achieved with max and argmax, but the code skeleton suggests using a for loop.

You later convert that index to the actual POS tag using the states list:

    # Convert the last word's predicted POS tag
    # from its unique integer ID into the string representation
    # using the 'states' list
    # store this in the 'pred' array for the last word
    pred[m - 1] = # your solution

And that is how you get your last tag (at pred[m-1]).