Hey Guys,
I observed the following discrepancies in the C2 W2 Assignment. Please take them into account for future revisions.
Part 0: Data Sources
The training set will be used to create the emission, ~~transmission~~ transition and tag counts.
… training and test corpus and will appear in the emission, ~~transmission~~ transition and tag data structures.
- The image in this section also says `transmission_counts`, whereas it should be `transition_counts`.
Exercise 01
You can get a more complete description at Penn Treebank II tag set.
The above URL doesn’t work any more.
Exercise 03
Now you will create the `B` ~~transition~~ emission matrix which computes the emission probability.
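For anyone confused by the two names, here is a minimal sketch of what an emission matrix holds. It is not the assignment's code: the toy counts, the helper name `emission_probability`, and the add-alpha smoothing with `alpha=0.001` are all my own assumptions for illustration.

```python
# Hypothetical toy counts (not taken from the assignment's corpus).
emission_counts = {("VBZ", "tracks"): 2, ("NN", "tracks"): 1, ("NN", "dog"): 3}
tag_counts = {"VBZ": 2, "NN": 4}
vocab = ["dog", "tracks"]
tags = ["NN", "VBZ"]


def emission_probability(tag, word, alpha=0.001):
    """P(word | tag) with add-alpha smoothing (smoothing scheme is an assumption)."""
    count = emission_counts.get((tag, word), 0)
    return (count + alpha) / (tag_counts[tag] + alpha * len(vocab))


# B[i][j] = probability that tag i emits word j (rows: tags, columns: words).
B = [[emission_probability(tag, word) for word in vocab] for tag in tags]

for tag, row in zip(tags, B):
    print(tag, [round(p, 4) for p in row])
```

Each row is a distribution over words for one tag, which is why "emission" is the right name here; a transition matrix would instead relate one tag to the next tag.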
Part 3.2 Viterbi Forward
Compute the probability that the tag of the second ~~work~~ word (‘tracks’) is a verb, 3rd person singular present (VBZ).
Exercise 06
- The figure in the image above UNQ C6 depicts the `best_paths` matrix.
- In it, the value for `(--s--, 'tracks')` is shown as `32`.
- This matrix stores the indices of the most likely tags for the previous words; however, no tag in the figure is shown with the index `32`.
- It would be great if this index could be changed to one of the indices shown in the figure, to avoid confusion.
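To show why the index matters, here is a minimal sketch of how a backpointer matrix like this is read. It assumes a layout of one row per tag and one column per word, and the tags, words, and matrix values are hypothetical toy data, not the values from the assignment's figure.

```python
import numpy as np

# Hypothetical toy setup: 3 tags and a 3-word sentence.
tags = ["NN", "VBZ", "DT"]
words = ["the", "dog", "tracks"]

# best_paths[i, j] = index of the most likely tag for word j-1,
# given that word j has tag i. Column 0 is unused (no previous word).
best_paths = np.array([
    [0, 2, 0],
    [0, 2, 0],
    [0, 1, 0],
])


def backtrack(best_paths, last_tag_idx):
    """Follow the backpointers from the last word back to the first."""
    n_words = best_paths.shape[1]
    path = [last_tag_idx]
    for j in range(n_words - 1, 0, -1):
        path.append(int(best_paths[path[-1], j]))
    return path[::-1]


path = backtrack(best_paths, last_tag_idx=1)  # assume 'tracks' ends as VBZ
decoded = [tags[i] for i in path]
print(list(zip(words, decoded)))
```

Every entry is used as a row index into the tag list, so a reader following the figure by hand needs each stored index to correspond to a tag that is actually shown.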
Cheers,
Elemento