Problems calculating transition matrix A

drew_Frances · January 31, 2022, 3:47pm

I am calculating the values for transition matrix A. My values are orders of magnitude off.

I get:

A at row 0, col 0: 0.021739130, A at row 3, col 1: 5021.7609

expected output:

A at row 0, col 0: 0.000007040, A at row 3, col 1: 0.1691

my equation is:

A[i,j] = (count + alpha) / (count_prev_tag + (alpha * num_tags))

All the code leading up to here works. I am stuck and not sure what I am doing wrong. It must be something obvious.

P.S. Despite my changes, the values don’t seem to change when I re-run the example.

arvyzukai · February 18, 2022, 1:46pm

Hi.

There is nothing wrong with the equation you gave, so one or more of the variables inside it must be calculated incorrectly. When you get stuck, I would suggest using some print() statements (or breakpoint(), if you know what you are doing) to check whether those variables hold the right values. In other words, calculate a few tag transitions by hand and see where the mistake is.

For example, if you add these print() statements:

print(f'i:{i}, j:{j}, count:{count}, alpha:{alpha}')
print(f'count_prev_tag: {count_prev_tag}, num_tags: {num_tags}')

you should get:

i:0, j:0, count:0, alpha:0.001
count_prev_tag: 142, num_tags: 46

To check manually, you would calculate:

(0 + 0.001) / (142 + 0.001*46) = 0.000007040

Expected Output:
A at row 0, col 0: 0.000007040
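
The same manual check can be scripted. This is only a minimal sketch of the smoothing formula from the question, using the values printed above; the function name `smoothed_transition` is made up for illustration and is not part of the assignment.

```python
def smoothed_transition(count, count_prev_tag, alpha, num_tags):
    """Additive-smoothed estimate (hypothetical helper, not assignment code)."""
    return (count + alpha) / (count_prev_tag + alpha * num_tags)


# Values taken from the print() output above for i=0, j=0.
p = smoothed_transition(count=0, count_prev_tag=142, alpha=0.001, num_tags=46)
print(f"A at row 0, col 0: {p:.9f}")  # prints: A at row 0, col 0: 0.000007040
```

If the printed number differs from the expected output, compare each argument against a hand count for that pair of tags.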

If you are curious which tags this value belongs to, you could run:
trans_df.iloc[0:2, 0:2]

and get:

| |# |$|
|---|---|---|
|# |7.039973e-06 |7.039973e-06|
|$ |1.356476e-07 |1.356476e-07|

So the chance of tag ‘#’ transitioning to tag ‘#’ is 0.000007040. (P.S. This is a somewhat bad example, because ‘#’ has exactly this same probability of transitioning to most tags, while its probability of transitioning to the ‘CD’ tag is 0.992643, but I think you get the idea.)

drew_Frances · February 21, 2022, 12:07am

Arvyzukai, thanks! I found the problem. I was using the wrong key to get count_prev_tag! Because the dictionary is a defaultdict, the wrong key silently returned zero instead of raising an error, so I didn’t think twice!
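
For anyone hitting the same thing, here is a small sketch of how a wrong key on a defaultdict can produce the 0.021739130 from my first post. The dictionary name `tag_counts` and the key shapes are assumptions for illustration, not the assignment's actual variables.

```python
from collections import defaultdict

# Hypothetical tag counts; 142 is the count_prev_tag value printed above for row 0.
tag_counts = defaultdict(int, {"#": 142})
alpha, num_tags, count = 0.001, 46, 0

right = tag_counts["#"]          # 142
wrong = tag_counts[("#", "#")]   # wrong key: no KeyError, silently returns 0

p_right = (count + alpha) / (right + alpha * num_tags)
p_wrong = (count + alpha) / (wrong + alpha * num_tags)

print(f"{p_right:.9f}")  # prints: 0.000007040
print(f"{p_wrong:.9f}")  # prints: 0.021739130

# One way to catch the mistake early: check membership in a plain dict copy.
plain_counts = {"#": 142}
assert ("#", "#") not in plain_counts
```

With a plain dict, the wrong lookup would have raised a KeyError straight away instead of quietly returning zero.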

Cheers,
Andrew

Daniel_Hecko · April 21, 2022, 8:27am

Thank you for both the question and an answer, it helped a lot
