[Week 4] Exercise 1 - get_angles

I get the impression that I am interpreting the function wrong. A hint would be appreciated.

Running the unit test, I get stuck with “Submatrices of odd and even columns must be equal”. I cannot understand why that would be the case. As far as I understand, the angle is pos/(10000^((2i)/d)):

Since the shape is meant to be (4, 16), columns should refer to increasing values of i, which increase the power of 10000 and decrease the overall values, so the columns should not be equal to one another. Even columns should refer to even i and vice versa for odd columns. Therefore even columns should contain fewer digits after the decimal point than odd ones. E.g., for i = 2, I get the vector (0, 0.1, 0.2, 0.3); for i = 1, I get (0, 0.316…, 0.632…, 0.948…)…

I am even more puzzled when I ignore that test and look at this one:

limit = (position - 1) / np.power(10000,14.0/16.0)
assert np.isclose(result[position - 1, d_model -1], limit ), f"Last value must be {limit}"

This calculation suggests to me a function of the form pos/10000^((i-1)/d), since my last value for position is 3 and for i is 15. With my formula, the last value of my calculated array would instead correspond to: (position - 1) / np.power(10000,(15.0*2)/16.0)

However, changing the function to the one deduced from the second test, pos/10000^((i-1)/d), does not fix the equal-column problem either…

So I am a bit in limbo here.

Same here. I’m not sure that the following assertion should be true:
assert np.all(even_cols == odd_cols), "Submatrices of odd and even columns must be equal"
Furthermore, there is the strange value of 14 instead of 15 for the last column of a 4x16 matrix in the following assertion value:
limit = (position - 1) / np.power(10000,14.0/16.0)

I’ve calculated the angle values in a spreadsheet using the formula

angle = pos / (10000**(2*i/d))

for the unit test where d = 16, i = 0 to 15 and pos = 0 to 3, and got exactly the same result as in the notebook:

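| pos \ i | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 0.316227766 | 0.1 | 0.031622777 | 0.01 | 0.003162278 | 0.001 | 0.000316228 | 0.0001 | 3.16228E-05 | 1E-05 | 3.16228E-06 | 1E-06 | 3.16228E-07 | 1E-07 | 3.16228E-08 |
| 2 | 2 | 0.632455532 | 0.2 | 0.063245553 | 0.02 | 0.006324555 | 0.002 | 0.000632456 | 0.0002 | 6.32456E-05 | 0.00002 | 6.32456E-06 | 0.000002 | 6.32456E-07 | 0.0000002 | 6.32456E-08 |
| 3 | 3 | 0.948683298 | 0.3 | 0.09486833 | 0.03 | 0.009486833 | 0.003 | 0.000948683 | 0.0003 | 9.48683E-05 | 3E-05 | 9.48683E-06 | 0.000003 | 9.48683E-07 | 0.0000003 | 9.48683E-08 |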
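
The spreadsheet values can be reproduced in NumPy as a quick check. This is only a sketch of the literal reading of the formula used here (the column index taken directly as i); the shapes of pos and i are an assumption about how the unit test builds them. Under this reading the even and odd columns really do differ, which is what the assertion trips on.

```python
import numpy as np

d = 16
pos = np.arange(4)[:, np.newaxis]   # positions 0..3 as a column, shape (4, 1)
i = np.arange(d)[np.newaxis, :]     # column indices 0..15 as a row, shape (1, 16)

# Literal reading of the formula: the column index is used as i directly.
angles = pos / (10000 ** (2 * i / d))

even_cols = angles[:, 0::2]
odd_cols = angles[:, 1::2]
print(angles.shape)                     # (4, 16)
print(np.all(even_cols == odd_cols))    # False
```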

What am I missing?

I found this link which might be useful:

@ashkanbj it worked using the integer division. But it doesn’t make any sense to me! :man_shrugging: :roll_eyes:

Update:
Oh! Got it now :grin:
It’s because these two values should share the same angle:

PE(pos, 2i)
PE(pos, 2i+1)
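
A minimal sketch of why floor division makes each (even, odd) column pair share one angle. It assumes pos is a column vector of positions and i is a row vector of column indices, as the unit test appears to use; the name get_angles_sketch is hypothetical and this is not presented as the notebook's reference solution.

```python
import numpy as np

def get_angles_sketch(pos, i, d):
    """Angles for positional encoding; columns 2k and 2k+1 share the exponent 2k/d."""
    # i // 2 maps column indices 0, 1, 2, 3, ... to 0, 0, 1, 1, ...,
    # so each (even, odd) pair of columns gets the same exponent.
    exponent = 2 * (i // 2) / np.float64(d)
    return pos / np.power(10000.0, exponent)

pos = np.arange(4)[:, np.newaxis]   # shape (4, 1): positions 0..3
i = np.arange(16)[np.newaxis, :]    # shape (1, 16): column indices 0..15
angles = get_angles_sketch(pos, i, 16)

even_cols = angles[:, 0::2]
odd_cols = angles[:, 1::2]
print(angles.shape)                     # (4, 16)
print(np.all(even_cols == odd_cols))    # True
```

The sine is then applied to the even columns and the cosine to the odd ones, which is why both members of a pair need the same angle.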

Thank you for that! Floor division is a smart way to combine those two functions. I would have tried to work that out later.

Hi,

I have a problem. I use the following code:
angles = pos/(10000**(2*i//d))
However it says the last value should be:
0.0009486832980505137
I don’t know what I should do. Can anyone help?

Try representing the “exponent” term as float and see if it helps.

It didn’t work. :frowning:

angles = pos/(10000**float((2*i)//d))

Yes. But now you have a different error :grin:.
Think about it for a minute. I’ll give you a hint: you can’t use Python’s built-in float to cast an np.array that holds more than one element. You are still going to use it, but the question now is where. Hoping that I’m nudging you in the right direction.
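
A small illustration of the hint, assuming i is a NumPy array of column indices and d is a plain Python integer (the values are made up for the example). Python's float() only converts a single value, so it fails on an array with several elements, while casting the scalar d works. In Python 3 the `/` operator already performs true division, so the cast is mostly there for explicitness.

```python
import numpy as np

i = np.arange(16)[np.newaxis, :]   # column indices, shape (1, 16)
d = 16

# float() expects a single value, so a multi-element array is rejected.
try:
    float(i)
    cast_failed = False
except TypeError:
    cast_failed = True
print(cast_failed)          # True

# Casting the scalar d instead keeps the exponent fractional.
exponent = 2 * (i // 2) / float(d)
print(exponent.dtype)       # float64
print(exponent[0, -1])      # 0.875
```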

Thanks for the hint.
I’ll keep working on it! :wink:

I got this to work using the hints given in this thread, but it seems like what you have to do does not match up with what is given in the assignment prompts and lecture. Am I missing something?

Not really. This particular detail was left out for you to figure out. Nothing here contradicts the lectures or the assignment instructions.

Hello everyone,
I am also struggling with this exercise. I don’t understand why and how odd and even columns would be the same.
If you use floor division, you end up with an exponent equal to either 0 or 1. Why would this be useful?
In addition, I don’t understand the computation of the exponent of the limit in the test:
limit = (position - 1) / np.power(10000, 14.0/16.0)
Why 14.0? From what I understood it should be 14 * 2 = 28. And if we really have to use floor division, why do we not use it here?

Hi Maxime,

The get_angles() function calculates the angle, rather than the PE (positional encoding) itself. Take a look at the equations:
[image: the positional encoding equations for PE(pos, 2i) and PE(pos, 2i+1), with the shared angle term in a red box]
Each (even, odd) pair of columns, i.e., (0,1), (2,3), (4,5), …, (2i, 2i+1) for i = 0, 1, 2, …, shares the same angle (red box).

Regarding the limit (the value at the last position and last dimension): the test case generates positions from 0 to (position - 1) and dimensions from 0 to (d_model - 1), so the last value is at position (position - 1) = 3 and column (d_model - 1) = 15. In the odd equation, (pos, 2i+1) = (3, 15), so i = 7 and the angle is 3/np.power(10000, 2*7/16). Equivalently, you can compute i = (d_model - 1)//2 = 7 with floor division.
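
The arithmetic above can be checked with a short sketch. The values position = 4 and d_model = 16 are taken from the unit test discussed in this thread; the comparison with `(2 * cols) // d_model` is added here only to illustrate where an exponent of 0 or 1 would come from.

```python
import numpy as np

position, d_model = 4, 16

# The last column index is d_model - 1 = 15 = 2*7 + 1, so i = 7.
last_col = d_model - 1
i = last_col // 2
limit = (position - 1) / np.power(10000, 2 * i / d_model)

print(i)                                                    # 7
print(np.isclose(limit, 3 / np.power(10000, 14.0 / 16.0)))  # True

# Flooring after dividing by d_model leaves only the exponents 0 and 1,
# whereas flooring the column index first gives the pairs 0, 0, 2, 2, ..., 14, 14.
cols = np.arange(d_model)
print(np.unique((2 * cols) // d_model).tolist())   # [0, 1]
print(np.unique(2 * (cols // 2)).tolist())         # [0, 2, 4, 6, 8, 10, 12, 14]
```

So the 14.0 in the test is 2 * 7, with the floor division already applied to the column index 15 rather than to the whole exponent.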

Hi everyone.
I am stuck with the same problem here. Is there anything I can do?

I finally got something that worked with the grader by trial and error, but this is extremely poorly explained. The expression stated for calculating the angle is not mathematically correct. One thing these courses need to do a much better job of is clearly defining when something is a matrix multiplication, element-wise multiplication, or something else.

Can you explain more about how you managed to solve the issue?

Try using (i//2) instead of i/2.

Hi. Thanks for the tip, however, when I used this, I got another error:

AssertionError: Last value must be 0.0009486832980505137

Do you have any idea about this one?

Hi Fady, could you explain a bit more on the hint? thanks