Are the formulas really correct, and is the expected output itself correct?
I tried different approaches (using np.linalg.norm, and computing everything without that function), but I still can't get the expected output.
Although the absolute values are practically identical, my results after equalizing do not match the "expected output" (the values before equalizing do match it).
I am also getting
cosine similarities after equalizing:
cosine_similarity(e1, gender) = -0.7004364289309388
cosine_similarity(e2, gender) = 0.7004364289309388
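For reference, here is a minimal sketch of the cosine similarity function being used above (my own NumPy version, which may differ in small details from the notebook's implementation):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: dot(u, v) / (||u|| * ||v||)."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Sanity checks on toy vectors (not the assignment's GloVe embeddings):
print(cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0])))  # orthogonal -> 0.0
print(cosine_similarity(np.array([1.0, 2.0]), np.array([2.0, 4.0])))  # parallel -> 1.0 (up to floating point)
```

Note that cosine similarity is invariant to the lengths of the two vectors, so equal-magnitude, opposite-sign values like the ones above mean the two equalized embeddings make equal and opposite angles with the gender axis.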
I am also having trouble with this assignment, but in a couple of different ways. First, when I implement the equations, I get a calculation error: the result is NaN. The problem appears to come from sqrt(1 - |mu_orth|^2).
I double- and triple-checked my calculation of mu_orth and cannot find a mistake. Furthermore, I don't see how the equations could be correct: |mu_orth| can sometimes be greater than 1, which would make the square root imaginary (or, here, undefined: NaN). I looked at the reference paper and, I think, the vectors there were normalized (i.e., of length 1), which would guarantee that |mu_orth| is less than or equal to one.
To avoid taking the square root of a negative number, I replaced the 1 with |mu|^2. I didn't get the expected results, but I did get equal-magnitude, opposite-sign values.
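To illustrate the failure mode: NumPy does not raise an error on the square root of a negative real number, it emits a RuntimeWarning and returns NaN, which then silently propagates through the rest of the calculation. A tiny demonstration (the 1.44 is a made-up value standing in for a case where |mu_orth| = 1.2 > 1):

```python
import numpy as np

mu_orth_norm_sq = 1.44  # hypothetical ||mu_orth||^2 > 1
val = np.sqrt(1 - mu_orth_norm_sq)  # sqrt of a negative number
print(np.isnan(val))  # True: NaN, not a complex number or an exception
```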
I believe that the formula in the notebook is sqrt(|1-norm^2(mu_orth)|), i.e. the absolute value is applied to 1-norm^2(mu_orth).
However, I am not sure how this formula was derived.
Ahhh. I see that now. Thanks. I am now getting what everyone else in this thread calculated. Much appreciated.
However, looking at the reference paper, I don’t see how that absolute value can possibly be correct. The language is a little different (more general but also more complicated) in the paper, but the equivalent equation in the paper doesn’t have an absolute value. I suspect someone was having my problem and “fixed” it without understanding the underlying theory.
I believe you're right. I've read the paragraph in the article where the formulae are presented, and I understand it as follows.
The new "equalized" embedding is expressed as the sum of two orthogonal vectors: mu_orth (which is v in the article) and (w_b - mu_b). The new embedding is scaled so that it has unit length. To achieve this, the second term (w_b - mu_b) is multiplied by a coefficient k, which we can derive. We start with e_new = mu_orth + k*(w_b - mu_b).
Since the two terms are orthogonal, the norm of the sum equals sqrt(|mu_orth|^2 + k^2*|w_b - mu_b|^2). For this expression to equal 1, we have to solve the equation |mu_orth|^2 + k^2*|w_b - mu_b|^2 = 1 for k. From here we get k = sqrt(1 - |mu_orth|^2) / |w_b - mu_b|, which is what we see in the formula. This only works if |mu_orth|^2 is less than 1. That is probably true in the case of the article, because their original embeddings are unit vectors (see page 3).
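The derivation above can be checked numerically on toy vectors (my own example, not the assignment's data): u plays the role of mu_orth, d the role of (w_b - mu_b), with u orthogonal to d and |u| < 1 so the square root stays real.

```python
import numpy as np

u = np.array([0.6, 0.0])   # stands in for mu_orth; ||u|| = 0.6 < 1
d = np.array([0.0, 2.0])   # orthogonal direction, arbitrary length

# k from the derivation: k = sqrt(1 - ||u||^2) / ||d||
k = np.sqrt(1 - np.linalg.norm(u) ** 2) / np.linalg.norm(d)
e_new = u + k * d

print(np.linalg.norm(e_new))  # 1.0, i.e. a unit vector, as derived
```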
However, in our homework exercise the norm of the embeddings is greater than one. I checked this for np.linalg.norm(word_to_vec_map["man"]). If |mu_orth| > 1, it's impossible to get a unit vector by adding another orthogonal vector to mu_orth. I don't see where the absolute value under the square root sign is coming from. It looks like formulae (9) and (10) are not correct after all.
Thank you, this helps me understand where the sqrt(1 - |mu_orth|^2) comes from!
Actually, the simplification then works out like this:
e_w1B = e_w1 - mu_orth, and corrected_e_w1B = k * (e_w1B - mu_B), which is the given formula: corrected_e_w1B = sqrt(abs(1 - norm(mu_orth)^2)) / norm(e_w1B - mu_B) * (e_w1B - mu_B)
And it gives me the same as the others:
cosine similarities after equalizing:
cosine_similarity(e1, gender) = -0.7004364289309387
cosine_similarity(e2, gender) = 0.7004364289309387
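Putting the pieces of this thread together, here is a minimal sketch of an equalize step that uses the abs() under the square root, run on made-up 2-D vectors rather than the assignment's GloVe embeddings (so the numbers differ, but the equal-magnitude, opposite-sign property still shows up):

```python
import numpy as np

def cosine_similarity(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def equalize(e_w1, e_w2, bias_axis):
    """Equalize two embeddings w.r.t. bias_axis, with abs() under the
    sqrt so it stays real even when ||mu_orth|| > 1 (the case debated
    in this thread)."""
    mu = (e_w1 + e_w2) / 2
    mu_B = np.dot(mu, bias_axis) / np.sum(bias_axis ** 2) * bias_axis
    mu_orth = mu - mu_B

    # Projections of each embedding onto the bias axis.
    e_w1B = np.dot(e_w1, bias_axis) / np.sum(bias_axis ** 2) * bias_axis
    e_w2B = np.dot(e_w2, bias_axis) / np.sum(bias_axis ** 2) * bias_axis

    scale = np.sqrt(np.abs(1 - np.linalg.norm(mu_orth) ** 2))
    corrected_e_w1B = scale * (e_w1B - mu_B) / np.linalg.norm(e_w1B - mu_B)
    corrected_e_w2B = scale * (e_w2B - mu_B) / np.linalg.norm(e_w2B - mu_B)

    return corrected_e_w1B + mu_orth, corrected_e_w2B + mu_orth

# Toy 2-D vectors (illustrative only; mu_orth here has norm > 1):
g = np.array([1.0, 0.0])  # bias axis
e1, e2 = np.array([-0.5, 2.0]), np.array([0.9, 2.2])
e1_eq, e2_eq = equalize(e1, e2, g)
print(cosine_similarity(e1_eq, g), cosine_similarity(e2_eq, g))
# equal magnitude, opposite sign
```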
As an old learner, I remember there was a post addressing this problem in the old Discussion Forum, and it corrected the equations provided in the exercise instructions.
Unfortunately, I cannot find that post because all the old posts seem to be gone. But if you can find those old posts, maybe they will help.
Thank you for letting us know about this. However, I have no access to the original version of the Discussion Forum. I am sure the staff team will provide more information soon; let's wait for their answer.