I looked at equations (9) and (10), and the terms highlighted in red are why I am getting NaN values in the output: there is no real square root of a negative number, and for that particular term, with the values given in the test, we always get a negative value.

The norm of u1 is not a negative number.
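To see why the norm itself can never be the source of the NaN, here is a minimal sketch (the vector `u1` is just an illustrative example): a 2-norm is non-negative by construction, so a NaN can only appear when `np.sqrt` is handed a negative intermediate produced by a sign error elsewhere.

```python
import numpy as np

u1 = np.array([-3.0, 4.0])

# A 2-norm is always non-negative, even when the vector has negative entries.
print(np.linalg.norm(u1))  # 5.0

# NaN only appears if np.sqrt is applied to a negative scalar,
# e.g. the result of a sign error earlier in the computation.
with np.errstate(invalid="ignore"):
    print(np.sqrt(-1.0))  # nan
```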

I fixed the problem and got the following results:

**cosine similarities before equalizing:**

**cosine_similarity(word_to_vec_map["man"], gender) = -0.11711095765336832**

**cosine_similarity(word_to_vec_map["woman"], gender) = 0.35666618846270376**

**cosine similarities after equalizing:**

**cosine_similarity(e1, gender) = -0.008596011598445372**

**cosine_similarity(e2, gender) = 0.007876829209969715**

Are these results correct?

They show you the expected values and they don’t look like yours after equalizing. Here’s what I got:

```
cosine similarities before equalizing:
cosine_similarity(word_to_vec_map["man"], gender) = -0.1171109576533683
cosine_similarity(word_to_vec_map["woman"], gender) = 0.35666618846270376
cosine similarities after equalizing:
cosine_similarity(e1, gender) = -0.23871136142883795
cosine_similarity(e2, gender) = 0.23871136142883792
```

That agrees with the expected values that they show. Please have another careful look at your code vs the formulas that they wrote out for us.

One thing to be careful of is the notation there for the norms. They like to use the “sub 2” everywhere to indicate that these are “2-norms”. But notice that sometimes they also square the norms. If it says this:

||v||_2

then that is just the plain 2-norm, given by `np.linalg.norm(v)`. But if they say this:

||w||_2^2

then it is the *square* of the 2-norm of w. So you have to observe carefully which is being asked for.
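In NumPy terms, the two notations map to the following (the vector `v` here is just a made-up example for illustration):

```python
import numpy as np

v = np.array([1.0, 2.0, 2.0])

norm = np.linalg.norm(v)           # ||v||_2    -> 3.0
norm_sq = np.linalg.norm(v) ** 2   # ||v||_2^2  -> 9.0
print(norm, norm_sq)
```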

One other implementation note about computing the square of a 2-norm. You can do that by writing

`np.square(np.linalg.norm(w))`

but that is inefficient: you compute the sum of the squares, take its square root, and then square the result. Taking square roots is computationally expensive, and that work is completely wasted if what you really wanted was the square of the norm. The more efficient way to compute it is:

`np.sum(np.square(w))`

So you just square every element and sum them up, skipping the square root entirely.
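A quick sanity check that the two expressions agree (the vector `w` here is a random example, not from the assignment):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(50)

slow = np.square(np.linalg.norm(w))  # sqrt, then square: wasted work
fast = np.sum(np.square(w))          # sum of squares directly

print(np.isclose(slow, fast))  # True
```

Another common idiom for the same quantity is `np.dot(w, w)`, which also computes the sum of squares without any square root.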