'g' in neutralize() week 2 assignment

Nidhi_Sachdev · July 24, 2022, 2:21am

Hi,

In the first programming assignment for week 2, there is an array ‘g’ passed into the function neutralize(), which corresponds to the axis we want to neutralize. Its dimension is (50,). How is this ‘g’ determined?

From what I could understand, word embeddings with 50 ‘features’ would have dimensions (50, vocab_size). And the embedding corresponding to a given ‘feature’ (say gender) would have shape (1, vocab_size). But clearly this is not the case for ‘g’ and I didn’t quite understand why. I would expect (50,) to be the shape for a word in the vocabulary, say, ‘receptionist’…

I could use some help to understand how ‘g’ is computed.

Thank you!
Nidhi

anon57530071 · July 24, 2022, 3:27am

In this assignment, g is calculated in the cell just before neutralize(), as follows.

g = word_to_vec_map['woman'] - word_to_vec_map['man']

Each word vector consists of 50 features, as you wrote, and this is a simple element-wise subtraction. So g has the same dimension as any word vector, i.e., (50,).
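To make the shapes concrete, here is a minimal sketch. It assumes a toy word_to_vec_map filled with random 50-dimensional vectors (a hypothetical stand-in, not the real embeddings from the assignment), and the name embedding_matrix is only for illustration.

```python
import numpy as np

# Hypothetical stand-in for the assignment's word_to_vec_map:
# random 50-dimensional vectors, not the real pretrained embeddings.
rng = np.random.default_rng(0)
words = ["woman", "man", "receptionist"]
word_to_vec_map = {w: rng.standard_normal(50) for w in words}

# Subtracting two word vectors gives another vector of the same shape.
g = word_to_vec_map["woman"] - word_to_vec_map["man"]
print(g.shape)                                # (50,)
print(word_to_vec_map["receptionist"].shape)  # (50,)

# Stacking the word vectors as columns gives the (50, vocab_size) matrix
# from the question; each column is one word, each row is one feature.
embedding_matrix = np.stack([word_to_vec_map[w] for w in words], axis=1)
print(embedding_matrix.shape)                 # (50, 3)
```

So g is not a row of the embedding matrix (one feature across all words). It is a difference of two columns, which is why it has the shape of a single word vector.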

This rests on a few assumptions:

Word embeddings contain gender bias.
Take “man” and “woman” for this exercise. The difference between their vectors is caused mainly by gender, and the features other than gender should be roughly similar.
So, if we subtract the vector for “man” from the vector for “woman”, what remains mostly captures gender. That is g for this exercise.

So, we assume that g can be used to find which words carry gender bias, by computing their cosine similarity with g. “neutralize” then uses this g to remove the gender component from a word vector.
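As a rough sketch of those two steps, assuming the usual approach of projecting a vector onto g and subtracting that projection: the helper names cosine_similarity and neutralize_vector, and the random vectors, are hypothetical and do not reproduce the assignment's own function signatures or data.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def neutralize_vector(e, g):
    """Remove from e its component along the bias direction g."""
    # Projection of e onto g (the part of e that lies along the bias axis).
    e_bias = (np.dot(e, g) / np.dot(g, g)) * g
    # What is left is orthogonal to g.
    return e - e_bias

# Random stand-ins for a bias direction and a word vector (illustration only).
rng = np.random.default_rng(1)
g = rng.standard_normal(50)
e = rng.standard_normal(50)

e_debiased = neutralize_vector(e, g)

print(cosine_similarity(e, g))           # generally non-zero before neutralizing
print(cosine_similarity(e_debiased, g))  # zero up to floating-point error
```

After the projection is subtracted, the vector keeps its shape (50,) but has essentially zero cosine similarity with g.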

You may already understand the whole picture and have just missed the one cell that calculates g, but the above is the full story. Hope this helps.

Nidhi_Sachdev · July 24, 2022, 3:53am

Yes, that clarifies it indeed. I somehow missed the cell computing ‘g’, but even so, your explanation is more helpful than just seeing its computed value. Thanks!
