In the week 2 assignment 1 Gender Debiasing problem, I see we are projecting the word vectors onto a 2D representation, with gender on one axis and the remaining 49 dimensions collapsed onto another, “orthogonal” axis.
How do we know for sure that “orthogonal” holds true? Is this an assumption, or is it somehow guaranteed by the network that trained these vectors? What if this projected axis is intrinsically gender-biased? (And more generally, can we assume orthogonality between any two dimensions in this space, or in a space of any dimension, say 300, and why?)
I’d say that, considering the intricacies of meaning in language, orthogonality is an assumption. Using that assumption can nonetheless take the edge off some clear cases of bias, as discussed in the related lecture by Andrew Ng.
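To make the mechanical part concrete: the orthogonality of the decomposition itself comes from the projection formula, not from the trained embeddings. Here is a minimal NumPy sketch of that neutralize-style step; the variable names and the random stand-in vectors are mine for illustration, not from the notebook, and the 50-dimensional size just mirrors the assignment's GloVe vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.normal(size=50)  # stand-in for a gender direction, e.g. e_woman - e_man
e = rng.normal(size=50)  # stand-in for a word embedding to neutralize

# Project e onto the gender direction g, then subtract that component.
e_bias = (np.dot(e, g) / np.dot(g, g)) * g
e_orth = e - e_bias

# By construction, e_orth is orthogonal to g (up to floating-point error):
print(np.dot(e_orth, g))
```

So the dot product of the residual with the gender direction is (numerically) zero for any vectors whatsoever; the genuine assumption, as noted above, is whether a single direction like `g` actually captures the gender component of meaning.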
He also indicates that this is still an area of active research. The data used to train a model is one obvious factor to look into, which is in line with Andrew Ng’s current focus on data-centric AI.
This answered my question. Thank you a lot for your answer, @reinoudbosch! (And for mentioning data-centric AI, which I had definitely heard of but didn’t know much about.)
You’re welcome, vdong. If you want to read up on data-centric AI, you can have a look at the latest issue of the deeplearning.ai newsletter, The Batch.