Rows or Columns?

EMIN_MAMMADZADA · April 12, 2024, 10:53am

Hey,
In the C1W4 programming assignment we see the comment below:

In my opinion, the word "columns" here causes confusion (I tried transposing the X and Y matrices), because the rows of the X and Y matrices are the embeddings corresponding to specific words. I would like to know whether "columns" is the right way to explain this.

arvyzukai · April 12, 2024, 11:39am

Hi @EMIN_MAMMADZADA

I’m not sure. I would agree that the docstring could be better. X and Y have the shape (word_count x embedding_size), and saying so would be clearer.

Each column holds one embedding feature value (in this case 300 columns = 300 features, so each word is a row vector). As I understand it, the docstring misleads the reader into thinking that each column represents a word with 300 rows (a column vector).

But if you look at the instructions, they are clearer:

Returns:

Matrix X and matrix Y, where each row in X is the word embedding for an english word, and the same row in Y is the word embedding for the French version of that English word.

Use the en_fr dictionary to ensure that the ith row in the X matrix corresponds to the ith row in the Y matrix.
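
As a rough illustration of what those instructions describe, here is a minimal sketch. It is not the assignment's solution: the function name `build_matrices`, the toy 3-dimensional vectors, and the choice to skip pairs with a missing embedding are all assumptions made for this example.

```python
import numpy as np

def build_matrices(en_fr, en_embeddings, fr_embeddings):
    """Stack aligned embeddings so that row i of X and row i of Y
    come from the same English/French word pair (hypothetical helper)."""
    X_rows, Y_rows = [], []
    for en_word, fr_word in en_fr.items():
        # Assumption: keep a pair only if both words have an embedding.
        if en_word in en_embeddings and fr_word in fr_embeddings:
            X_rows.append(en_embeddings[en_word])
            Y_rows.append(fr_embeddings[fr_word])
    # vstack puts each word vector on its own row.
    return np.vstack(X_rows), np.vstack(Y_rows)

# Toy data: 3 features per word instead of 300.
en_fr = {"cat": "chat", "dog": "chien", "owl": "hibou"}
en_embeddings = {
    "cat": np.array([1.0, 0.0, 0.0]),
    "dog": np.array([0.0, 1.0, 0.0]),
    "owl": np.array([0.0, 0.0, 1.0]),
}
fr_embeddings = {
    "chat": np.array([0.9, 0.1, 0.0]),
    "chien": np.array([0.1, 0.9, 0.0]),
    # "hibou" has no embedding, so the "owl" pair is dropped.
}

X, Y = build_matrices(en_fr, en_embeddings, fr_embeddings)
print(X.shape, Y.shape)  # (2, 3) (2, 3)
```

With this layout the result has shape (word_count, embedding_size), and `X[i]` and `Y[i]` always refer to the same word pair.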

Cheers

P.S. Maybe native English speakers would clarify this?

paulinpaloalto · April 12, 2024, 2:54pm

I totally agree that the docstring is confusing and arguably just wrong. To confirm, I added print statements after the unit test cell and here’s what I get:

X_train.shape (4932, 300)
Y_train.shape (4932, 300)

So of the 5000 word pairs in the English to French dictionary, only 4932 of them have embeddings in both languages.
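
For anyone who wants to see what those shapes imply, here is a small sketch of the same kind of check. The arrays below are random stand-ins with the reported shapes, not the real assignment data, and the variable names other than `X_train` and `Y_train` are made up for illustration.

```python
import numpy as np

# Stand-in arrays with the shapes printed above; the values are random placeholders.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(4932, 300))
Y_train = rng.normal(size=(4932, 300))

n_words, n_features = X_train.shape
print("X_train.shape", X_train.shape)
print("Y_train.shape", Y_train.shape)

# One row is the full embedding of one word.
first_word_vector = X_train[0]
# One column is a single feature collected across all words.
first_feature = X_train[:, 0]

print(first_word_vector.shape)  # (300,)
print(first_feature.shape)      # (4932,)
```

The first axis counts words and the second counts embedding features, which is why a row, not a column, is a word embedding.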

I think I have access to the git repo for NLP, so I will file a bug about this.

Update: actually there’s another problem with that docstring as well: it mentions R as a return value, but that is no longer part of this function.

jyadav202 · April 12, 2024, 3:48pm

Hi @EMIN_MAMMADZADA ,

The descriptions of X and Y are correct. A (2D) matrix in our case has rows that correspond to the En/Fr words (data points) and columns that correspond to the embedding features.

In ML, the rows of a matrix usually represent the data points and the columns represent their features.
It might help to think in terms of another use case, such as classifying fruits based on features like shape, colour, odor, etc. The rows will be fruit A, B, … and the columns will be shape, colour, odor, …
Similarly, in our case, a data point is a word (either En or Fr) and its features are the values of its embedding (there are 300 features). So each row is a word, and the columns hold the features of its embedding.
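
To make the fruit analogy concrete, here is a tiny sketch. The numeric values are invented encodings used only to show the indexing, and the variable names are hypothetical.

```python
import numpy as np

fruit_names = ["fruit A", "fruit B"]          # one row per data point
feature_names = ["shape", "colour", "odor"]   # one column per feature

# Invented numeric encodings, purely for illustration.
fruits = np.array([
    [0.2, 0.9, 0.1],   # fruit A
    [0.7, 0.3, 0.5],   # fruit B
])

fruit_b = fruits[1]       # every feature of fruit B (a row)
colour = fruits[:, 1]     # the colour of every fruit (a column)

# Transposing gives the other convention: features in rows, data points in columns.
as_columns = fruits.T

print(fruits.shape)      # (2, 3)
print(as_columns.shape)  # (3, 2)
```

Replace the fruits with words and the three features with 300 embedding values and you get the same layout as X and Y.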

@paulinpaloalto the ‘R’ should be removed though
