Matrix layout in TensorFlow

In the matrix multiplication code of Week 1,

Andrew said:
“There is a convention in TensorFlow that individual examples are actually laid out in rows in the matrix X rather than in the matrix X transpose which is why the code implementation actually looks like this in TensorFlow.”

Does this mean that w should be transposed?

I am not quite getting it…

ChatGPT says it's because in math we index by the x-axis (the column value, n) first and then the y-axis (the row value, m), while in TensorFlow we always use [m, n] instead of [n, m].

Just to be clear, the red line was drawn by me, not Andrew.

There is little convention on the orientation of the weight matrix.

What exactly is your question?

I recommend you not trust a chat tool to give truthful answers.

Why are we using w = [[1, -3, 5], [-2, 4, -6]] in TensorFlow? Don't we have 2 x features, a = [[200, 17]]? So shouldn't w be in the form w = [[1, -2], [-3, 4], [5, -6]]?

Also, I am not getting what Andrew is saying here: "There is a convention in TensorFlow that individual examples are actually laid out in rows in the matrix X rather than in the matrix X transpose which is why the code implementation actually looks like this in TensorFlow." What is the convention? Can you give me an example? Thanks.

"Why are we using w = [[1, -3, 5], [-2, 4, -6]] in TensorFlow? Don't we have 2 x features, a = [[200, 17]]? So shouldn't w be in the form w = [[1, -2], [-3, 4], [5, -6]]?"

Regarding my question above: is it because we are using matmul here, so the shapes have to match, i.e. a.shape should be 1 x 2 and w.shape needs to be 2 x 3 to satisfy the matrix multiplication rule? (Sorry, I'm not good at math...) And if we are not using matmul but a for loop instead, then the shape w = [[1, -2], [-3, 4], [5, -6]] is fine, right?

for m in range(w.shape[0]):
    g = sigmoid(np.dot(a, w[m]) + b)

TensorFlow likes the input data to be organized as size(m, n), where ‘m’ is the number of examples, and ‘n’ is the number of features.

Ideally, the weight matrix would be defined so that you can use np.matmul() without any transpositions. But there is really no universal convention; it will vary depending on who designed the model you're working on.

So, sometimes you’ll need to transpose, and sometimes you won’t. You need to look at the dimensions of the data matrix and the weight matrix, and act accordingly.
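To make the shape rule concrete with the numbers from this thread (a sketch, not the course's exact code):

```python
import numpy as np

a = np.array([[200, 17]])                    # shape (1, 2): 1 example, 2 features
w = np.array([[1, -3, 5], [-2, 4, -6]])      # shape (2, 3): works directly
w_t = np.array([[1, -2], [-3, 4], [5, -6]])  # shape (3, 2): the other layout

z1 = np.matmul(a, w)       # (1, 2) @ (2, 3) -> (1, 3): inner dims match
z2 = np.matmul(a, w_t.T)   # the transposed layout needs a .T first
```

Both calls produce the same (1, 3) result; the only question is whether the transpose is needed, which is exactly the "look at the dimensions and act accordingly" step.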


If you know that the weight matrix is a vector, then you can use np.dot() instead of np.matmul().

The drawback to np.matmul() is that it does not allow for scalar operands.
