Andrew said:
“There is a convention in TensorFlow that individual examples are actually laid out in rows in the matrix X rather than in the matrix X transpose which is why the code implementation actually looks like this in TensorFlow.”

ChatGPT is saying it's because in math we use the x-axis (the column index, n) first and then the y-axis (the row index, m), whereas in TensorFlow we always write [m, n] instead of [n, m].

Why are we using w = [[1, -3, 5], [-2, 4, -6]] in TensorFlow? Don't we have 2 features, a = [[200, 17]]? So shouldn't w be in the form w = [[1, -2], [-3, 4], [5, -6]]?

Also, I'm still trying to understand what Andrew is saying here: "There is a convention in TensorFlow that individual examples are actually laid out in rows in the matrix X rather than in the matrix X transpose which is why the code implementation actually looks like this in TensorFlow." What is the convention? Can you give me an example? Thanks!

"Why are we using w = [[1, -3, 5], [-2, 4, -6]] in TensorFlow? Don't we have 2 features, a = [[200, 17]]? So shouldn't w be in the form w = [[1, -2], [-3, 4], [5, -6]]?"

Regarding my question above, is it because we are using matmul here? So the shapes have to match: a.shape should be 1 × 2 and w.shape needs to be 2 × 3 to satisfy the matrix multiplication rule? (Sorry, not good at math…) And if we don't use matmul but a for loop instead, then the shape w = [[1, -2], [-3, 4], [5, -6]] is fine, right?

for m in range(w.shape[0]):              # w has shape (3, 2): one row per unit
    g = sigmoid(np.dot(a, w[m]) + b[m])  # b[m] is that unit's bias
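As a sanity check, here is a minimal NumPy sketch comparing the two layouts: the (2, 3) weight matrix used with matmul, and the (3, 2) one-row-per-unit layout used with a loop. The bias values are placeholders I made up for illustration; they are not from the lecture.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

a = np.array([[200, 17]])       # one example, two features: shape (1, 2)
W = np.array([[1, -3, 5],
              [-2, 4, -6]])     # shape (2, 3): one column per unit
b = np.array([-1, 1, 2])        # placeholder biases, one per unit

# matmul version: (1, 2) @ (2, 3) -> (1, 3), all three units at once
out_matmul = sigmoid(np.matmul(a, W) + b)

# loop version: weights stored one row per unit, shape (3, 2)
w_rows = np.array([[1, -2],
                   [-3, 4],
                   [5, -6]])
out_loop = np.array(
    [sigmoid(np.dot(a, w_rows[m]) + b[m]) for m in range(w_rows.shape[0])]
).reshape(1, 3)
```

Both versions compute the same three activations; the only difference is whether a unit's weights sit in a column (matmul layout) or in a row (loop layout).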

TensorFlow expects the input data to be organized with shape (m, n), where 'm' is the number of examples and 'n' is the number of features.
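That convention is what makes batching work with a single matmul. Here is a small sketch: the first example is the one from your question, and the second row is a made-up example added just to show a batch of m = 2.

```python
import numpy as np

# Two examples stacked as rows: shape (m, n) = (2, 2)
X = np.array([[200, 17],    # example from the question
              [120, 5]])    # made-up second example
W = np.array([[1, -3, 5],
              [-2, 4, -6]]) # shape (n, units) = (2, 3)

# One matmul handles every example at once: (2, 2) @ (2, 3) -> (2, 3)
Z = np.matmul(X, W)
```

Each row of Z holds the three unit outputs for one example, which is exactly why examples go in rows rather than columns.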

Ideally, the weight matrix would be defined so that you can use np.matmul() without any transpositions. But there is really no universal convention - it will vary depending on who designed the model you’re working on.

So, sometimes you’ll need to transpose, and sometimes you won’t. You need to look at the dimensions of the data matrix and the weight matrix, and act accordingly.
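For instance, if a model happens to store the weights one unit per row, shape (3, 2), you transpose before multiplying. A minimal sketch:

```python
import numpy as np

a = np.array([[200, 17]])       # shape (1, 2)

# Weights stored one unit per row: shape (3, 2)
W_by_unit = np.array([[1, -2],
                      [-3, 4],
                      [5, -6]])

# (1, 2) and (3, 2) don't line up for matmul,
# so transpose the weights to (2, 3) first:
z = np.matmul(a, W_by_unit.T)   # shape (1, 3)
```

With the weights already stored as (2, 3), the same line would be np.matmul(a, W) with no transpose, so you always check the shapes first.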