Weight matrix dimension in TensorFlow

Hi,

When I looked at the weight matrix dimensions in TensorFlow, it looks like they follow [n_l-1, n_l], where n_l-1 is the input layer dimension and n_l is the hidden layer dimension.

From the lectures, I was expecting it to be [n_l, n_l-1]. So is TensorFlow taking the transpose of the weights during calculations, or doing elementwise multiplication with the input? Or am I missing something?

I tried manually taking the dot product between the inputs and the initialized weight matrix and matching the result against the model prediction; they are the same. I am still somewhat confused by the dimensions in TensorFlow. From the initial lectures I thought the input dimension is (n_x, m), but in TensorFlow it seems to be (m, n_x), where n_x is the input length (here 4) and m is the number of examples.
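For context, the check was roughly along these lines (the layer size and data here are just illustrative):

import numpy as np
import tensorflow as tf

# Illustrative single Dense layer: n_x = 4 input features, 3 units
model = tf.keras.Sequential([tf.keras.layers.Dense(3, input_shape=(4,))])

x = np.random.randn(5, 4).astype(np.float32)  # (m, n_x) = (5, 4)
W, b = model.weights                          # W: (4, 3), b: (3,)

manual = x @ W.numpy() + b.numpy()            # manual dot product x · W + b
print(np.allclose(manual, model(x).numpy()))  # True: matches the model prediction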

Thanks

Hey @siddhesh_mane,
It’s all about the convention followed. The lecture videos follow one convention, and TensorFlow follows another, that’s it. You will find that the core concepts remain the same.

In the lecture videos, the input has the dimensions (n_x, m) and the weight matrix has the dimensions (n_y, n_x), where n_y is the number of output neurons. The computation that happens is W · x, and the output has the dimensions (n_y, m).
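Here is a quick NumPy sketch of the lecture convention (the sizes are arbitrary):

import numpy as np

n_x, n_y, m = 4, 2, 10         # input features, output neurons, examples
x = np.random.randn(n_x, m)    # input: (n_x, m)
W = np.random.randn(n_y, n_x)  # weights: (n_y, n_x)
print((W @ x).shape)           # W · x -> (n_y, m) = (2, 10)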

In TensorFlow, the input has the dimensions (m, n_x) and the weight matrix has the dimensions (n_x, n_y), once again where n_y is the number of output neurons. The computation that happens is x · W, and the output has the dimensions (m, n_y).

So, you see, the concept remains the same; only the shapes differ, including the shape of the output.

To verify the last part, feel free to run the piece of code below:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers as tfl

# A single Dense layer with n_y = 2 units on n_x = 4 input features
model = tf.keras.Sequential([
    tfl.Dense(2, input_shape=(4,))
])

print("Shape of the weights:")
print(np.array(model.weights[0]).shape)  # (n_x, n_y) = (4, 2)

x = np.random.randn(10, 4)  # m = 10 examples with n_x = 4 features each
print("Shape of the input:")
print(x.shape)  # (m, n_x) = (10, 4)

y = model(x)
print("Shape of the output:")
print(y.shape)  # (m, n_y) = (10, 2)
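Running this should print (4, 2) for the weights, (10, 4) for the input, and (10, 2) for the output, which matches the (n_x, n_y) convention described above.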

I hope this helps.

Cheers,
Elemento
