C2_W1_Assignment : Understanding the custom dense function

I didn’t quite understand how the custom dense function works. Here’s the function:

The input (a_in) to each neuron has shape (400, 1), and the weight matrix (w) for each neuron is (400, 25). So when we compute z = np.dot(w, a_in), the multiplication should only be possible if the number of columns of the first operand matches the number of rows of the second. That isn’t the case here, and we aren’t transposing the matrix either. Yet the my_dense function still works correctly. How is that happening?
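For context, here is a hedged sketch of what a per-neuron dense function like the one described above might look like (the names my_dense, a_in, W, b, and g come from the thread; the exact assignment code may differ). The key detail is that inside the loop, w = W[:, j] is a 1-D vector of shape (400,), so np.dot(w, a_in) is a plain inner product of two 1-D vectors and no transpose is needed:

```python
import numpy as np

# Hypothetical sketch of a per-neuron dense layer, consistent with the
# shapes discussed in this thread (400 inputs, 25 neurons). This is an
# illustration, not necessarily the exact assignment code.
def my_dense(a_in, W, b, g):
    units = W.shape[1]                # number of neurons, e.g. 25
    a_out = np.zeros(units)
    for j in range(units):
        w = W[:, j]                   # weights of neuron j, shape (400,)
        z = np.dot(w, a_in) + b[j]    # inner product of two 1-D vectors -> scalar
        a_out[j] = g(z)
    return a_out
```

With a_in of shape (400,) and W of shape (400, 25), the output has shape (25,).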

Either a transposition or a change in the order of the operands is needed.
The information on the lecture slides is not necessarily accurate.


Hi @Maverick03, great question!

The confusion here seems to be about the dimensions of the matrices involved in the operation np.dot(w, a_in) within the my_dense function. The operation is indeed possible and correct given the dimensions of the matrices.

In the context of neural networks, a_in is the activation from the previous layer (or the input layer if it is the first hidden layer) and w is the weight matrix for the current layer.

  • a_in has the shape (400, 1), which means it is a column vector with 400 elements.
  • w has the shape (400, 25), which indicates that there are 25 neurons in the current layer, and each neuron has 400 weights corresponding to the 400 inputs.

When performing the operation np.dot(w, a_in), w should be transposed to match the inner dimensions for matrix multiplication. Typically, the weight matrix w would be of the shape (25, 400), so that when it is multiplied by a_in (of shape (400, 1)), the inner dimensions (400) match, and the resulting matrix is of the shape (25, 1), which is the activations of the current layer.

If the my_dense function is working correctly without explicitly transposing the matrix, then it’s likely that the weight matrix w is already defined in the transposed form (25, 400) in the function call. If that’s the case, then np.dot(w, a_in) would indeed give the correct result since the inner dimensions (400) match, resulting in a (25, 1) shape output corresponding to the activations of the 25 neurons in the current layer.

In summary, for the matrix multiplication to work without transposing, the weight matrix w must be defined as (number of neurons, number of inputs) which in your case would be (25, 400) and not (400, 25) as you have mentioned. There might be a misunderstanding in the shape description or the function is using the transposed weight matrix.
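As a quick sanity check of the convention described above (weights stored as (number of neurons, number of inputs)), the following minimal snippet confirms the resulting shapes:

```python
import numpy as np

# If W has shape (25, 400) -- (number of neurons, number of inputs) --
# then W @ a_in with a_in of shape (400, 1) yields the layer's
# activations with shape (25, 1), with no transpose required.
W = np.zeros((25, 400))
a_in = np.zeros((400, 1))
z = np.dot(W, a_in)
print(z.shape)  # (25, 1)
```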

I hope this helps!

It doesn’t seem like the calling function is transposing the matrix:

Here X[0] is of shape (400,1).

I also didn’t get how a1 and a2, which are of shape (1,), were valid inputs to the next my_dense calls, given the same shape-mismatch issue.

I do not think that is true.
a1 must be the size of the 1st hidden layer, which has 25 units.
a2 must be the size of the 2nd hidden layer, which has 15 units.
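To illustrate those shapes (using made-up random weights named W1/W2/b1/b2, purely for this sketch): with layer sizes 400 → 25 → 15, the activations a1 and a2 come out as (25,) and (15,), each a valid 1-D input to the next layer’s dot product:

```python
import numpy as np

# Illustrative shapes only: random weights, hypothetical names.
rng = np.random.default_rng(0)
x  = rng.normal(size=(400,))                  # input, shape (400,)
W1 = rng.normal(size=(400, 25)); b1 = np.zeros(25)
W2 = rng.normal(size=(25, 15));  b2 = np.zeros(15)

a1 = np.dot(x, W1) + b1    # shape (25,), one value per 1st-layer unit
a2 = np.dot(a1, W2) + b2   # shape (15,), one value per 2nd-layer unit
print(a1.shape, a2.shape)  # (25,) (15,)
```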

Hello @Maverick03,

I believe the more fundamental thing is that a_in is assumed to have the shape of (400, ), NOT (400, 1).

Applying np.dot to a (400,) array and a (400, 25) array works without problem, according to the 5th rule in the np.dot documentation: the sum product is taken over the last axis of the first array and the second-to-last axis of the second.
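That rule can be verified in a couple of lines: when the first argument is 1-D of shape (400,) rather than (400, 1), the dot product with a (400, 25) matrix produces a (25,) result directly.

```python
import numpy as np

# Per the np.dot documentation, with a 1-D first argument the sum
# product runs over its last axis and the second-to-last axis of the
# second argument, so (400,) dot (400, 25) -> (25,).
a_in = np.ones(400)          # shape (400,), NOT (400, 1)
W = np.ones((400, 25))
z = np.dot(a_in, W)
print(z.shape)  # (25,)
```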