C2_W1_Assignment : Understanding the custom dense function

Maverick03 · January 24, 2024, 11:41pm

I didn’t quite understand how the custom dense function is working. Here’s the function:

The input (a_in) to each neuron is of shape (400,1), and the weight matrix (w) for each neuron is (400, 25). So when we do z = np.dot(w, a_in), the multiplication should be possible only if column size of first matches row size of second. However, that’s not the case here. And we are also not transposing the matrix. But the my_dense function is still working correctly. So how is that happening?

TMosh · January 24, 2024, 11:51pm

A transposition or change of the order of operands is needed.
The information on the lecture slides is not necessarily accurate.

pastorsoto · January 25, 2024, 12:11am

Hi @Maverick03 great question

The confusion here seems to be about the dimensions of the matrices involved in the operation np.dot(w, a_in) within the my_dense function. The operation is indeed possible and correct given the dimensions of the matrices.

In the context of neural networks, a_in is the activation from the previous layer (or the input layer if it is the first hidden layer) and w is the weight matrix for the current layer.

a_in has the shape (400, 1), which means it is a column vector with 400 elements.
w has the shape (400, 25), which indicates that there are 25 neurons in the current layer, and each neuron has 400 weights corresponding to the 400 inputs.

When performing the operation np.dot(w, a_in), w should be transposed to match the inner dimensions for matrix multiplication. Typically, the weight matrix w would be of the shape (25, 400), so that when it is multiplied by a_in (of shape (400, 1)), the inner dimensions (400) match, and the resulting matrix is of the shape (25, 1), which is the activations of the current layer.

If the my_dense function is working correctly without explicitly transposing the matrix, then it’s likely that the weight matrix w is already defined in the transposed form (25, 400) in the function call. If that’s the case, then np.dot(w, a_in) would indeed give the correct result since the inner dimensions (400) match, resulting in a (25, 1) shape output corresponding to the activations of the 25 neurons in the current layer.

In summary, for the matrix multiplication to work without transposing, the weight matrix w must be defined as (number of neurons, number of inputs) which in your case would be (25, 400) and not (400, 25) as you have mentioned. There might be a misunderstanding in the shape description or the function is using the transposed weight matrix.
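The shape argument above can be checked directly in NumPy. This is only a minimal sketch: the all-ones arrays and the variable names are placeholders made up for illustration, not the assignment's actual weights, and it assumes a_in really is a 2-D column vector of shape (400, 1) as the question states.

```python
import numpy as np

# Placeholder data; only the shapes matter here.
a_in = np.ones((400, 1))             # 2-D column vector, as described in the question
w_units_first = np.ones((25, 400))   # (number of neurons, number of inputs)
w_inputs_first = np.ones((400, 25))  # (number of inputs, number of neurons)

# Inner dimensions match (400 and 400), so this gives one value per neuron.
z = np.dot(w_units_first, a_in)
print(z.shape)

# Inner dimensions do not match (25 vs 400), so NumPy raises ValueError.
try:
    np.dot(w_inputs_first, a_in)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)

# Transposing the (400, 25) matrix restores the match.
z_t = np.dot(w_inputs_first.T, a_in)
print(z_t.shape)
```

So with a true (400, 1) column vector, a (400, 25) weight matrix only works after a transpose or a change in operand order.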

I hope this helps!

Maverick03 · January 25, 2024, 12:40am

It doesn’t seem like the calling function is transposing the matrix:

Here X[0] is of shape (400,1).

I also didn’t get how a1, a2, which are of shape (1,) were valid inputs to next my_dense calls because of the same shape mismatch issue?

TMosh · January 25, 2024, 1:03am

I do not think that is true.
a1 must be the size of the 1st hidden layer, which has 25 units.
a2 must be the size of the 2nd hidden layer, which has 15 units.
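One way to check those sizes is to print the shapes after each layer. The sketch below is not the assignment's my_dense; dense_sketch is a hypothetical per-neuron implementation written for illustration, with random placeholder weights, a 1-D input of 400 features, and an assumed final layer of 1 unit.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_sketch(a_in, W, b):
    """Hypothetical dense layer: a_in is (n,), W is (n, units), b is (units,)."""
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        w = W[:, j]                 # weights of neuron j, shape (n,)
        z = np.dot(w, a_in) + b[j]  # 1-D dot 1-D gives a scalar
        a_out[j] = sigmoid(z)
    return a_out

# Placeholder parameters; the 25 and 15 unit counts come from the thread,
# the 1-unit output layer is an assumption.
rng = np.random.default_rng(0)
x = rng.normal(size=400)
W1, b1 = 0.1 * rng.normal(size=(400, 25)), np.zeros(25)
W2, b2 = 0.1 * rng.normal(size=(25, 15)), np.zeros(15)
W3, b3 = 0.1 * rng.normal(size=(15, 1)), np.zeros(1)

a1 = dense_sketch(x, W1, b1)
a2 = dense_sketch(a1, W2, b2)
a3 = dense_sketch(a2, W3, b3)
print(a1.shape, a2.shape, a3.shape)
```

Under these assumptions only the last layer's output has shape (1,); each hidden layer's output has one entry per unit.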

rmwkwok · January 25, 2024, 2:38am

Hello @Maverick03,

I believe the more fundamental thing is that a_in is assumed to have the shape of (400, ), NOT (400, 1).

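Applying np.dot to a (400, ) array and a (400, 25) array, in that order, works without problem according to the 5th rule in the np.dot documentation: when the second argument has two or more dimensions, the result is a sum product over the last axis of the first argument and the second-to-last axis of the second.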

Cheers,
Raymond
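A quick way to see the difference between a (400, ) and a (400, 1) input is to try both against a (400, 25) matrix. This is a small sketch with made-up placeholder values, not the assignment's data.

```python
import numpy as np

a_1d = np.arange(400, dtype=float)  # shape (400,), a 1-D array
W = np.ones((400, 25))              # (number of inputs, number of neurons)

# 1-D times 2-D: sums over the 400 axis, leaving one value per neuron.
z = np.dot(a_1d, W)
print(z.shape)

# The same values as a 2-D column vector, shape (400, 1).
a_col = a_1d.reshape(-1, 1)

# A (400, 1) array cannot be multiplied with (400, 25) in this order.
try:
    np.dot(a_col, W)
    column_failed = False
except ValueError:
    column_failed = True
print(column_failed)

# Transposing the column to (1, 400) works and gives a (1, 25) row.
z_row = np.dot(a_col.T, W)
print(z_row.shape)
```

The 1-D case returns a 1-D result of shape (25,), while the transposed column returns a 2-D result of shape (1, 25), which is why the assumed shape of a_in matters.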
