Hello everyone,
I do not have a problem with implementing the code, but I am getting very confused about the array dimensions.
In Exercise 3, we get to know that a_prev is an (m, n_H_prev, n_W_prev, n_C_prev) array where,
m = No. of Training Examples
n_H_prev = Height of the Training Examples
n_W_prev = Width of the Training Examples
n_C_prev = No. of Channels (RGB = 3)
We also get to know that W is an (f, f, n_C_prev, n_C) array where,
f,f = Dimension of the filter (W)
n_C_prev = No. of Channels for each filter (RGB = 3)
n_C = Total no. of filters.
In the exercise, the dimensions of a_prev and W are (2,5,7,4) and (3,3,4,8). We can see that both are 4D, but the way we visualize them is different.
a_prev: Training Examples=2, Height=5, Width=7, Depth=4.
If this is how we are interpreting a 4D matrix, then shouldn’t the W matrix be visualized in this manner?
Instead, according to the exercise, W should be interpreted as
It seems that we are interpreting a 4D matrix in two different ways which does not make much sense to me.
Someone please clarify this for me!
Thanks for the excellent pictures! We are visual thinkers, so that always helps. Of course our vision is geared to 3 dimensions, so it’s challenging to visualize things in 4 or more dimensions, which is the nub of the issue here.
Let’s talk terminology first. A matrix is by definition a 2 dimensional array. An array may have an arbitrary number of dimensions. So we have this containment relationship:
vectors ⊂ matrices ⊂ arrays
By analogy, every square is a rectangle, but not every rectangle is a square, right? Every vector is a matrix, but not every matrix is a vector.
We will soon start using TensorFlow and there the terminology changes to call arrays “tensors”. So we are dealing with 4D arrays or 4D tensors here.
Now to your real question: The key point is that how you visualize them does not have to be the same just because they have the same number of dimensions. It depends on the meaning of the various dimensions. In the case of the input sample arrays, the first dimension is the “samples” dimension, so it makes perfect sense to simplify by eliding the first dimension and thinking of it as an array of m 3D tensors of shape n_H_prev x n_W_prev x n_C_prev (here 2 tensors of shape 5 x 7 x 4). But in the case of the W 4D arrays, we use them differently: it is the last dimension of W that counts the filters, and the first three dimensions (f, f, n_C_prev) describe one filter. The channel dimension n_C_prev of each filter matches the last dimension of A, which is what lets a filter be applied to an f x f patch of a sample across all of its channels. So it makes the most sense to think of W as n_C = 8 of the 3D arrays with the shape f x f x n_C_prev (here 3 x 3 x 4).
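If it helps to see this in NumPy, here is a small sketch using the shapes from the exercise. The random values and the names one_sample, one_filter and patch are just my own for illustration, and I am leaving out the bias term and the loops over positions, samples and filters, so this is not the exercise solution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes taken from the exercise; the values are random placeholders.
a_prev = rng.standard_normal((2, 5, 7, 4))  # (m, n_H_prev, n_W_prev, n_C_prev)
W = rng.standard_normal((3, 3, 4, 8))       # (f, f, n_C_prev, n_C)

# For a_prev, the FIRST axis picks one sample: a 3D volume.
one_sample = a_prev[0]                      # shape (5, 7, 4)

# For W, the LAST axis picks one filter: also a 3D volume.
one_filter = W[:, :, :, 0]                  # shape (3, 3, 4)

# An f x f patch of the sample has the same shape as one filter,
# so the two can be multiplied elementwise and summed to a single number.
patch = one_sample[0:3, 0:3, :]             # shape (3, 3, 4)
z = np.sum(patch * one_filter)              # scalar (bias omitted)

print(one_sample.shape, one_filter.shape, patch.shape, np.ndim(z))
```

So both arrays are 4D, but the axis that you "peel off" to get back to 3D volumes is the first one for a_prev and the last one for W.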
I see! That has helped me clear my doubt! Thank you so much!