Hi team,
I would like more clarity on how we form the inputs and the filters for the convolution algorithm. I will take an example from the Lab code:
import numpy as np
A_prev = np.random.randn(2, 5, 7, 4)
W = np.random.randn(3, 3, 4, 8)
It seems like the A_prev and W array structures look like this:
A_prev is an array holding two images of size (5, 7, 4).
W is an array of 3 arrays of size (3, 4, 8), each of which is a stack of 8 channels. To make the convolution happen, I need to compose/reshape the original W array into 8 filters of size (3, 3, 4) to match the 3-dimensional slice taken from A_prev.
How should I interpret W? Why wouldn’t we just make W of shape (8, 3, 3, 4) from the beginning to make life easier?
If we take the for-loop approach, then having either (3, 3, 4, 8) or (8, 3, 3, 4) as the shape of W will have the same effect (any speed difference remains to be tested); see the sketch below.
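For example, here is a minimal sketch (variable names are mine) showing that the two layouts give identical results inside the loop:

import numpy as np

A_prev = np.random.randn(2, 5, 7, 4)
W = np.random.randn(3, 3, 4, 8)              # filters stacked along the last axis
W_first = np.transpose(W, (3, 0, 1, 2))      # the same filters, stacked along the first axis

a_slice = A_prev[:, 0:3, 0:3, :]             # one (2, 3, 3, 4) window across both samples

for c in range(8):
    z1 = np.sum(a_slice * W[:, :, :, c], axis=(1, 2, 3))   # (3, 3, 4, 8) layout
    z2 = np.sum(a_slice * W_first[c], axis=(1, 2, 3))      # (8, 3, 3, 4) layout
    assert np.allclose(z1, z2)               # identical either way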
However, if we take a vectorized approach, then I would go for (3, 3, 4, 8). There are two ways you can do it in NumPy, given a slice of A_prev and the W. I will sketch the idea for each.
Approach 1
Given a slice of A_prev with shape (2, 3, 3, 4) and the W with shape (3, 3, 4, 8):
apply np.expand_dims on both to make them (2, 3, 3, 4, 1) and (1, 3, 3, 4, 8) respectively.
do an element-wise multiplication and then sum over axes 1, 2, and 3 to get the outcome of this filtering over the 2 samples and the new 8 channels. The outcome will have a shape of (2, 8); a sketch follows below.
This approach takes 8.62 µs ± 228 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
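Here is a minimal sketch of Approach 1, reusing a_slice and W from the sketch above:

a = np.expand_dims(a_slice, axis=-1)         # (2, 3, 3, 4, 1)
w = np.expand_dims(W, axis=0)                # (1, 3, 3, 4, 8)
Z1 = np.sum(a * w, axis=(1, 2, 3))           # broadcasts to (2, 3, 3, 4, 8), sums to (2, 8)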
Approach 2
Given a slice of A_prev with shape (2, 3, 3, 4) and the W with shape (3, 3, 4, 8):
apply np.einsum on them with the right subscripts (sketched below).
This approach takes 1.87 µs ± 32.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
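A sketch of Approach 2, again reusing a_slice and W (the subscripts are my reading of what "the right subscripts" means here):

# m: sample, h/w: window height/width, c: input channel, n: output channel
Z2 = np.einsum('mhwc,hwcn->mn', a_slice, W)  # shape (2, 8)
assert np.allclose(Z1, Z2)                   # matches Approach 1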
Summary
I think (3, 3, 4, 8) is sufficient for the for-loop approach and is better for the vectorized approach. It is also the filter arrangement used in TensorFlow.
Cheers,
Raymond
PS: We can also use np.tensordot, but it is neither the fastest nor the slowest.
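A tensordot version would look something like this (again reusing a_slice and W from the sketches above):

# contract the window and input-channel axes of a_slice against the first three axes of W
Z3 = np.tensordot(a_slice, W, axes=([1, 2, 3], [0, 1, 2]))  # shape (2, 8)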
Hi Raymond,
Thanks for the quick turnaround! I have completed the week one task for convolution, and I used the for-loop approach. I literally decomposed the W with shape (3, 3, 4, 8) and looped over it:
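Something along these lines (a reconstruction of the idea, not my exact assignment code):

import numpy as np

a_slice = np.random.randn(2, 3, 3, 4)        # one window, as in your examples
W = np.random.randn(3, 3, 4, 8)

Z = np.zeros((2, 8))
for c in range(8):                            # one pass per output channel
    filt = W[:, :, :, c]                      # pull out one (3, 3, 4) filter
    Z[:, c] = np.sum(a_slice * filt, axis=(1, 2, 3))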
I will put a vectorized version of the code into my notes. Very cool stuff.
Are there any “community.deeplearning.ai” NN projects which I can volunteer for?
I don’t know of anything like that specific to community.deeplearning.ai, but I have seen a few people post requests for research collaborators in the “General Discussion” forum. You can search the forums and see if you find any recent ones.
There are other places worth a look as well. I am on the mailing list for Omdena and they seem to be actively recruiting people to participate in various projects, but I have not actually looked into the details on any of them yet. You might have a look at their website and see what you think. Kaggle is also a source for challenge projects, but I’ve never really done one of their contests. I don’t know whether they have a forum where you can try to put together a project team.
Let us know if you find anything that looks interesting!