In the “things to remember” at the end of programming assignment 1 in week 1 of DLS course 4 (CNNs), it says that “A convolution extracts features from an input image by taking the dot product between the input data and a 3D array of weights (the filter).” However, the math used in this section, the slides, and the videos never mentioned a dot product; in fact, we only used simple multiplication.
I don’t see where the concept of a dot product was ever introduced in this section.
I agree with you.
It seems to be a convention in the CNN literature to refer to this element-wise product and sum process as a “dot product” when discussing convolution, even though it technically isn’t the same as the dot product you’re familiar with from 2D linear algebra.
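To make the "element-wise product and sum" idea concrete, here is a minimal sketch. The names `a_slice` and `W` and the 3x3x3 shapes are just illustrative assumptions, not taken from the assignment. It suggests that if you flatten both arrays into vectors, the ordinary vector dot product gives the same number:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3x3x3 window of the input and a filter of the same shape.
a_slice = rng.standard_normal((3, 3, 3))
W = rng.standard_normal((3, 3, 3))

# Element-wise multiply, then sum everything: one scalar per window position.
z_elementwise = np.sum(a_slice * W)

# The same number, computed as a plain vector dot product of the flattened arrays.
z_flat_dot = np.dot(a_slice.ravel(), W.ravel())

print(np.isclose(z_elementwise, z_flat_dot))
```

So the "dot product" wording is arguably fair if you think of the window and the filter as long vectors rather than as 3D arrays.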
This is a good point. Their wording is maybe a bit awkward. But note that the operation is more than just an elementwise multiply of the filter with the input: you then add up all those products to produce a scalar result for each position in the output. So it does sort of have the flavor of a dot product (elementwise multiply followed by addition), even if it isn’t exactly equivalent to the normal dot product. But now that I think ε harder, what happens if you apply np.dot in the case where the inputs have more than two dimensions? Maybe we just need to understand what the definition of that would be in higher dimensions …
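As a quick experiment on that question, here is a small sketch with made-up 3x3x3 arrays `a` and `b` (both are assumptions for illustration). As far as the NumPy documentation describes it, `np.dot` on N-D inputs sums over the last axis of the first argument and the second-to-last axis of the second, so it does not collapse everything to one number. `np.tensordot` with `axes=3` is one way to contract all three axes instead:

```python
import numpy as np

a = np.arange(27, dtype=float).reshape(3, 3, 3)
b = np.ones((3, 3, 3))

# For N-D inputs, np.dot sums over the last axis of `a` and the
# second-to-last axis of `b`; every other axis is kept in the result.
d = np.dot(a, b)
print(d.shape)  # a 4-D array, not a scalar

# Contracting all three axes pairwise gives the single number that the
# multiply-and-sum step produces for one window.
full = np.tensordot(a, b, axes=3)
print(np.isclose(full, np.sum(a * b)))
```

So `np.dot` in higher dimensions is a different operation from the one used in the convolution step, even when both inputs have the same shape.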
Thank you both, that helps clarify things.
I was trying to reproduce the values we get using NumPy element-wise multiplication followed by summing all the values (one window at a time), on a matrix foo and a kernel bar, both 2D matrices. It did not work when using np.dot(foo, bar), even when they were the same size.
I could do it as follows:
>>> import scipy.signal as sps
>>> z = sps.correlate2d(foo, bar, mode='same') # to use padding and keep the original size
>>> z = sps.correlate2d(foo, bar, mode='valid') # to allow the size to shrink
That’s what Paul is referring to.
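For anyone who wants to check this themselves, here is a rough sketch that compares a hand-written window loop against `scipy.signal.correlate2d`. The contents of `foo` and `bar` and the helper `window_sum` are made up for illustration. Note that `np.dot` on two 2D arrays performs matrix multiplication, which is why it does not reproduce these values:

```python
import numpy as np
import scipy.signal as sps

foo = np.arange(16, dtype=float).reshape(4, 4)  # illustrative input matrix
bar = np.array([[1.0, 0.0], [0.0, -1.0]])       # illustrative 2x2 kernel

def window_sum(x, k):
    """Element-wise multiply and sum, one window at a time (no padding)."""
    kh, kw = k.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the current window by the kernel and add up the products.
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

manual = window_sum(foo, bar)
valid = sps.correlate2d(foo, bar, mode='valid')

print(np.allclose(manual, valid))
```

The loop version corresponds to `mode='valid'`, where the output shrinks. Matching `mode='same'` would additionally require zero-padding the input before looping.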