I have finished week 1, and a question suddenly came up after learning that a conv layer is followed by ReLU, which led me back to his C4W1L03 video that talks about taking the absolute value.

My opinion is that taking the absolute value is crucial: since a convolution layer applies ReLU as its last step, an edge with a strongly negative response will be turned to zero, and we don’t want to ignore such a strong edge, so… we have to take the absolute value.

But Andrew Ng said we CAN take the absolute value. It’s a bit ambiguous: why doesn’t he say we MUST take the absolute value after the summation in a CNN, as if it were compulsory?

Am I right? Is taking the absolute value a crucial part of convolution in a CNN, to avoid this side effect of ReLU?

When we say we must, we have no alternatives. However, we do have an alternative in your example: flipping the filter itself. In the bottom example, if we replace the filter with another one that has -1 in the 1st column, 0 in the 2nd, and 1 in the 3rd, the result won’t be zeroed by ReLU.
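A tiny NumPy sketch of this idea (the 4×4 toy image and the 3×3 vertical-edge filter here are made up to mirror the lecture example, not taken from the screenshot):

```python
import numpy as np

# Toy 4x4 image: dark on the left, bright on the right (a "right" vertical edge)
image = np.array([
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
])

# Vertical-edge filter as in the lecture: 1s in column 1, -1s in column 3
vert = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
])

def conv2d_valid(img, filt):
    """Plain 'valid' cross-correlation, as a CNN conv layer computes it."""
    fh, fw = filt.shape
    h = img.shape[0] - fh + 1
    w = img.shape[1] - fw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + fh, j:j + fw] * filt)
    return out

def relu(x):
    return np.maximum(x, 0)

print(relu(conv2d_valid(image, vert)))   # strong edge, but response is -30 everywhere: zeroed
print(relu(conv2d_valid(image, -vert)))  # flipped filter responds with +30: survives ReLU
```

So the edge isn’t lost — the flipped filter detects it with a positive response, no absolute value needed.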

Speaking of taking the absolute value: while we can do it, we would also make a left edge indistinguishable from a right one. Whether that is a real loss remains to be seen!

Lastly, if we allocate enough filters in our CNN, we can let the filters evolve so that each represents a different feature. The filter in your screenshot can do well at detecting a left edge, while its “flipped” version can do well at detecting a right edge. This isn’t bad.

In addition to @rmwkwok 's answer, I’d like to add that in practice we don’t define the filters as shown in the lectures; instead they are initialized from a random distribution, and the CNN learns the best filter weights as it goes through training. So the convenient matrices shown in the lectures should be taken as a way to build intuition more than anything else.
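As a rough sketch of what that random initialization looks like (the shapes and the He-style scaling here are illustrative assumptions, not the exact scheme used in the course):

```python
import numpy as np

rng = np.random.default_rng(0)

# 8 filters of shape 3x3 on a 1-channel input, drawn from a scaled normal
# distribution (He-style initialization, common for ReLU layers)
fan_in = 3 * 3 * 1
filters = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(8, 3, 3, 1))

# Before training there is no hand-designed edge-detector structure at all
print(filters[0, :, :, 0])
```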

For further details, you may want to read this post on the topic.