C3W4: Trying to understand GradCAM

Hi,

I am trying to understand the mechanism of GradCAM. From the video and the Colab practice, I see that the “intermediate class activation map” is almost empty for the last layer and clearer for earlier layers. My understanding of conv layers is that the earlier layers learn low-level features and the later layers learn higher-level features. Why are the “class activation maps” for the earlier layers clearer? Thanks for any advice.

Best,
WJ

Hi @WenjiangHuang,
You’re right that the earlier layers only capture low-level features. At that stage the network has not yet narrowed down what matters, so many parts of the image are still candidates for deciding what the image is a picture of. That’s why there are lots of hot spots in the early class activation maps. By the time we get to the last layer, the network has learned a lot, so only a few small hot spots remain that it needs to consider to determine the class of the image.
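
In case it helps to see the mechanism itself, here is a minimal NumPy sketch of how a GradCAM heatmap is usually computed for one layer. This is not the course's Colab code. The function name `grad_cam` and the random inputs and shapes are my own assumptions for illustration. In practice `activations` would be the layer's feature maps for one image and `gradients` the gradients of the class score with respect to those feature maps.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Hypothetical helper: build a heatmap from one layer's outputs.

    activations: feature maps of the layer, shape (H, W, K)
    gradients:   d(class score) / d(activations), same shape
    """
    # One weight per channel: the gradient averaged over all spatial positions.
    weights = gradients.mean(axis=(0, 1))                      # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(activations, weights, axes=([2], [0]))  # shape (H, W)
    # ReLU: keep only locations that push the class score up.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display, guarding against an all-zero map.
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam

# Made-up shapes standing in for an early and a late conv layer.
rng = np.random.default_rng(0)
early = grad_cam(rng.random((56, 56, 64)), rng.normal(size=(56, 56, 64)))
late = grad_cam(rng.random((7, 7, 512)), rng.normal(size=(7, 7, 512)))
print(early.shape, late.shape)
```

The same function is applied to every layer; only the feature maps and gradients change. In a typical CNN the later layers also have a much coarser spatial grid, so their maps have far fewer cells to light up once they are resized to the image.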

Thanks. It makes sense to me now.

Best,
WJ