GradCAM vs Image Segmentation


I have a general question regarding GradCAM vs Image Segmentation. It seems like GradCAM is able to locate the disease area using the CNN model that was trained on the multiclass disease labels, so it doesn’t require training a new model. Image Segmentation, on the other hand, does need pixel-level (or voxel-level) labels across all disease classes. Wouldn’t GradCAM then be enough to do the job of highlighting the disease? Why would people need Image Segmentation here? I understand Image Segmentation is useful for object detection in self-driving cars, because it requires precise boundaries of the detected objects for safe driving. But for medical images, it seems like GradCAM can already do the job.

Another question, about Image Segmentation specifically. Based on my understanding, to train/evaluate the model, the train / dev / test datasets need to have a label for each pixel (or voxel). Is it even humanly possible to manually label all pixels (or voxels) for thousands upon thousands of image samples? Especially for high-resolution X-rays/CT scans, where the images are already very large.

In addition, image processing often includes a preprocessing step to normalize/scale the picture, including its size. During training, when an input image has pixel-level (or voxel-level) labels, how would you handle these labels after the picture is rescaled before being fed into the model? For example, if you originally have 1024 * 1024 labels, one for each pixel, what would the labels be for an image rescaled to 300 * 300 during training?



Hi @MrHuanwang,

I believe the idea was to teach you both, so that you know both of these concepts and how and when to apply them, be it in the field of medicine or elsewhere.

Your second query: I think you have to do it manually or come up with a program which does it precisely. For this reason there are now organisations which solely perform data labelling. As you can imagine, the field of medicine is unlike others: here the predictions need to be highly accurate, with a low chance of error, which is why I believe the labelling has to be done with the help of a consultant.

Your last query: I believe this is covered in our Deep Learning Specialisation. If you haven’t had a chance to take it, please do so.
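
As a rough illustration of the rescaling question (a sketch only, not necessarily how the course does it): one common approach is to resize the label mask with the same geometric transform as the image, but with nearest-neighbour sampling, so every output pixel keeps a valid integer class ID instead of a blended value. The function name `resize_labels_nearest` and the toy class ID `2` below are hypothetical, chosen just for this example.

```python
def resize_labels_nearest(mask, new_h, new_w):
    """Resize a 2D integer label mask using nearest-neighbour sampling.

    Each output pixel copies the label of the source pixel that its
    centre maps to, so no new (interpolated) label values are created.
    """
    old_h, old_w = len(mask), len(mask[0])
    out = []
    for i in range(new_h):
        # Map the centre of output row i back to a source row index.
        src_i = min(old_h - 1, int((i + 0.5) * old_h / new_h))
        row = []
        for j in range(new_w):
            # Same mapping for the column index.
            src_j = min(old_w - 1, int((j + 0.5) * old_w / new_w))
            row.append(mask[src_i][src_j])
        out.append(row)
    return out


# Toy 4x4 mask: 0 = background, 2 = a hypothetical disease class.
mask = [
    [0, 0, 2, 2],
    [0, 0, 2, 2],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
small = resize_labels_nearest(mask, 2, 2)
print(small)
```

The same idea would apply to a 1024 * 1024 mask resized to 300 * 300: the image itself can be resized with a smooth interpolation, while the mask uses nearest-neighbour so that labels such as 0 and 2 never turn into a meaningless in-between value like 1.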