Ground Truth Generation?

I’m working on an AI model for skin lesion segmentation and classification. My dataset (HAM10000) doesn’t include ground truth images to train my U-Net architecture on. My question is: how can I generate these ground truth images?

Sorry if my question isn’t the best; I’m very new to this.


The most common method is to find some volunteers (or experts, if you have a lot of money available) who will view the images and tag them for you.

A lot of this kind of work is crowd-sourced.

As Tom says, you generally need humans to do that, and semantic segmentation is the most demanding case: you have to label every pixel in the image, not just recognize whether there is a cat or a kangaroo somewhere in it. And in this case, I would assume you’re going to need trained dermatologists to do the segmentation, since the key question is which areas look cancerous, pre-cancerous, or like other known types of lesions. A random “mechanical turk” labeller will have no clue what the difference is between a seborrheic keratosis and a squamous keratoacanthoma, and the difference matters, right?
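To make the distinction concrete, here’s a minimal sketch (Python/NumPy, with a made-up mask rather than real annotator output) of the difference between an image-level classification label and the per-pixel ground truth a U-Net trains on:

```python
import numpy as np

# Image-level label (classification): one class id for the whole image.
# The class mapping here is hypothetical, just for illustration.
image_label = 2  # e.g. "melanoma" in some assumed class-to-id mapping

# Pixel-level ground truth (segmentation): a mask with the same height
# and width as the image, where 1 marks lesion pixels and 0 background.
# A labelling tool would export something like this for each image.
h, w = 4, 6
mask = np.zeros((h, w), dtype=np.uint8)
mask[1:3, 2:5] = 1  # the region an annotator outlined as the lesion

print(mask)
print("lesion pixels:", int(mask.sum()))
```

So a classification dataset gives you one number per image, while a segmentation dataset needs an entire mask per image, which is why the annotation effort (and expertise required) is so much higher.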

Have you done any searching for datasets that include segmentation labels, as opposed to the HAM10000 dataset?
