My U-Net model gives wrong mask predictions

Hello,
After completing the Convolutional Neural Networks course from the Deep Learning Specialization, I tried to build my own U-Net model to segment the area affected by breast cancer.
This is my first ever model.

As you can see in the photo, the mask prediction does not resemble the true mask.
I don’t know what’s wrong. I used the code from the course programming assignment and changed the dimensions a bit to match the size of the images in my dataset.
P.S.: If you need to see parts of my code, I will be happy to share.


Successfully training a model on that complex a task is going to take a lot of labeled data and a lot of training iterations. Where did you get your training dataset? What does a graph of the training accuracy and test accuracy look like as you run the training?

Hi, this is the dataset that I downloaded:

I trained the model with these parameters:
EPOCHS = 10
VAL_SUBSPLITS = 5
BUFFER_SIZE = 500
BATCH_SIZE = 32
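For context, the compile/fit step follows the course assignment; a minimal sketch of that setup (variable names like unet and train_dataset are assumptions, not my exact code):

    unet.compile(optimizer='adam',
                 loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                 metrics=['accuracy'])  # per-pixel accuracy over the mask

    model_history = unet.fit(train_dataset, epochs=EPOCHS)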
This is the accuracy that I get:
(TensorSpec(shape=(256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(256, 256, 1), dtype=tf.float32, name=None))
Epoch 1/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 450s 17s/step - accuracy: 0.7212 - loss: 1.2305
Epoch 2/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 456s 18s/step - accuracy: 0.9095 - loss: 0.3257
Epoch 3/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 469s 19s/step - accuracy: 0.9090 - loss: 0.3012
Epoch 4/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 450s 18s/step - accuracy: 0.8641 - loss: 0.7718
Epoch 5/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 454s 18s/step - accuracy: 0.9054 - loss: 0.3155
Epoch 6/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 457s 18s/step - accuracy: 0.9125 - loss: 0.2935
Epoch 7/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 945s 39s/step - accuracy: 0.9117 - loss: 0.3203
Epoch 8/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 906s 36s/step - accuracy: 0.9059 - loss: 0.3032
Epoch 9/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 436s 17s/step - accuracy: 0.9021 - loss: 0.3081
Epoch 10/10
25/25 ━━━━━━━━━━━━━━━━━━━━ 434s 17s/step - accuracy: 0.9066 - loss: 0.3069
And I get this warning:
Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches. You may need to use the .repeat() function when building your dataset.
self.gen.throw(value)
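From what I understand, the warning is pointing at the input pipeline; a sketch of the fix it suggests (TRAIN_LENGTH and the exact pipeline are assumptions, not my actual code):

    train_dataset = (train_dataset
                     .cache()
                     .shuffle(BUFFER_SIZE)
                     .batch(BATCH_SIZE)
                     .repeat()                        # yield batches indefinitely
                     .prefetch(tf.data.AUTOTUNE))

    unet.fit(train_dataset,
             epochs=EPOCHS,
             steps_per_epoch=TRAIN_LENGTH // BATCH_SIZE)  # TRAIN_LENGTH: number of training images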


Thanks for the link to the dataset. I have not actually looked at the data, but just read the webpage. It says there are 780 total images. They say that the ground truth labels are normal, benign, and malignant. I assume that means at the per-pixel level. It would be important to understand that.

Just on general principles, it doesn’t seem likely that you could get to 90% training accuracy in 3 epochs. But maybe this is a question for U-Net in general: if you have 500 x 500 pixel images and you are labelling every single pixel, what does 90% accuracy mean? I assume that 90% of the pixels are correctly labelled, but in your predicted mask the only color that is duplicated from the true mask is the few purple pixels. It’s not clear how that could add up to 90%.

Are you sure you are interpreting the output correctly? E.g. the model in the assignment does not include the softmax, so the predictions are raw logits. You have to convert those to softmax outputs before you do the rendering. Are you sure that in your code you followed their use of create_mask to convert the predictions?
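For reference, a minimal sketch from memory of what create_mask does in the notebook (take the argmax over the class dimension to turn per-pixel logits into a class-index image):

    def create_mask(pred_mask):
        # pred_mask: (batch, height, width, n_classes) raw logits
        pred_mask = tf.argmax(pred_mask, axis=-1)   # most likely class per pixel
        pred_mask = pred_mask[..., tf.newaxis]      # restore a channel dimension
        return pred_mask[0]                         # first image in the batch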

This is the way I built the model:


import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.layers import Conv2D, Conv2DTranspose, Dropout, MaxPooling2D, concatenate

def conv_block(inputs=None, n_filters=32, dropout_prob=0, max_pooling=True):
    # Two 3x3 same-padded convolutions, as in the contracting path of U-Net
    conv = Conv2D(n_filters, 3,
                  activation='relu',
                  padding='same',
                  kernel_initializer='he_normal')(inputs)
    conv = Conv2D(n_filters, 3,
                  activation='relu',
                  padding='same',
                  kernel_initializer='he_normal')(conv)

    if dropout_prob > 0:
        conv = Dropout(dropout_prob)(conv)

    # Halve the spatial resolution, except at the bottleneck
    if max_pooling:
        next_layer = MaxPooling2D(pool_size=(2, 2))(conv)
    else:
        next_layer = conv

    skip_connection = conv
    return next_layer, skip_connection


def upsampling_block(expansive_input, contractive_input, n_filters=32):
    # Transposed convolution doubles the spatial resolution
    up = Conv2DTranspose(n_filters, (3, 3),
                         strides=(2, 2),
                         padding='same')(expansive_input)

    # Concatenate with the skip connection from the contracting path
    merge = concatenate([up, contractive_input], axis=3)
    conv = Conv2D(n_filters, 3,
                  activation='relu',
                  padding='same',
                  kernel_initializer='he_normal')(merge)
    conv = Conv2D(n_filters, 3,
                  activation='relu',
                  padding='same',
                  kernel_initializer='he_normal')(conv)
    return conv


def unet_model(input_size=(256, 256, 3), n_filters=32, n_classes=23):
    inputs = Input(input_size)

    # Contracting path: resolution halves while the filter count doubles
    cblock1 = conv_block(inputs, n_filters)
    cblock2 = conv_block(cblock1[0], n_filters * 2)
    cblock3 = conv_block(cblock2[0], n_filters * 4)
    cblock4 = conv_block(cblock3[0], n_filters * 8, dropout_prob=0.3)
    cblock5 = conv_block(cblock4[0], n_filters * 16, dropout_prob=0.3, max_pooling=False)

    # Expanding path: upsample and merge the matching skip connections
    ublock6 = upsampling_block(cblock5[0], cblock4[1], n_filters * 8)
    ublock7 = upsampling_block(ublock6, cblock3[1], n_filters * 4)
    ublock8 = upsampling_block(ublock7, cblock2[1], n_filters * 2)
    ublock9 = upsampling_block(ublock8, cblock1[1], n_filters)

    conv9 = Conv2D(n_filters, 3,
                   activation='relu',
                   padding='same',
                   kernel_initializer='he_normal')(ublock9)

    # Final 1x1 convolution: raw per-class logits for every pixel (no softmax)
    conv10 = Conv2D(n_classes, 1, padding='same')(conv9)

    model = tf.keras.Model(inputs=inputs, outputs=conv10)
    return model
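And I create the model with the default arguments, roughly like this:

    unet = unet_model(input_size=(256, 256, 3), n_filters=32, n_classes=23)
    unet.summary()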

Is n_classes = 23 the right value for your input dataset? Please look at the sample data to make sure you understand what the labels look like.

I’m not saying that’s the issue here, but the higher level point is you can’t just “plug and play” here. You have to really understand what is different about your new case.

You also did not comment on my question about create_mask. The output of the model is raw logit values for every pixel, right? So how do you deal with that? They did give you a worked example of that in the notebook.


Just as an example of what I mean by saying that it’s important to start by understanding your data, I took a quick look on the Kaggle website by clicking into the benign subdirectory. Right away, you notice that for the image benign (100), there are two “ground truth” mask files. So you need to take the union of those somehow.
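If that is right, merging them could be as simple as a pixelwise maximum; a sketch, with the file names guessed from the Kaggle listing:

    import numpy as np
    from PIL import Image

    # Hypothetical file names for the two masks of one image
    mask_a = np.array(Image.open("benign (100)_mask.png").convert("L"))
    mask_b = np.array(Image.open("benign (100)_mask_1.png").convert("L"))
    merged = np.maximum(mask_a, mask_b)  # union of the two binary masks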

My first step would be to dump the contents of some of the mask files of all three types to make sure we understand what the label values are.
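Something along these lines would show the raw label values (the path is a placeholder):

    import numpy as np
    from PIL import Image

    mask = np.array(Image.open("path/to/benign (1)_mask.png"))
    print(mask.shape, mask.dtype)
    print(np.unique(mask))  # which label values actually occur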

Hi, sorry for my late response. The model worked well after I changed the number of classes to 3 and added a softmax function to the create_mask function. I really appreciate your help. Thank you.
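In case it helps someone else, the two changes amounted to roughly this (a sketch, not my exact code):

    # 3 classes to match the dataset labels: normal, benign, malignant
    unet = unet_model(input_size=(256, 256, 3), n_filters=32, n_classes=3)

    def create_mask(pred_mask):
        pred_mask = tf.nn.softmax(pred_mask, axis=-1)  # logits -> per-class probabilities
        pred_mask = tf.argmax(pred_mask, axis=-1)      # most likely class per pixel
        return pred_mask[0][..., tf.newaxis]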


That’s great news that you got things to work with just those two changes. Thanks for confirming.

It would be interesting to see some samples of what your predicted masks look like after the fixes.

Hello, here are some samples of the predicted masks.


The predicted masks look great. Congrats!

Did you have to take any special action to account for the fact that they seem to have multiple mask files for some of the input images? Or do you think I am misinterpreting what I am seeing on the website, e.g., for “benign (100)”?

It does look like your training is working well based on the samples you showed, so maybe it doesn’t matter.