Network's segmentation output is similar to input

I am using a transformer+UNet hybrid to segment tumours from lung CT scans. As you can see, the network seems to "memorize" the input and reproduce it as the output. I train the network on around 1,600 images.

I have tried changing things like the number of layers, filters, activation functions, and even the optimizer. Now the output seems much more random than what is pictured below.

Why does this happen? How do I prevent it?

Here are some links to the code:

transUNet functional model

patch generator and patch encoder

Cascaded UpSampler

transformer encoder

Network output

Sample input

Sample output

Are you getting anything in the output that resembles the input? Do you have a metric to measure the performance of your network? If it is doing well on the training set, you should at least get something reasonable for the training images.

I am using IoU and Dice score as metrics, and Dice loss was used to train the model. The weird thing is that it performs well on both the training and test sets, yet the model still outputs something like the image attached in the post.
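For context, these are the usual soft-Dice and thresholded-IoU formulations. A minimal sketch in Keras/TensorFlow is below; the smoothing constant and the 0.5 threshold are just typical choices, not necessarily what is in the linked code.

```python
import tensorflow as tf

SMOOTH = 1e-6  # assumed smoothing constant to avoid division by zero

def dice_coef(y_true, y_pred):
    # Flatten both masks and compute the soft Dice coefficient over the batch.
    y_true_f = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred_f = tf.reshape(tf.cast(y_pred, tf.float32), [-1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    return (2.0 * intersection + SMOOTH) / (
        tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) + SMOOTH)

def dice_loss(y_true, y_pred):
    # Loss is 1 - Dice, so perfect overlap gives zero loss.
    return 1.0 - dice_coef(y_true, y_pred)

def iou(y_true, y_pred, threshold=0.5):
    # Binarise the prediction before computing intersection-over-union.
    y_true_f = tf.reshape(tf.cast(y_true, tf.float32), [-1])
    y_pred_f = tf.reshape(tf.cast(y_pred > threshold, tf.float32), [-1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    union = tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) - intersection
    return (intersection + SMOOTH) / (union + SMOOTH)
```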

Is there any chance that, if that image were enlarged, it could look like the input?

I apologize, but I do not understand what you are trying to say.

I was asking whether that image would look like the input if you zoomed in, but those are probably just pixels. If the metrics look fine while you are training, try running inference on the training images.
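Something along these lines is what I mean; `model`, `train_images` and `train_masks` are just placeholders for whatever you already have loaded, and the 0.5 threshold is an assumption:

```python
import numpy as np

def check_train_inference(model, train_images, train_masks, n=8, threshold=0.5):
    # Run the trained model on a few training images and print per-image Dice,
    # so the numbers can be compared against the metric reported during training.
    preds = model.predict(train_images[:n])            # raw sigmoid outputs
    preds = (preds > threshold).astype(np.float32)     # binarise before comparing
    for i, (pred, mask) in enumerate(zip(preds, train_masks[:n])):
        intersection = np.sum(pred * mask)
        dice = (2.0 * intersection + 1e-6) / (pred.sum() + mask.sum() + 1e-6)
        print(f"training image {i}: dice = {dice:.3f}")
```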

To be frank, I am just giving some general ideas. Maybe the loss is not chosen right for this application… that's all I can think of without going too much in depth.


As you suggested, I performed inference on training images as well. The result is still some random output.

This was the output of the most recent version of the network (this was after changing the loss function to binary cross-entropy; previously it was Dice loss, and the output was very similar to the image attached below).
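The loss change itself amounts to something like the sketch below; `dice_coef` and `iou` stand in for the metric functions in the links, and the Adam learning rate and the sigmoid-output assumption are placeholders rather than the exact settings I use.

```python
import tensorflow as tf

def compile_with_bce(model, dice_coef, iou):
    # Sketch only: switch an existing segmentation model from Dice loss to
    # binary cross-entropy while still tracking Dice and IoU as metrics.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),      # assumed learning rate
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),  # assumes a sigmoid output layer
        metrics=[dice_coef, iou],
    )
    return model
```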

Edit: Thanks a lot for replying so quickly. It means a lot to me :')