The task is to segment vertebrae, the spinal canal, and disks from spine MRI scans. The dataset poses two main challenges: it is relatively small, and it was collected from different hospitals, so the scans vary in size, aspect ratio, and resolution. To address this, I used adaptive max pooling to bring every scan to the same size. Normalization was applied only to the images, not to the masks. Each mask has three channels, one per structure, with a value of 1 where that structure should be segmented and 0 elsewhere.
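For reference, the preprocessing described above could look roughly like the sketch below. It assumes PyTorch; the function name `preprocess`, the target size, and the z-score normalization are illustrative assumptions, not necessarily what was actually used.

```python
import torch
import torch.nn.functional as F


def preprocess(image, mask, size=(16, 256, 256)):
    """Resize one scan and its mask to a fixed size (hypothetical helper).

    image: float tensor of shape (1, D, H, W)
    mask:  tensor of shape (3, D, H, W) containing only 0s and 1s
    """
    # Adaptive max pooling maps any input size to the requested output size.
    image = F.adaptive_max_pool3d(image, size)
    # Max pooling a 0/1 mask keeps it 0/1 (a voxel is 1 if any pooled voxel is 1).
    mask = F.adaptive_max_pool3d(mask.float(), size)
    # Normalize the image only; z-score per volume is an assumption.
    image = (image - image.mean()) / (image.std() + 1e-8)
    return image, mask


# Example with a scan whose size differs from the target.
image = torch.rand(1, 20, 300, 280)
mask = (torch.rand(3, 20, 300, 280) > 0.5).float()
image_out, mask_out = preprocess(image, mask)
print(image_out.shape, mask_out.shape)
```

Because max pooling marks an output voxel as 1 whenever any voxel in its window is 1, thin structures tend to get thicker in the resized masks, which may be worth keeping in mind when comparing predictions to the ground truth.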
I implemented a simple U-Net model that outputs the three channels. The model's input and output shapes were [4, 1, 16, 256, 256] and [4, 3, 16, 256, 256], respectively. The loss function was CrossEntropyLoss(). During training the loss kept decreasing, so I assumed everything was working. However, during testing, I observed that all of the predictions contained noise, as shown in the attached picture. I am not sure what the exact problem is or why the predicted mask is not binary like the ground truth.
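One thing that may be relevant here: a network trained with CrossEntropyLoss() outputs raw, unbounded scores (logits) per channel, so the output is not binary until it is post-processed. The sketch below shows two common ways to binarize an output of this shape. The random tensor stands in for the U-Net output, the spatial size is reduced to keep the example light, and which option fits depends on assumptions about the masks that are stated in the comments.

```python
import torch
import torch.nn.functional as F

# Stand-in for the model output: random logits of shape (batch, channels, D, H, W).
logits = torch.randn(4, 3, 16, 64, 64)

# Option A: treat each channel as an independent yes/no decision.
# Assumption: a voxel may belong to none of the three structures (background),
# so all three channels can be 0 at the same voxel.
binary_per_channel = (torch.sigmoid(logits) > 0.5).float()

# Option B: pick exactly one channel per voxel.
# Assumption: every voxel belongs to exactly one of the channels.
labels = logits.argmax(dim=1)                  # (4, 16, 64, 64)
one_hot = F.one_hot(labels, num_classes=3)     # (4, 16, 64, 64, 3)
binary_one_class = one_hot.permute(0, 4, 1, 2, 3).float()  # (4, 3, 16, 64, 64)

print(binary_per_channel.shape, binary_one_class.shape)
```

Option A is usually paired with a per-channel loss such as BCEWithLogitsLoss, while Option B matches the softmax-over-channels behaviour of CrossEntropyLoss; with three foreground channels and no background channel, that difference may matter for voxels where the ground truth is 0 in every channel.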
Any help would be greatly appreciated.
Thanks in advance.