Image segmentation with U-Net: image shapes and origin of the mask files

We have two sets of data files: CameraRGB and CameraMask

CameraRGB files can be displayed easily, but CameraMask files look black when displayed

Questions:

  • What are the CameraMask png files exactly, and why are they black when we display the png file?
  • How are the CameraMask png files created in the first place?

For info, here is the process that works:

In a first step, the png files are read with imageio.imread()

  • It returns a numpy array for both img and mask files

In a 2nd step, when we check the shape

  • img.shape → (480, 640, 4)
  • mask.shape → (480, 640, 4)

In a 3rd step, when we display the image:

  • imshow(img) → this displays the picture
  • imshow(mask[:, :, 0]) → this displays the segmentation
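The slicing in step 3 can be sketched without the actual files. This is only an illustration under assumptions: the array below is a synthetic stand-in for a loaded mask, and the idea that channel 0 holds small integer class IDs (13 classes here, a hypothetical number) while the other colour channels are 0 is my guess, not something confirmed above.

```python
import numpy as np

# Synthetic stand-in for a mask read from disk: 480 x 640, 4 channels, uint8.
# Assumption: channel 0 = class ID, channels 1-2 = 0, channel 3 (alpha) = 255.
rng = np.random.default_rng(0)
mask = np.zeros((480, 640, 4), dtype=np.uint8)
mask[:, :, 0] = rng.integers(0, 13, size=(480, 640))  # hypothetical 13 classes
mask[:, :, 3] = 255

# Selecting channel 0 gives a 2-D array with one class ID per pixel.
class_ids = mask[:, :, 0]

# Read as a colour, (class_id, 0, 0) with class_id <= 12 is almost black
# on a 0..255 scale, which would explain the dark picture.
max_red = int(class_ids.max())

# Stretching the IDs over 0..255 makes the classes visible as grey levels.
stretched = (class_ids.astype(np.float32) / max(max_red, 1) * 255).astype(np.uint8)
```

Passing a 2-D array such as class_ids to matplotlib's imshow rescales the values to the colormap range by default, which is probably why mask[:, :, 0] shows the segmentation while the full 4-channel mask does not.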

I don’t get how the CameraMask png files are produced in the first place, or why we display them with channel index nC = 0

Thanks for your help

The mask consists of a single value, the predicted class type, for each pixel location. It is extracted from the forward propagation output by selecting only the class with the highest predicted probability for each pixel (using argmax()).

You’re basically creating an array with the same height and width as the input image that contains just the encoding for the predicted class.
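As a rough sketch of that argmax step: the array shapes, the number of classes, and the random scores below are all made up for illustration, and the real network output would come from the model's forward pass rather than from a random generator.

```python
import numpy as np

# Hypothetical per-pixel class scores, shape (height, width, n_classes).
height, width, n_classes = 4, 6, 3
rng = np.random.default_rng(1)
scores = rng.random((height, width, n_classes))

# For every pixel, keep only the index of the highest-scoring class.
pred_mask = np.argmax(scores, axis=-1)

# pred_mask has the height and width of the input, with one class ID per pixel.
print(pred_mask.shape)
```

The result has no channel dimension left; if one is needed for display or saving, pred_mask[..., np.newaxis] adds it back.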

Right! The masks are images with only one value per pixel, and it is in channel 0. One other slightly subtle thing to point out is that we are dealing with PNG files here, not the usual 3-channel RGB images. In PNG, one of the options is to express images with 4 channels per pixel: R, G, B and A (Alpha). Alpha is used in some graphics applications, but all the images here have the A value as 255. It looks like imshow is sophisticated enough to render the 4 channel camera images without “slicing” them to eliminate that 4th channel. But with the mask images, “not so much” :nerd_face: … most likely because the class values in channel 0 are small numbers, so the rendered colour is very close to black.
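For anyone who wants to check the alpha channel or slice it away, here is a small sketch. The image is a synthetic 4-channel array standing in for a camera image, so the pixel values are invented.

```python
import numpy as np

# Synthetic stand-in for a 4-channel (RGBA) camera image, uint8.
img = np.full((480, 640, 4), 255, dtype=np.uint8)
img[:, :, :3] = 128  # arbitrary grey in the R, G, B channels

# Channel 3 is alpha; check whether every pixel is fully opaque (255).
alpha = img[:, :, 3]
alpha_is_opaque = bool((alpha == 255).all())

# Keep only the first three channels to get a plain RGB array.
rgb = img[:, :, :3]
print(rgb.shape, alpha_is_opaque)
```

If alpha is 255 everywhere, dropping it loses no information, which matches what was observed for the images here.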