Take a more careful look at the files. Both the image files and the mask files are PNG files with 4 channels. Color PNG files typically have either 3 channels (RGB) or 4 channels (RGBA), where the A is “alpha”, a per-pixel opacity value used for transparency and compositing. Our images here all have A = 0. In the mask files, the object label for each pixel is the value on channel 0 (R), and the other channels are all 0. You can see how the notebook handles that in the cell early on that displays a sample image and mask. Add some logic there to print the array shapes and a few sample channel values to see what is going on.
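
Here is a minimal sketch of the kind of inspection logic described above. It assumes the image and mask have already been loaded as NumPy arrays of shape (height, width, channels). The helper name `describe_channels` and the synthetic `fake_mask` array are hypothetical, made up for illustration; they are not part of the notebook.

```python
import numpy as np

def describe_channels(name, arr):
    """Print the shape, dtype, and per-channel values of an H x W x C array."""
    print(f"{name}: shape={arr.shape}, dtype={arr.dtype}")
    summary = {}
    for c in range(arr.shape[-1]):
        values = np.unique(arr[..., c])  # sorted distinct values on this channel
        summary[c] = values
        # Show at most the first 10 distinct values so the output stays readable.
        print(f"  channel {c}: min={values.min()}, max={values.max()}, "
              f"distinct={len(values)}, sample={values[:10].tolist()}")
    return summary

# Synthetic stand-in for a mask file (an assumption, not real data):
# labels live on channel 0 and every other channel is 0.
fake_mask = np.zeros((4, 6, 4), dtype=np.uint8)
fake_mask[:2, :, 0] = 7    # top two rows carry label 7
fake_mask[2:, :3, 0] = 12  # bottom-left block carries label 12
mask_summary = describe_channels("fake_mask", fake_mask)

# The 2-D label map is just channel 0 of the 4-channel array.
labels = fake_mask[..., 0]
print("label map shape:", labels.shape)
```

In the notebook, you would call the helper on whatever arrays the display cell already loads, once for the image and once for the mask, and compare the two printouts.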