While reviewing the code that transforms the image data, the following snippet caught my attention:
```python
def preprocessing_fn(inputs):
    [...]
    # Convert the raw image and labels to a float array
    with tf.device("/cpu:0"):
        outputs = {
            _transformed_name(_IMAGE_KEY):
                tf.map_fn(
                    _image_parser,
                    tf.squeeze(inputs[_IMAGE_KEY], axis=1),
                    dtype=tf.float32),
```
I’d like to know the reason for squeezing the first tensor axis for this particular type of data (i.e. binary images).
The image is not binary: each pixel value is in the range [0, 255] before transformation.
The data is read in batches, so each feature arrives with an extra length-1 axis; we need the underlying element of each record, hence the squeeze.
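To make the shape change concrete, here is a minimal sketch of what squeezing axis 1 does. It uses NumPy's `np.squeeze`, which has the same axis semantics as `tf.squeeze`; the byte strings and batch size are made up for illustration:

```python
import numpy as np

# A batch of 4 records; each record wraps its serialized image bytes
# in a length-1 list, so the parsed feature has shape (4, 1).
batch = np.array([[b"img0"], [b"img1"], [b"img2"], [b"img3"]])
print(batch.shape)  # (4, 1)

# Squeezing axis 1 removes the extra dimension, leaving shape (4,):
# one scalar byte string per record, which is what a per-element
# mapping function (like the one passed to tf.map_fn) iterates over.
squeezed = np.squeeze(batch, axis=1)
print(squeezed.shape)  # (4,)
print(squeezed[0])  # b'img0'
```

Without the squeeze, the map function would receive length-1 arrays rather than the individual image strings.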
Consider a sample record found in the training dataset: