I’m a bit confused about the code below and hope someone can help me figure it out:

• Why do we sample integers in the range (0, 48) from a uniform distribution to get the random `xmin` and `ymin`?
• Why do we divide `[x|y]max` and `[x|y]min` by 75?
• Can we divide `xmin` and `ymin` by 75 at lines 7 and 8, right after the image’s normalization step?
```python
def read_image_tfds(image, label):
    xmin = tf.random.uniform((), 0, 48, dtype=tf.int32)  # random left offset, 0..47
    ymin = tf.random.uniform((), 0, 48, dtype=tf.int32)  # random top offset, 0..47
    image = tf.reshape(image, (28, 28, 1,))
    image = tf.image.pad_to_bounding_box(image, ymin, xmin, 75, 75)  # place digit on a 75x75 canvas
    image = tf.cast(image, tf.float32) / 255.0  # scale pixel values to [0, 1]
    xmin = tf.cast(xmin, tf.float32)
    ymin = tf.cast(ymin, tf.float32)

    xmax = (xmin + 28) / 75
    ymax = (ymin + 28) / 75
    xmin = xmin / 75
    ymin = ymin / 75
    return image, (tf.one_hot(label, 10), [xmin, ymin, xmax, ymax])
```

I’d appreciate your help. Many thanks

Hi @jackliu333,

Thanks a lot for your response; I appreciate it.

I get why we choose 0 and 48. The 28-pixel digit has to fit inside the 75-pixel canvas, so the x and y offsets can be at most 75 - 28 = 47. Since `tf.random.uniform` samples from the half-open range `[minval, maxval)`, `maxval` has to be 48 so that the largest value we can get is 47.
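To double-check that arithmetic, here is a small sketch in plain Python (not TensorFlow, so it runs anywhere). The constant names are my own; the numbers 75 and 28 come from the code above, and `range` is used only because it is half-open in the same way as `[minval, maxval)`.

```python
CANVAS_SIZE = 75  # side of the padded canvas, from pad_to_bounding_box(..., 75, 75)
DIGIT_SIZE = 28   # side of the original digit image

# Largest offset that still keeps the digit fully inside the canvas.
max_offset = CANVAS_SIZE - DIGIT_SIZE

# A half-open range [0, maxval) needs maxval = max_offset + 1 to include max_offset.
exclusive_maxval = max_offset + 1

# range() is half-open too, so this lists every offset the sampler could return.
valid_offsets = list(range(0, exclusive_maxval))

# Every valid offset keeps the right/bottom edge within the canvas...
fits = all(offset + DIGIT_SIZE <= CANVAS_SIZE for offset in valid_offsets)
# ...whereas an offset of 48 would push the digit past the edge.
overflows_at_48 = (48 + DIGIT_SIZE) > CANVAS_SIZE

print(max_offset, exclusive_maxval, fits, overflows_at_48)
```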

Regarding the division by 75, I still don’t fully get it. As far as I understand, each pixel’s value lies in the range `[0, 255]`, so we divide it by `255` to normalize the value, and we apply this division to all pixels regardless of the image’s width and height.

Here, `[x|y][min|max]` are simply the coordinates of the randomly placed box’s corners in the XY coordinate system:

1. The reason we normalize them is that it will be easier to feed them to the NN later.
2. Since the new canvas’ X-axis and Y-axis both span `(0, 74)`, that is 75 pixels, we divide the coordinates by `75`. If there is another image with a different size like `(16, 32)` (height, width), we would divide x by `32` and y by `16`.
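To make point 2 concrete, this is roughly how I picture it, as a plain-Python sketch. `normalize_box` and its parameter names are hypothetical helpers of mine, not something from the course code.

```python
def normalize_box(xmin, ymin, box_w, box_h, canvas_w, canvas_h):
    """Return [xmin, ymin, xmax, ymax] as fractions of the canvas size.

    Hypothetical helper: x values are divided by the canvas width and
    y values by the canvas height, so every result lands in [0, 1].
    """
    xmax = xmin + box_w
    ymax = ymin + box_h
    return [xmin / canvas_w, ymin / canvas_h, xmax / canvas_w, ymax / canvas_h]


# The case from the code above: a 28x28 digit at the largest offset on a 75x75 canvas.
square_box = normalize_box(47, 47, 28, 28, 75, 75)

# A non-square canvas of height 16 and width 32: x is divided by 32, y by 16.
wide_box = normalize_box(8, 4, 8, 8, canvas_w=32, canvas_h=16)

print(square_box)
print(wide_box)
```

With the largest offset, `xmax` and `ymax` come out as exactly 1.0, which is why dividing by the canvas size (and not by 74) keeps everything in `[0, 1]`.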

Am I correct?

In addition, I would like to ask why we need to subtract 1 from the image (line 11) after the normalization step. Is it a compulsory or an optional step?

Thanks again

I think your understanding is correct.

On the minus 1 operation, it follows the standard transformation that consists of scaling + centering: first we scale the image by dividing by 127.5, then center it by subtracting 1. The resulting range is `[-1, 1]`, which differs from the `[0, 1]` you get by just dividing by 255, the more common choice.
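As a quick illustration of the two conventions, here is a minimal plain-Python sketch. The function names are made up for this example; only the constants 255, 127.5 and 1 come from the discussion above.

```python
def scale_to_unit(pixel):
    """Scaling only: map a pixel value in [0, 255] to [0, 1]."""
    return pixel / 255.0


def scale_and_center(pixel):
    """Scaling + centering: map a pixel value in [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0


# Compare both mappings at the darkest, middle and brightest pixel values.
for pixel in (0, 127.5, 255):
    print(pixel, scale_to_unit(pixel), scale_and_center(pixel))
```

Either mapping is a valid normalization; what matters is that the same one is used consistently for training and inference.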
