U-Net Crop function


I’m struggling with the implementation of the crop function. I used the CenterCrop transform with (new_shape[2], new_shape[3]), i.e. height and width, as the size and applied it to the image input, but I get an error message mentioning ‘builtin_function_or_method’. Normally we can apply it to a batch of tensor images.
The rest of my notebook seems correct, and this is blocking me.
Could you help me, please?

Thanks in advance

Hi Math16!
I hope that you are doing well.
Well, it’s quite strange. I tried what you did both in the Coursera lab and in my local Jupyter notebook (same file). It works in my local notebook but not in the Coursera lab, so I suspected that a version difference was causing the problem and checked the Torchvision version in both environments.
Here is what I found:
Coursera lab’s Torchvision version: 0.5.0
My local Torchvision version: 0.13.0
To check whether this made a difference, I referred to the documentation for both versions. The documentation for 0.13.0 can be found directly, but the one for 0.5.0 cannot; it is available under the corresponding PyTorch version (1.4.0).

So this is the difference between the two:



Basically, in 0.5.0 CenterCrop accepts only a PIL image as input (our test functions pass a tensor, not a PIL image), whereas in 0.13.0 it accepts both tensors and PIL images. So even though your approach is correct, please do the cropping by simply slicing the image (slicing by index) to complete the assignment.
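For anyone who wants to repeat the version check in their own environment, here is a minimal sketch using only the Python standard library (Python 3.8+). The helper name installed_version is made up for illustration and is not part of the assignment or of Torchvision.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed.

    Hypothetical helper for illustration only.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions that matter for this thread (None means "not installed").
for name in ("torch", "torchvision"):
    print(name, installed_version(name))
```

Running this in both the Coursera lab and a local notebook should show whether the two environments use different Torchvision versions.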

Have a great day!

Hi Nithin,

Could you guide me on how to slice the image by index?

I tried
cropped_image = transforms.CenterCrop(new_shape)(transforms.ToPILImage()(image))

but got this error:
pic should be 2/3 dimensional. Got 4 dimensions.

Hi mc04xhf!
Sorry, I cannot give you the code; you have to figure it out yourself :disguised_face:. But, as I said, I will help you understand the slicing method by giving you the general syntax.

cropped_image = image[:, :, start_row:end_row, start_col:end_col]

These images have four dimensions (batch size, channels, height, width). To crop them, I only need to change the height and width, and I want the same crop applied to every channel of every image in the batch. So I put a colon “:” in the first index, which means take all the images, and likewise for the channels. Then I keep only the region I want by slicing the height and width. With [:,:,2:5,2:5], for example, each cropped image keeps only 3 rows and 3 columns of the original image matrix (indices 2 to 4, since the end index is excluded). The syntax depends entirely on how the image's dimensions are laid out. If I have only 3 dimensions, in the order (height, width, channels), i.e. a single image, then the syntax would be
cropped_image = image[start_row:end_row, start_col:end_col, :]
So you have to decide according to the dimensions and shape.
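To make the syntax concrete, here is a small sketch on a made-up tensor. The shapes and the fixed indices 2:5 are assumptions chosen only to mirror the example above; this is not the assignment's crop function, which still has to work out the start and end indices from new_shape.

```python
import torch

# Hypothetical batch: 2 images, 3 channels, 6 rows (height), 6 columns (width).
image = torch.arange(2 * 3 * 6 * 6, dtype=torch.float32).reshape(2, 3, 6, 6)

# Keep every image and every channel; take rows 2..4 and columns 2..4.
cropped_image = image[:, :, 2:5, 2:5]
print(cropped_image.shape)  # torch.Size([2, 3, 3, 3])

# A single image stored as (height, width, channels): slice the first two axes.
single = image[0].permute(1, 2, 0)    # shape (6, 6, 3)
cropped_single = single[2:5, 2:5, :]
print(cropped_single.shape)  # torch.Size([3, 3, 3])
```

In both cases the colon leaves an axis untouched, and only the height and width axes are narrowed.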

Also, this is the mistake in your approach:
the input you are passing has four dimensions (a whole batch), whereas transforms.ToPILImage, which you need before transforms.CenterCrop in this version, expects a single image with two or three dimensions.

Hope this helps!
Have a great day.

Thanks Nithin.

I finally figured it out with your help!

You are welcome. Good to know :blush:. Enjoy Learning!