C1_m3_lab duplicate transforms?

If I add print("This is the first transform") to the __getitem__ method of the FlowerDataset class, add print("this is the second transform") to the __getitem__ method of the SubsetWithTransform class, and then run a cell like the following:

train_dataloader = DataLoader(train_dataset, batch_size=1)
for x, y in train_dataloader:
    break

(added in a new cell before the Robust Datasets cell)

I get an error, and both print statements are executed. So my question is: is the transform duplicated, meaning that it is applied twice? I think the code fails because of ToTensor, since I get the following error: TypeError: pic should be PIL Image or ndarray. Got <class 'torch.Tensor'>

I am taking a look at your question now. But as I am refreshing my memory of this notebook, there are a couple of things to note:

This is one of the “Code Example” notebooks that they give us here. It is not a graded assignment. We’ll see this pattern in most modules in all the PyTorch courses. The code is complete as given to us and it works, meaning there is no change we need to make. The point is they are showing us examples of how these various torch constructs work and how to use them. So we read through everything and understand the code, running each section as we go and watching what it does. Any errors that happen are therefore caused by some change that you made to the code. It’s fine to try experiments to test your understanding, but it’s not fair to blame any errors on the course material. Now you need to debug your changes and understand why they don’t work.

You can get a clean copy and start over again by just deleting the file and then doing a browser refresh. Or if you’re going to do experiments, maybe the better strategy is to first create a separate copy of the notebook for your experiments preserving the original for comparison. You can do that by using File -> Duplicate Notebook.

Ok, now I will investigate your actual question about the layering of the transforms.

I understand that the notebook code may work as provided, but I believe it is completely natural for a learner to question even working code in order to understand it properly. For me, learning does not mean simply accepting code as-is, but trying to understand how and why it works.

I was not blaming the course material, your code, or anyone else. I asked my question because I had doubts about my own understanding and wanted to verify it. I never said the code was wrong or that there was any issue with the original material. I was only asking whether my interpretation was correct.

If my wording suggested otherwise, that was not my intention. I asked the question in good faith as part of the learning process. If you feel I was assigning blame, then I apologize, because that was not what I meant.

Thanks for clarifying your intention here. Sorry if I jumped to the wrong conclusion. It was the way you phrased the above quote that gave me the impression you were saying something was broken about the notebook.

Let me try to answer your actual question now, but it will take me a few more minutes.

Update: it is dinner time in my timezone, so it may be a few hours before I can get back to this question. Sorry for the delay.

No worries at all. Thank you for taking the time to look into my question. I appreciate it.

I’ve looked at the code and run some experiments, and I do not see any cases in which the transforms are applied twice on the same dataset. The problem may be that the OOP (Object Oriented Programming) is getting pretty thick here. Note that we define several different classes, e.g. FlowerDataset, SubsetWithTransform and RobustDataset, but the thing to remember is that first you have to create an instance of one of those classes. It’s also key to recognize that in all of those cases the transform argument is a “keyword parameter” with the default value of None, which means that it is optional. For example, here’s how I added my version of your print statement in the first case, the definition of the FlowerDataset class:

        # Check if a transform is provided.
        if self.transform is not None:
            # Apply the transform to the image.
            print(f"Apply transform in FlowerDataset")
            image = self.transform(image)

The obvious point being that it only prints the message when there actually is a defined transform. Note that the first time we instantiate that class, it does not include a transform:

# Initialize the dataset object, providing the path to the data.
dataset = FlowerDataset(path_dataset)

Maybe the first thing to check is that you didn’t write that print statement outside the control of the relevant “if” statement there, which would give misleading results. Although I still doubt there are any cases of “nested” calls to __getitem__ for the different classes.
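To make the point about the optional transform concrete, here is a minimal sketch in plain Python. TinyDataset and its transform_calls counter are hypothetical stand-ins that I made up for illustration; this is not the notebook's FlowerDataset, only the same "transform defaults to None" pattern.

```python
class TinyDataset:
    """Hypothetical stand-in for a dataset with an optional transform."""

    def __init__(self, items, transform=None):
        self.items = items
        self.transform = transform   # None means "no transform was given"
        self.transform_calls = 0     # counts how often the transform ran

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        item = self.items[idx]
        # Check if a transform is provided.
        if self.transform is not None:
            # Only reached when a transform was passed to the constructor.
            self.transform_calls += 1
            item = self.transform(item)
        return item


# Instantiated without a transform: the "if" body never runs.
plain = TinyDataset([1, 2, 3])
plain_item = plain[0]

# Instantiated with a transform: the "if" body runs once per item access.
doubled = TinyDataset([1, 2, 3], transform=lambda x: x * 2)
doubled_item = doubled[0]
```

A print statement placed inside that "if" block would fire only for the second instance. Placed outside it, the message would appear for both, which is the misleading result mentioned above.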

If what I’m saying here doesn’t seem to make sense to you or explain anything, then maybe the problem is that you’ve already rewritten some critical parts of the given code. So my first suggestion would be to start over from scratch and really look carefully again at the original code. You can save your work by downloading your current version. Then do the trick of deleting or renaming the notebook with your changes and then do a browser “refresh” and that should give you the original “as written” version of the notebook.

Regards,
Paul

Thanks for the explanations. I wanted to see what the original state of the file was, so I deleted the file (I have downloaded the modified version for my own reference). After refreshing the browser, nothing happens. Also, “Restore Original Version” does not work either; it just does nothing when I click it. I used the instructions provided in this link

Hello @Sina_Tutunchi,

Do you see the “Original Version” status at the top left corner of the notebook?

I click save and I see the following error:

This is the first time I have seen this error, but I have also never clicked the “Save” button without having the notebook open. Also, it’s funny that I can’t see the “Save” button on my lab interface now, so I can’t try to reproduce the error.

Can you open the notebook first, then click “Save”, then try “Restore Original Version”?

Yes it worked. This is what I did:

  1. I created an empty notebook and renamed it to the name of the original notebook.

  2. The Save button then worked, and I could restore the original version.

Thanks for the help

Thanks for sharing your solution.

@paulinpaloalto Thanks for your explanations earlier. I reset the notebook and I looked at the code. Please correct me if I am wrong:

  1. train_dataset, val_dataset, test_dataset = split_dataset(dataset_transformed). So train_dataset is created from dataset_transformed.

  2. dataset_transformed = FlowerDataset(path_dataset, transform=transform). So dataset_transformed already has a transform.

  3. train_dataset = SubsetWithTransform(train_dataset, transform=augmentation_transform) introduces a second transform.

Again, thanks for your time.

Hello, @Sina_Tutunchi, and @paulinpaloalto,

The original version of the notebook does not raise an error unless we add a new code cell that gets an item from the SubsetWithTransform instance, which processes the data at path_dataset through the following pipeline:

Obviously, each image file is processed by both transform and augmentation_transform which is the combination of the following three lists of transformations:

resulting in the following error:

This happens because ToTensor runs twice (in the first and the third list), and the second call fails because its input is already a Tensor produced by the first one.
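The stacking described above can be sketched in plain Python without torch installed. Everything here is a hypothetical stand-in invented for illustration: FakeImage and FakeTensor play the roles of a PIL image and a torch.Tensor, to_tensor_like only imitates the "accepts images, rejects tensors" behaviour of ToTensor, and BaseDataset and SubsetWrapper are simplified versions of the pattern, not the notebook's actual classes.

```python
class FakeImage:
    """Hypothetical stand-in for a PIL image."""
    def __init__(self, pixels):
        self.pixels = pixels


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor."""
    def __init__(self, values):
        self.values = values


def to_tensor_like(pic):
    # Imitates the relevant behaviour only: image in, tensor out,
    # and a TypeError for anything that is not an image.
    if not isinstance(pic, FakeImage):
        raise TypeError(f"pic should be FakeImage. Got {type(pic).__name__}")
    return FakeTensor([p / 255 for p in pic.pixels])


class BaseDataset:
    def __init__(self, images, transform=None):
        self.images = images
        self.transform = transform

    def __getitem__(self, idx):
        image = self.images[idx]
        if self.transform is not None:
            image = self.transform(image)      # first transform
        return image


class SubsetWrapper:
    def __init__(self, dataset, indices, transform=None):
        self.dataset = dataset
        self.indices = indices
        self.transform = transform

    def __getitem__(self, idx):
        # The wrapped dataset applies its own transform here.
        item = self.dataset[self.indices[idx]]
        if self.transform is not None:
            item = self.transform(item)        # second transform, on the result
        return item


images = [FakeImage([0, 255])]

# Both layers convert to a tensor: the second conversion gets a tensor as input.
stacked = SubsetWrapper(BaseDataset(images, transform=to_tensor_like),
                        [0], transform=to_tensor_like)
try:
    stacked[0]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Only the outer layer converts: the conversion happens once and succeeds.
single = SubsetWrapper(BaseDataset(images), [0], transform=to_tensor_like)
result = single[0]
```

In the first case the outer transform receives the output of the inner one, so the second conversion raises a TypeError, which mirrors the error quoted earlier in the thread. The second case is just one possible arrangement that avoids the double conversion; I am not claiming it is what the notebook intends.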

Raymond to the rescue! Sorry, my analysis was too shallow to catch those points.