The SubsetWithTransform class is wrapped around train_dataset, val_dataset and test_dataset. These subsets were created by splitting dataset_transformed, a dataset on which the baseline transforms have already been applied. As a result, the baseline transforms are applied a second time on each subset. Accessing an item, for example test_dataset[0], then fails with "TypeError: pic should be PIL Image or ndarray. Got <class 'torch.Tensor'>", because ToTensor is being applied to an object that the earlier transform has already converted to a Tensor.
Hi, thanks for raising this and welcome to the community. Just checked out the lab again and you are actually right. With this code, I see three options:
- Keeping the base dataset raw and then moving all transforms into the wrappers.
- Keeping `dataset_transformed` as-is, but not including `ToTensor()` (or any PIL-only ops) in the wrapper transforms; i.e. only using tensor-safe transforms (e.g., just `Normalize` if needed).
- Making `SubsetWithTransform` skip `ToTensor` if it receives a tensor.
But I think the most important thing is to note that the notebook is for learning purposes. It’s good that it can stimulate discussions like this.
This particular issue can easily be handled in a production setting.
Thanks for your response on how this can be resolved. I was also thinking of keeping the base dataset raw.
I also understand that the notebook is for learning purposes, which is why I wanted to point out this issue so that others are aware of it.