Tensor dimension error when training on MNIST dataset

Hey guys,

I am trying to adapt the course code to the MNIST dataset, i.e., I want to first train on MNIST and then add context embeddings so that the model can generate some “hybrid” numbers. However, I got the following error during the first stage (i.e., training on MNIST):

“RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 4 but got size 7 for tensor number 1 in the list.”

Here are my notebook and the utility package:

Untitled.ipynb (67.9 KB)
diffusion_utilities.py (9.7 KB)

I am guessing it is because the original dataset contains RGB images while MNIST only has grayscale images, so the ContextUnet class ends up with incompatible in_channels and out_channels; in the lecture code, both are set to 3.

So, I made the following changes:

  1. Changed in_channels to 1 when creating nn_model, as follows [Line 116 in the uploaded notebook]:
    nn_model = ContextUnet(in_channels=1, n_feat=n_feat, n_cfeat=n_cfeat, height=height).to(device)

  2. Changed the output channels to 1 in the last convolutional layer of ContextUnet, as follows [Line 113 in the uploaded notebook]:

self.out = nn.Sequential(
            nn.Conv2d(2 * n_feat, n_feat,3, 1, 1), # reduce number of feature maps   #in_channels, out_channels, kernel_size, stride=1, padding=0
            nn.GroupNorm(8, n_feat), # normalize
            nn.Conv2d(n_feat, self.in_channels, 1, 1, 1), # map to same number of channels as input

However, I still got a similar error… I must have messed up somewhere.

Just figured out that I forgot to change the hyperparameter height from 16 to 28… LOL

No need to modify anything inside the ContextUnet class.
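
For anyone who hits the same error, here is a rough sketch of why height matters. It only does the shape arithmetic in plain Python, and the layer layout in the comments is my assumption about the course's ContextUnet (two down blocks that each halve the image, an AvgPool2d(4), and a first up-sampling ConvTranspose2d whose kernel and stride are height // 4). The function name trace_spatial_sizes is made up for this illustration.

```python
# Shape bookkeeping only; no PyTorch needed.
# Assumed layer layout (my reading of the course ContextUnet, not verified here):
#   down1, down2 : each ends in MaxPool2d(2)   -> halves the spatial size
#   to_vec       : AvgPool2d(4)                -> floor((s - 4) / 4) + 1
#   up0          : ConvTranspose2d(kernel=height // 4, stride=height // 4)
#                  -> (s - 1) * k + k

def trace_spatial_sizes(image_size, height):
    """Return (up0_output_size, down2_skip_size) for a square input image."""
    down1 = image_size // 2            # after the first down block
    down2 = down1 // 2                 # after the second down block (skip connection)
    hidden = (down2 - 4) // 4 + 1      # after AvgPool2d(4)
    k = height // 4                    # kernel size and stride of up0
    up0 = (hidden - 1) * k + k         # transposed-convolution output size
    return up0, down2


for height in (16, 28):
    up0, skip = trace_spatial_sizes(image_size=28, height=height)
    status = "OK" if up0 == skip else "mismatch when concatenating"
    print(f"height={height}: up0 output {up0}x{up0}, skip {skip}x{skip} -> {status}")
```

Under these assumptions, height=16 with 28x28 images gives a 4x4 tensor from up0 and a 7x7 skip tensor, which would explain the "Expected size 4 but got size 7" message when the two are concatenated along dimension 1. With height=28 both are 7x7.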
