C4W2 Assignment: using the MNIST CNN lab

Hey,

I am on the second assignment and thought it would be cool to use the CNN architecture from the C4_W2_Lab_4_FashionMNIST_CNNAutoEncoder lab. The following changes to the code made sense to me:

  • Wherever there is a “2D”, change it to “3D”, since the new data is in colour.
  • Change the input dimensions to (32,32,3), again because the images are now in colour and are 32x32 pixels.
  • Change the kernel size everywhere to (3,3,1), because the images are 3D.

If I call model.summary() on this I get: ValueError: Input 0 of layer "conv3d" is incompatible with the layer: expected min_ndim=5, found ndim=4. Full shape received: (None, 32, 32, 3)

This makes me think the dimensions are going wrong somewhere.

I thought maybe the issue was in MaxPooling3D and UpSampling3D, so I changed their pool_size and size parameters to (2,2,1).

This still results in an error, albeit a slightly different one: ValueError: Input 0 of layer "conv3d_2" is incompatible with the layer: expected min_ndim=5, found ndim=4. Full shape received: (None, 32, 32, 3)

So now the error has moved from conv3d to conv3d_2 and I’m not sure how to make use of that information.

If anybody could help me figure out where this is going wrong it would be greatly appreciated!

Those are just the default names assigned to two of the Conv3D layer instances, right? So the clue is that the input to the first Conv3D instance seems to have the correct shape, but the input to the conv3d_2 instance does not. You need to look more closely at the output shape of the preceding layer.

FWIW, I have found that assigning names to layers helps with debugging and with understanding network architectures in general, especially when there are repeating structures or the network is deep. For example, if you give each layer an ordinal name, duplicates and copy-and-paste errors are easy to spot in the model summary or graph.
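To illustrate the naming tip, here is a minimal sketch of a small encoder where every layer gets an explicit name. The names (enc_conv_1, enc_pool_1, and so on) and the filter counts are made up for the example and are not taken from the lab.

```python
import tensorflow as tf

# Hypothetical names and filter counts, chosen only for illustration.
inputs = tf.keras.Input(shape=(32, 32, 3), name="input_image")
x = tf.keras.layers.Conv2D(16, (3, 3), activation="relu", padding="same",
                           name="enc_conv_1")(inputs)
x = tf.keras.layers.MaxPooling2D((2, 2), name="enc_pool_1")(x)
x = tf.keras.layers.Conv2D(32, (3, 3), activation="relu", padding="same",
                           name="enc_conv_2")(x)
x = tf.keras.layers.MaxPooling2D((2, 2), name="enc_pool_2")(x)
model = tf.keras.Model(inputs, x, name="named_encoder")

# The chosen names appear in the summary and in any error message,
# instead of auto-generated ones such as conv2d_2.
layer_names = [layer.name for layer in model.layers]
model.summary()
```

With names like these, an error that mentions enc_conv_2 points straight at one line of the model definition.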

A more fundamental question: are you sure you need this conversion at all? Aren’t all the inputs static, independent images, i.e. not video? I think the convolution should still occur over height and width only, not across channels.
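A quick way to check this is to feed one batch of colour-image-shaped data to both layer types. The sketch below assumes a dummy all-zeros batch and an arbitrary filter count of 8. Conv2D accepts the 4D input and its kernel already spans all 3 colour channels, while Conv3D wants a 5D input (batch plus three spatial dimensions plus channels) and rejects it.

```python
import tensorflow as tf

# Dummy batch shaped like one 32x32 RGB image: (batch, height, width, channels).
batch = tf.zeros((1, 32, 32, 3))

# Conv2D slides over height and width; each filter covers every input channel.
conv2d = tf.keras.layers.Conv2D(filters=8, kernel_size=(3, 3), padding="same")
out = conv2d(batch)
kernel_shape = tuple(conv2d.kernel.shape)  # (kernel_h, kernel_w, in_channels, filters)
print("Conv2D output shape:", tuple(out.shape))
print("Conv2D kernel shape:", kernel_shape)

# Conv3D expects (batch, dim1, dim2, dim3, channels), so a 4D batch is rejected.
conv3d = tf.keras.layers.Conv3D(filters=8, kernel_size=(3, 3, 1), padding="same")
try:
    conv3d(batch)
    conv3d_error = None
except ValueError as err:
    conv3d_error = str(err)
    print("Conv3D rejected the input:", conv3d_error)
```

So the colour channels are handled by the depth of the Conv2D kernel, not by an extra convolution dimension.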


Thanks for the tips! Indeed, Conv3D is not needed; I see that now. It just took some adjustments to the input dims and filters. Really appreciate the help, @ai_curious!
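For anyone landing here later, a rough sketch of what "adjusting input dims and filters" could look like with plain 2D layers. This is not the lab's exact architecture: the function name build_autoencoder, the filter counts, and the sigmoid output (which assumes pixel values scaled to [0, 1]) are my own assumptions.

```python
import tensorflow as tf

def build_autoencoder(input_shape=(32, 32, 3)):
    """Hypothetical Conv2D autoencoder for 32x32 colour images."""
    L = tf.keras.layers
    inputs = tf.keras.Input(shape=input_shape)

    # Encoder: two conv + pool stages, 32x32 -> 16x16 -> 8x8.
    x = L.Conv2D(64, (3, 3), activation="relu", padding="same")(inputs)
    x = L.MaxPooling2D((2, 2))(x)
    x = L.Conv2D(128, (3, 3), activation="relu", padding="same")(x)
    x = L.MaxPooling2D((2, 2))(x)

    # Bottleneck at 8x8 resolution.
    x = L.Conv2D(256, (3, 3), activation="relu", padding="same")(x)

    # Decoder: two conv + upsample stages, 8x8 -> 16x16 -> 32x32.
    x = L.Conv2D(128, (3, 3), activation="relu", padding="same")(x)
    x = L.UpSampling2D((2, 2))(x)
    x = L.Conv2D(64, (3, 3), activation="relu", padding="same")(x)
    x = L.UpSampling2D((2, 2))(x)

    # Final layer has one filter per colour channel so the output matches the input.
    outputs = L.Conv2D(input_shape[-1], (3, 3), activation="sigmoid",
                       padding="same")(x)
    return tf.keras.Model(inputs, outputs)

autoencoder = build_autoencoder()
autoencoder.summary()
```

The only colour-specific parts are the input shape (32, 32, 3) and the 3 filters in the last layer; everything else stays 2D.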
