Training model to identify digits - issue with incompatible shape

David_Pruitt · September 9, 2022, 10:42pm

Hey all,

I’ve been working on training a simple model to identify digits - similar to the week 2 lab. However, I am using the MNIST dataset that can be imported with tensorflow.keras.datasets:

from tensorflow.keras.datasets import mnist
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

The resulting X_train dataset has a shape of (60000, 28, 28) because the images are 28x28 pixels. I then “un-roll” each image into a 1D array, so the new shape is (60000, 784). I do the same “un-rolling” procedure with the X_test testing data.

After doing this, I specify my model, compile it, and fit it to the training data. All of this seems to go fine.

However, when I try to predict a digit from the test set, like so:

prediction_p = model.predict(X_test2[0])

I run into the following issue:

WARNING:tensorflow:Model was constructed with shape (32, 784) for input KerasTensor(type_spec=TensorSpec(shape=(32, 784), dtype=tf.uint8, name='dense_18_input'), name='dense_18_input', description="created by layer 'dense_18_input'"), but it was called on an input with incompatible shape (None,).

It then proceeds to give me a further error.

As far as I can tell, all of the shapes of my training and testing data are what they should be, so I’m not sure what to do to fix this issue.

Here is my complete code so that anyone can reproduce the error I am receiving:

import math
import numpy
import matplotlib.pyplot as plot
import tensorflow as tf

from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.activations import linear, relu, sigmoid

from itertools import chain


#Load the dataset
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

#Show a few of the digits, just for the sake of a sanity check
for i in range(9):
    plot.subplot(330+1+i)
    plot.imshow(X_train[i], cmap=plot.get_cmap("gray"))
plot.show()

#Unroll the 2d arrays into 1d arrays
X_train_temp = []
for i in range(0, 60000):
    X_train_temp.append(numpy.array(list(chain.from_iterable(X_train[i]))))
X_train2 = numpy.array(X_train_temp)

Y_train2 = numpy.array([Y_train])
Y_train2 = Y_train2.T

X_test_temp = []
for i in range(0, 10000):
    X_test_temp.append(numpy.array(list(chain.from_iterable(X_test[i]))))
X_test2 = numpy.array(X_test_temp)

#Create a neural network model
model = Sequential([
    Dense(25, activation="relu"),
    Dense(15, activation="relu"),
    Dense(10, activation="linear"),
], name = "digit_id_model")

#Specify the loss function and also 
#indicate to use the Adam's optimizer for the learning rate
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(0.001)
)

#Fit the model
model.fit(X_train2, Y_train2, epochs=40)

#Now let's try making some predictions
for i in range(0, 10):
    prediction_p = model.predict(X_test2[i])
    yhat = numpy.argmax(prediction_p)
    print(f"Label: {Y_test[i]}, Prediction: {yhat}")

Any help would be appreciated! Thanks!

TMosh · September 9, 2022, 11:02pm

So you loaded X_test from Keras, but you’re predicting with X_test2.

Why?

David_Pruitt · September 9, 2022, 11:27pm

Because I reshaped both X_train and X_test, so the reshaped versions are X_train2 and X_test2. I trained on X_train2 and am testing with X_test2.

TMosh · September 9, 2022, 11:46pm

Try printing out the shapes of all of your training and test data sets.

David_Pruitt · September 10, 2022, 12:00am

Already done that. As I said previously, the shapes are as I expect them to be.

rmwkwok · September 10, 2022, 12:06am

Hello David, I get a different error, but the source of the problem should be the same. Here is the error I see:

This is because you provided a dataset of shape (60000, 784) for training, but a dataset (even if it is just one sample) of shape (784,) for prediction. For the prediction to work, you need to pass a dataset of shape (1, 784) or (any number of samples, 784). For example, predictions are made on all test samples by,

[Screenshot: prediction on the full test set, presumably model.predict(X_test2)]

If you prefer to predict on just one sample, you might do this

[Screenshot: prediction on a single sample, model.predict(X_test2[[0]])]

The additional square brackets make it return a 2D array of shape (1, 784).
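
To see the difference in shapes without training anything, here is a minimal NumPy-only sketch. The zero-filled array is just a stand-in for the flattened X_test2 (an assumption for illustration, not the real data), and the slice and np.newaxis variants are alternatives that were not mentioned above but should give the same (1, 784) result.

```python
import numpy as np

# Stand-in for the flattened test set: 10000 samples, 784 pixels each.
X_test2 = np.zeros((10000, 784), dtype=np.uint8)

single = X_test2[0]                    # plain integer index drops the sample axis
batch_list = X_test2[[0]]              # list index keeps the sample axis
batch_slice = X_test2[0:1]             # a slice keeps it as well
batch_newaxis = X_test2[0][np.newaxis, :]  # or add the axis back explicitly

print(single.shape)         # 1D: one sample with no batch dimension
print(batch_list.shape)     # 2D: a batch containing one sample
print(batch_slice.shape)
print(batch_newaxis.shape)
```

Any of the three 2D forms can be passed to model.predict, since it expects the first axis to index samples.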

Lastly, if I may make a small suggestion, you might “unroll” the images with a single line:
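
The one-liner itself was not preserved in this copy of the thread. As a sketch of one common way to do it, assuming NumPy's reshape is what was meant, the following flattens every image in a single call and checks that the result matches the loop-based approach from the original post. The small random array is a hypothetical stand-in for X_train.

```python
import numpy as np
from itertools import chain

# Hypothetical stand-in for X_train: 5 random 28x28 uint8 "images".
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Single-line unroll: keep the sample axis, flatten everything else.
X_flat = X.reshape(X.shape[0], -1)

# Loop-based unroll, as in the original post, for comparison.
X_loop = np.array([np.array(list(chain.from_iterable(img))) for img in X])

print(X_flat.shape)
print(np.array_equal(X_flat, X_loop))
```

With the real data this would be X_train2 = X_train.reshape(X_train.shape[0], -1), and likewise for X_test.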

Nice work, keep trying!
Raymond

David_Pruitt · September 10, 2022, 2:07am

Awesome, thanks for that input. Since my X_test2 shape was already (10000, 784), I didn’t even think about having to reshape an individual element such as X_test2[0] into something like X_test2[[0]]. Having done that, it does work.

Thanks!

rmwkwok · September 10, 2022, 2:08am

You are welcome David!
