Need advice on reshaping the input layer for a TensorFlow transformer model

Hi.
I am applying a pre-trained Google model, TFViTModel, to my image dataset. It is a binary classification problem, e.g. is the image a cat or not?

Here is a snippet of my code:

# Flipping and rotating images
data_augmentation = keras.Sequential(
    [layers.RandomFlip("horizontal"), layers.RandomRotation(0.1),]
)
# Apply data augmentation
inputs = keras.Input(shape = train_images.shape[1:])
x = data_augmentation(inputs)  

# Importing the base model
base_model = TFViTModel.from_pretrained('google/vit-base-patch16-224-in21k')

# Defining the layers

inputs = keras.Input(shape = train_images.shape[1:])
x = data_augmentation(inputs)

inputs.shape # or x.shape
out: TensorShape([None, 224, 224, 3])

x = base_model(x, training=False)
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(x)

My error is

ValueError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_28616\1966532.py in <cell line: 2>()
1 # The model
----> 2 x = base_model(x, training=False)
3

ValueError: Input 0 of layer projection is incompatible with the layer: expected axis -1 of input shape to have value 3 but received input with shape (None, 224, 3, 224)


I have tried to force the shape of the Input layer like this:

inputs = keras.Input(shape = (3, 224, 224))
inputs.shape
out: TensorShape([None, 3, 224, 224])

and got another ValueError:

ValueError: Layer dense_3 expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor 'Placeholder:0' shape=(None, 197, 768) dtype=float32>, <tf.Tensor 'Placeholder_1:0' shape=(None, 768) dtype=float32>]

Can somebody advise how to reshape the input into the required form?
Thank you.

P.S. I am following this guide: https://www.philschmid.de/image-classification-huggingface-transformers-keras

You can’t pass the image array directly to the model. You first need to pass the images through the feature extractor, which generates the inputs the model expects.
The model takes its inputs in a specific format; judging by your second attempt, that means channels-first images of shape (None, 3, 224, 224). Please see this link.

If you need to move dimensions around, see the Keras Permute layer.
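
As a rough illustration of the input format discussed above, here is a minimal NumPy sketch of the kind of conversion a feature extractor performs on images that are already 224x224. The helper name to_pixel_values is hypothetical, and the rescaling to [0, 1] and the mean/std of 0.5 are assumptions about this checkpoint's preprocessing; the checkpoint's own preprocessor config is the authority, and the real feature extractor should be preferred in practice.

```python
import numpy as np

def to_pixel_values(images, mean=0.5, std=0.5):
    """Turn uint8 images of shape (N, H, W, 3) into float32 of shape (N, 3, H, W).

    mean and std are assumed values, not read from the checkpoint.
    """
    x = images.astype(np.float32) / 255.0   # rescale pixel values to [0, 1]
    x = (x - mean) / std                    # normalize each channel
    # Move the channel axis from last to first (NHWC -> NCHW).
    # Inside a Keras model, layers.Permute((3, 1, 2)) does the same move;
    # its indices start at 1 and exclude the batch axis.
    return np.transpose(x, (0, 3, 1, 2))

# Stand-in batch of two random RGB images in channels-last layout.
batch = np.random.randint(0, 256, size=(2, 224, 224, 3), dtype=np.uint8)
pixel_values = to_pixel_values(batch)
print(pixel_values.shape)  # (2, 3, 224, 224)
```

Note that once the input shape is right, the base model returns two tensors (the ones listed in the second error, with shapes (None, 197, 768) and (None, 768)), so one of them has to be selected before the Dense layer; the pooled (None, 768) output is probably the simplest choice for a classification head.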