How to build a hybrid model with LSTM and convolutional layers? ValueError: Input 0 of layer "conv1d_1" is incompatible with the layer

I have this error with my model architecture for a sentiment analysis problem (binary classification).
It is a text corpus with an average review length of 373 words, so each review consists of several lengthy sentences, and the model with the two LSTM layers is overfitting to the data: the validation loss fails to decrease steadily.

After reading academic articles, I discovered that adding a 1D Convolutional layer in combination with a pooling layer can help mitigate the problem by selecting the most important features (Basiri et al., 2021; Xu et al., 2021).
So I am trying to implement this suggestion.

So my code is:


import tensorflow as tf
from tensorflow.keras import regularizers

# Hyperparameters
EMBEDDING_DIM = 50
MAXLEN = 500  # 1000, 1400
VOCAB_SIZE = 33713

DENSE1_DIM = 64
DENSE2_DIM = 32

LSTM1_DIM = 32
LSTM2_DIM = 16

WD = 0.001

FILTERS = 64
KERNEL_SIZE = 5

# Model Definition 
model_lstm = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE + 1, EMBEDDING_DIM, input_length=MAXLEN,
                              weights=[EMBEDDINGS_MATRIX], trainable=False),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        LSTM1_DIM, dropout=0.5, kernel_regularizer=regularizers.l2(WD),
        return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        LSTM2_DIM, dropout=0.5, kernel_regularizer=regularizers.l2(WD))),
    tf.keras.layers.Dense(DENSE2_DIM, activation='relu'),
    tf.keras.layers.Conv1D(FILTERS, KERNEL_SIZE, activation='relu'),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Set the training parameters
model_lstm.compile(loss='binary_crossentropy',
                   optimizer=tf.keras.optimizers.Adam(), 
                   metrics=[tf.keras.metrics.BinaryAccuracy()])

# Print the model summary
model_lstm.summary()


num_epochs = 35
history_lstm = model_lstm.fit(sent_tok_train, labels_train, epochs=num_epochs,
                              validation_data=(sent_tok_val, labels_val), verbose=2)

....

File ~\.conda\envs\tf-gpu\lib\site-packages\keras\engine\input_spec.py:228, in assert_input_compatibility(input_spec, inputs, layer_name)
    226   ndim = x.shape.rank
    227   if ndim is not None and ndim < spec.min_ndim:
--> 228     raise ValueError(f'Input {input_index} of layer "{layer_name}" '
    229                      'is incompatible with the layer: '
    230                      f'expected min_ndim={spec.min_ndim}, '
    231                      f'found ndim={ndim}. '
    232                      f'Full shape received: {tuple(shape)}')
    233 # Check dtype.
    234 if spec.dtype is not None:

ValueError: Input 0 of layer "conv1d_1" is incompatible with the layer: expected min_ndim=3, found ndim=2. Full shape received: (None, 32)

How can I fix this error, please? Thank you.

Please read these pages keeping input shapes in mind:

  1. GlobalAveragePooling1D
  2. Conv1D

Consider Conv1D, for instance. It expects an input with 3 dimensions (batch, steps, channels). Looking at your architecture, the layer feeding it only produces 2 dimensions.

Input 0 of layer "conv1d_7" is incompatible with the layer: expected min_ndim=3, found ndim=2. Full shape received: (None, 32).
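
For illustration, here is a minimal sketch (the shapes are hypothetical, not taken from your model) of what Conv1D accepts and what triggers this error:

import tensorflow as tf

conv = tf.keras.layers.Conv1D(filters=64, kernel_size=5, activation='relu')

ok = tf.random.normal((8, 500, 50))   # rank 3: (batch, steps, channels)
print(conv(ok).shape)                 # (8, 496, 64) -- works

bad = tf.random.normal((8, 32))       # rank 2: (batch, features)
# conv(bad)  # uncommenting this reproduces "expected min_ndim=3, found ndim=2"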


Yes, thank you. That is what my question is about: I do not understand what I should change in the code to get the expected min_ndim=3 instead of ndim=2. Do I need to use the input_shape= argument to set the dimensions?

@bluetail How about calling model.summary() as you add layers? That way, the input and output shape of each layer will become clear.
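
For example, a rough sketch of that workflow (the vocabulary and layer sizes here are just placeholders):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10000, 50, input_length=500),
])
model.summary()  # Embedding output: (None, 500, 50) -> rank 3

model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)))
model.summary()  # without return_sequences the output is (None, 64) -> rank 2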


The Conv1D layer expects an input with 3 dimensions, while you are supplying a shape with only 2.
You can use a simple trick of reshaping the input to the Conv1D layer as (x, y, 1), where x and y are the original dimensions and the trailing 1 adds the 3rd dimension without altering the total number of elements in the array, as in the sketch below.
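
In Keras, one way to apply that trick (a sketch, assuming the (None, 32) tensor from the question) is a Reshape layer:

import tensorflow as tf

# Turn the rank-2 tensor (None, 32) into rank 3 (None, 32, 1);
# 32 * 1 keeps the total number of elements unchanged.
reshape = tf.keras.layers.Reshape((32, 1))
conv = tf.keras.layers.Conv1D(filters=64, kernel_size=5, activation='relu')

x = tf.random.normal((8, 32))
print(conv(reshape(x)).shape)  # (8, 28, 64)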
Hope that helps


I have a summary like this before my Conv1D layer.

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_5 (Embedding)     (None, 500, 50)           1685700   
                                                                 
 bidirectional_9 (Bidirectio  (None, 500, 64)          21248     
 nal)                                                            
                                                                 
 bidirectional_10 (Bidirecti  (None, 32)               10368     
 onal)                                                           
                                                                 
 dense_6 (Dense)             (None, 32)                1056      
                                                                 
=================================================================
Total params: 1,718,372
Trainable params: 32,672
Non-trainable params: 1,685,700
_________________________________________________________________

Can I try anything as input_shape? Can I try input_shape = (None, 16, 128) or input_shape = (16, 64, 1), for example?
Is it just a matter of parameter tuning from there to get a better fit?
Thank you very much.

input_shape should reflect the shape of actual data that’s fed into the model. Don’t use random values for this.
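
For instance (a hypothetical sketch): if the tokenized training data has shape (num_samples, 500), the model's input_shape should be (500,), not arbitrary values:

import numpy as np
import tensorflow as tf

sent_tok_train = np.random.randint(0, 10000, size=(1000, 500))  # stand-in data

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10000, 50, input_shape=(500,)),  # matches the data
])
print(model.output_shape)  # (None, 500, 50)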

Can you please explain more about this? I have access to the Andrew Ng Coursera courses and also to the resources of the University of Edinburgh, if you could refer me to a course or a book about this.
I still do not know how to progress from (None, 32) to a 3D (…, …, …) input for the Conv1D layer.
Thank you very much.

No worries.

Please follow these steps:

  1. Create an input layer.
  2. Add an embedding layer.
  3. Create a model and observe the output shape keeping in mind the number of dimensions.

Can you continue to expand this model beyond these two layers by adding a Conv1D layer? A sketch of the steps follows.
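
A rough sketch of those steps (the vocabulary size and dimensions are placeholders, not taken from the question):

import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(500,))        # 1. input layer
x = tf.keras.layers.Embedding(10000, 50)(inputs)    # 2. embedding layer
model = tf.keras.Model(inputs, x)
model.summary()  # 3. output shape (None, 500, 50) -> rank 3

# Expanding with Conv1D: its input here is already rank 3, so this works.
x = tf.keras.layers.Conv1D(64, 5, activation='relu')(x)
model = tf.keras.Model(inputs, x)
model.summary()  # (None, 496, 64)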

Do you know why, in my example, I get an output of 3 dimensions after my first bidirectional layer, but then my second bidirectional layer outputs 2 dimensions?

thank you.

It’s because you don’t have return_sequences=True in the LSTM layer inside the 2nd bidirectional layer.
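
For instance, a sketch of the model with that change (hyperparameters as in the question; the pretrained embedding matrix and the intermediate Dense layer are omitted here for brevity):

import tensorflow as tf
from tensorflow.keras import regularizers

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(33714, 50, input_length=500),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        32, dropout=0.5, kernel_regularizer=regularizers.l2(0.001),
        return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(
        16, dropout=0.5, kernel_regularizer=regularizers.l2(0.001),
        return_sequences=True)),  # keeps the time axis: (None, 500, 32)
    tf.keras.layers.Conv1D(64, 5, activation='relu'),  # now gets rank-3 input
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.summary()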
