I am getting an error with my model architecture for a sentiment analysis problem (binary classification).
The text corpus has an average review length of 373 words, so each review consists of several lengthy sentences. The model with two LSTM layers is overfitting the data and fails to steadily decrease the validation loss.
After reading academic articles, I discovered that adding a 1D Convolutional layer in combination with a pooling layer can help mitigate the problem by selecting the most important features (Basiri et al., 2021; Xu et al., 2021).
So I am trying to implement this suggestion. My code is:
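# Imports assumed by the code below
import tensorflow as tf
from tensorflow.keras import regularizers

# Hyperparameters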
EMBEDDING_DIM = 50
MAXLEN = 500 #1000, 1400
VOCAB_SIZE = 33713
DENSE1_DIM = 64
DENSE2_DIM = 32
LSTM1_DIM = 32
LSTM2_DIM = 16
WD = 0.001
FILTERS = 64
KERNEL_SIZE = 5
# Model Definition
model_lstm = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE + 1, EMBEDDING_DIM, input_length=MAXLEN, weights=[EMBEDDINGS_MATRIX], trainable=False),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(LSTM1_DIM, dropout=0.5, kernel_regularizer=regularizers.l2(WD), return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(LSTM2_DIM, dropout=0.5, kernel_regularizer=regularizers.l2(WD))),
    tf.keras.layers.Dense(DENSE2_DIM, activation='relu'),
    tf.keras.layers.Conv1D(FILTERS, KERNEL_SIZE, activation='relu'),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
# Set the training parameters
model_lstm.compile(loss='binary_crossentropy',
                   optimizer=tf.keras.optimizers.Adam(),
                   metrics=[tf.keras.metrics.BinaryAccuracy()])
# Print the model summary
model_lstm.summary()
num_epochs = 35
history_lstm = model_lstm.fit(sent_tok_train, labels_train, epochs=num_epochs, validation_data=(sent_tok_val, labels_val), verbose=2)
....
File ~\.conda\envs\tf-gpu\lib\site-packages\keras\engine\input_spec.py:228, in assert_input_compatibility(input_spec, inputs, layer_name)
226 ndim = x.shape.rank
227 if ndim is not None and ndim < spec.min_ndim:
--> 228 raise ValueError(f'Input {input_index} of layer "{layer_name}" '
229 'is incompatible with the layer: '
230 f'expected min_ndim={spec.min_ndim}, '
231 f'found ndim={ndim}. '
232 f'Full shape received: {tuple(shape)}')
233 # Check dtype.
234 if spec.dtype is not None:
ValueError: Input 0 of layer "conv1d_1" is incompatible with the layer: expected min_ndim=3, found ndim=2. Full shape received: (None, 32)
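The shape in the message can be traced by hand. The sketch below is plain Python, not Keras: the helpers `embedding_shape`, `bilstm_shape`, `dense_shape`, and `conv1d_accepts` are hypothetical names made up for illustration, and they only mimic the layers' output-shape rules under the default settings (for example `merge_mode="concat"` for `Bidirectional`).

```python
# Hypothetical shape-tracing helpers (not part of Keras).
# None stands for the batch size.

def embedding_shape(maxlen, embedding_dim):
    # Embedding maps (batch, steps) token ids to (batch, steps, embedding_dim).
    return (None, maxlen, embedding_dim)

def bilstm_shape(in_shape, units, return_sequences=False):
    # Bidirectional with concatenation doubles the feature size.
    # With return_sequences=False only the last step is kept,
    # so the time axis is dropped.
    batch, steps, _ = in_shape
    if return_sequences:
        return (batch, steps, 2 * units)
    return (batch, 2 * units)

def dense_shape(in_shape, units):
    # Dense only changes the last axis.
    return in_shape[:-1] + (units,)

def conv1d_accepts(in_shape):
    # Conv1D needs at least (batch, steps, channels), i.e. ndim >= 3.
    return len(in_shape) >= 3

x = embedding_shape(500, 50)                    # (None, 500, 50)
x = bilstm_shape(x, 32, return_sequences=True)  # (None, 500, 64)
x = bilstm_shape(x, 16)                         # (None, 32): time axis is gone
x = dense_shape(x, 32)                          # (None, 32)
print(x, conv1d_accepts(x))                     # (None, 32) False
```

So the second LSTM, which does not set `return_sequences=True`, appears to be where the sequence dimension disappears; the `Dense` layer then passes a 2D tensor on to `Conv1D`.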
How can I fix this error, please? Thank you.
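One possible direction, as a minimal sketch rather than a definitive fix: keep the time axis alive until the convolution by setting `return_sequences=True` on the second LSTM, and move the `Dense` layer after the pooling. The name `model_sketch` and the small vocabulary size are placeholders, the embedding is randomly initialised instead of using `EMBEDDINGS_MATRIX`, and the regularizers and recurrent dropout are left out to keep the example short. Whether this ordering helps with the overfitting is untested here.

```python
import tensorflow as tf

# Placeholder sizes for illustration only (assumption, not the original values).
VOCAB_SIZE, EMBEDDING_DIM, MAXLEN = 1000, 50, 500

model_sketch = tf.keras.Sequential([
    tf.keras.Input(shape=(MAXLEN,)),
    # Assumption: a plain trainable embedding stands in for the pretrained matrix.
    tf.keras.layers.Embedding(VOCAB_SIZE + 1, EMBEDDING_DIM),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True)),
    # return_sequences=True keeps the output 3D: (batch, 500, 32).
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16, return_sequences=True)),
    # Conv1D now receives (batch, steps, channels); default 'valid' padding
    # with kernel size 5 shortens 500 steps to 496.
    tf.keras.layers.Conv1D(64, 5, activation='relu'),
    tf.keras.layers.Dropout(0.1),
    # Pooling collapses the time axis to (batch, 64).
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

print(model_sketch.output_shape)
```

Another ordering that would also satisfy the 3D requirement is to place `Conv1D` (and a non-global pooling layer) directly after the embedding and before the LSTMs, since the embedding output is already `(batch, steps, channels)`.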