How to extract body of a transformer like models and fine tune with that body on different data

Arjun_Reddy · June 3, 2023, 12:32pm

In a BERT-like transformer model (I am not talking about BERT itself in this thread), there are two training objectives, masked language modeling and next-sentence prediction, right? BERT also supports different input shapes. I am building a model with two training objectives on top of a base model: denoising time-series data and a triplet loss on time series. After pre-training, I want to take the base model body and fine-tune it on different data with a different shape in TensorFlow. How is this written in TensorFlow at a low level? By that I mean: extract the body after pre-training, attach new inputs and new outputs to it, and fine-tune the whole model on a dataset with a different shape.

Base model architecture:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Input: 3000 time steps, 7 channels
inputs = layers.Input((3000, 7))
x = layers.Conv1D(32, 3, activation=tf.nn.leaky_relu, padding='same')(inputs)
x = layers.MaxPooling1D(2)(x)
x = layers.MultiHeadAttention(num_heads=4, key_dim=2)(x, x)

x = layers.BatchNormalization()(x)

x = layers.Conv1D(64, 3, activation=tf.nn.leaky_relu, padding='same')(x)
x = layers.MaxPooling1D(2)(x)
x = layers.MultiHeadAttention(num_heads=4, key_dim=2)(x, x)

x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(512, activation='relu')(x)
x = layers.Dense(256, activation='relu')(x)
outputs = layers.Dense(128, activation='relu')(x)

# The body ends at the 128-unit embedding layer
base_model = keras.Model(inputs=inputs, outputs=outputs)

Pre-training Base model:

# Denoising branch: reconstruct a (3000, 7) series from the body's embedding
input_denoising = keras.Input(shape=(x_unlabelled.shape[1], x_unlabelled.shape[2]))
x = base_model(input_denoising)
x = layers.Dense(3000 * 7, activation=tf.nn.leaky_relu)(x)
output_denoising = layers.Reshape((3000, 7))(x)

# Three-input branch: all three inputs pass through the same body (shared weights)
input_1 = keras.Input(shape=(x_unlabelled.shape[1], x_unlabelled.shape[2]))
input_2 = keras.Input(shape=(x_unlabelled.shape[1], x_unlabelled.shape[2]))
input_3 = keras.Input(shape=(x_unlabelled.shape[1], x_unlabelled.shape[2]))

output_1 = base_model(input_1)
output_2 = base_model(input_2)
output_3 = base_model(input_3)

# The concatenated embeddings feed a 2-class softmax head
concatenated = layers.concatenate([output_1, output_2, output_3])
output = layers.Dense(2, activation='softmax')(concatenated)

combined_model = keras.Model(
    inputs=[input_denoising, input_1, input_2, input_3],
    outputs=[output_denoising, output],
)
combined_model.compile(optimizer='adam', loss=['mse', 'categorical_crossentropy'])

combined_model.summary()

Now I want to remove all of these inputs and outputs, add new inputs and outputs that match the new data, and fine-tune on that dataset. How can I do this?

canxkoz · September 15, 2023, 7:28pm

Hello @Arjun_Reddy
To fine-tune your base model on a new dataset with different input and output shapes, you can follow these steps:

1. Extract the base model body.
2. Create new input and output layers.
3. Connect the new input and output layers to the base model.
4. Compile and train the new model.
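For the first step, it may help to know that a Keras model nested inside another model is still reachable as a layer of the wrapper, so the pre-trained body can be retrieved by name. Below is a minimal sketch under stated assumptions: the tiny body, its name "body", the layer sizes, and the file name body.weights.h5 are all hypothetical stand-ins, not the architecture from the question.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Tiny stand-in for the pre-trained body (hypothetical sizes).
body_in = keras.Input((None, 7))
h = layers.Conv1D(8, 3, padding="same", activation="relu")(body_in)
h = layers.GlobalAveragePooling1D()(h)
body = keras.Model(body_in, layers.Dense(16, activation="relu")(h), name="body")

# Pre-training wrapper: the body plus a small denoising head.
pre_in = keras.Input((32, 7))
pre_out = layers.Reshape((32, 7))(layers.Dense(32 * 7)(body(pre_in)))
pretrain_model = keras.Model(pre_in, pre_out)

# The nested body is a layer of the wrapper; this returns the same object,
# so any weights learned through the wrapper live in it.
extracted = pretrain_model.get_layer("body")

# Save only the body's weights so they can be loaded into an identical body later.
weights_path = os.path.join(tempfile.mkdtemp(), "body.weights.h5")
extracted.save_weights(weights_path)

# The extracted body can be called on its own.
embedding = extracted(np.zeros((2, 50, 7), dtype="float32"))
```

In the code from the question, the body is already held in the base_model variable and its weights are updated in place while combined_model trains, so base_model can be reused directly. Looking it up with get_layer is mainly useful when only the wrapper model was saved and reloaded.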

Here’s an example of how to do this in TensorFlow:

import tensorflow as tf
from tensorflow.keras import layers, models

# Assuming you have already trained the base_model

# Create new input and output layers
new_input_shape = (new_input_length, new_input_channels)
new_output_units = new_num_classes

new_inputs = layers.Input(shape=new_input_shape)
new_outputs = base_model(new_inputs)

# Add new layers as needed
new_outputs = layers.Dense(new_output_units, activation='softmax')(new_outputs)

# Create a new model with the new input and output layers
new_model = models.Model(inputs=new_inputs, outputs=new_outputs)

# Compile the new model
new_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

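# Train the new model on the new dataset.
# new_dataset is assumed to be a batched tf.data.Dataset yielding (inputs, one-hot labels);
# batch_size must not be passed to fit() for that input type.
new_model.fit(new_dataset, epochs=epochs)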

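In this example, replace new_input_shape and new_output_units with the input shape and number of output units for your new dataset. Keep in mind that base_model was built with a fixed Input((3000, 7)), so Keras will generally reject inputs of a different shape; in particular, the channel count must stay at 7 because the first Conv1D kernel was trained for it. You can also add more layers between the base model and the new output layer if needed.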
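One way to handle data with a different shape is sketched below, under two assumptions that are not in the original thread: the body is defined with Input((None, 7)) so that the sequence length is left open, and a 1x1 Conv1D (named "channel_adapter" here, a hypothetical name) projects the new channel count onto the 7 channels the body expects. The body is a much smaller stand-in for the one in the question, and the new data dimensions (500 steps, 3 channels, 4 classes) are made up for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in body: the length axis is None, so any sequence length is accepted.
# Only the channel count (7) is fixed by the first Conv1D kernel.
body_in = keras.Input((None, 7))
h = layers.Conv1D(8, 3, padding="same", activation="relu")(body_in)
h = layers.MaxPooling1D(2)(h)
h = layers.GlobalAveragePooling1D()(h)
body = keras.Model(body_in, layers.Dense(16, activation="relu")(h), name="body")

# New data (hypothetical): 500 time steps, 3 channels, 4 classes.
new_in = keras.Input((500, 3))
# A 1x1 convolution maps 3 channels to the 7 the body was built for.
adapted = layers.Conv1D(7, 1, name="channel_adapter")(new_in)
new_out = layers.Dense(4, activation="softmax")(body(adapted))
new_model = keras.Model(new_in, new_out)

# Integer labels are used here, hence the sparse loss.
new_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
x = np.random.rand(8, 500, 3).astype("float32")
y = np.random.randint(0, 4, size=(8,))
new_model.fit(x, y, epochs=1, batch_size=4, verbose=0)

predictions = new_model.predict(x, verbose=0)
```

The adapter layer starts from random weights, so it is trained together with the new head. If the body was already built with a fixed length, one option is to rebuild the same architecture with a None length and copy the trained weights across, since none of the layers in the question depend on the sequence length.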

Remember that if you want to freeze the base model layers during fine-tuning, you can set the trainable attribute of the base model to False before connecting it to the new input and output layers:

base_model.trainable = False
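A common pattern is to fine-tune in two phases: train only the new head with the body frozen, then unfreeze the body and continue with a much smaller learning rate. The sketch below illustrates this with a small stand-in body; the sizes, learning rates, and random data are assumptions chosen only to keep the example short. The model is recompiled after each change to trainable so that the change is picked up by training.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the pre-trained body (hypothetical sizes).
body_in = keras.Input((None, 7))
h = layers.Conv1D(8, 3, padding="same", activation="relu")(body_in)
h = layers.BatchNormalization()(h)
h = layers.GlobalAveragePooling1D()(h)
body = keras.Model(body_in, h, name="body")

inputs = keras.Input((100, 7))
# training=False keeps BatchNormalization in inference mode, even after unfreezing.
features = body(inputs, training=False)
outputs = layers.Dense(3, activation="softmax")(features)
model = keras.Model(inputs, outputs)

x = np.random.rand(16, 100, 7).astype("float32")
y = np.random.randint(0, 3, size=(16,))

# Phase 1: freeze the body and train only the new head.
body.trainable = False
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy")
frozen_count = len(model.trainable_weights)
model.fit(x, y, epochs=1, batch_size=8, verbose=0)

# Phase 2: unfreeze the body and recompile with a smaller learning rate.
body.trainable = True
model.compile(optimizer=keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy")
unfrozen_count = len(model.trainable_weights)
model.fit(x, y, epochs=1, batch_size=8, verbose=0)
```

Comparing len(model.trainable_weights) before and after unfreezing is a quick way to confirm that the freeze actually applied to the nested body.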

If you have follow-up questions, please feel free to reply and I will do my best to answer them.
Regards,
Can Koz
