Incorporating equations governing input-output pairs in neural networks

Hi all. I am working on a neural network problem. The number of input features is 8 and the number of outputs is 25; the large number of outputs is what makes the problem complicated. I have 6999 training examples, so the dimensions of X_train and Y_train are 6999 × 8 and 6999 × 25. I built the following neural network:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

Model

n2 = 8  # number of input features

model_1 = Sequential([
    tf.keras.Input(shape=(n2,)),
    Dense(16, activation='relu'),
    Dense(64, activation='relu'),
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
    Dense(25, activation='relu')
], name="Model_1")

Compile (Loss)

model_1.compile(loss=tf.keras.losses.MeanSquaredError(),
                optimizer=tf.keras.optimizers.Adam(0.001))

Fit

model_1.fit(X_train, Y_train, epochs=5000)

However, I know that certain equations hold between my inputs and outputs. Written in terms of the training data, they are:

Y_train[:, 1] + Y_train[:, 4] = X_train[:, 0]
Y_train[:, 2] + Y_train[:, 3] + Y_train[:, 7] + Y_train[:, 17] + Y_train[:, 22] = X_train[:, 1]
Y_train[:, 4] + Y_train[:, 5] + Y_train[:, 7] + Y_train[:, 8] + Y_train[:, 9] + Y_train[:, 19] = X_train[:, 2]
Y_train[:, 3] + Y_train[:, 10] + Y_train[:, 11] + Y_train[:, 13] = X_train[:, 3]
Y_train[:, 14] = X_train[:, 4]
Y_train[:, 5] + Y_train[:, 7] + Y_train[:, 10] + Y_train[:, 15] + Y_train[:, 17] + Y_train[:, 20] + Y_train[:, 22] = X_train[:, 5]
Y_train[:, 11] + Y_train[:, 16] + Y_train[:, 19] + Y_train[:, 21] = X_train[:, 6]
Y_train[:, 5] + Y_train[:, 7] + Y_train[:, 8] + Y_train[:, 20] + Y_train[:, 21] + Y_train[:, 22] + Y_train[:, 23] = X_train[:, 7]
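
(For context, these are all linear constraints, so they can be written compactly as Y Aᵀ = X for a fixed 0/1 matrix. Below is only a sketch of how I sanity-check that the training data actually satisfies them; the matrix name A and the check itself are my own additions, not part of the model.)

import numpy as np

# Encode the 8 constraints as a 0/1 matrix A of shape (8, 25), so that the
# equations above read Y_train @ A.T == X_train, row by row.
A = np.zeros((8, 25))
A[0, [1, 4]] = 1
A[1, [2, 3, 7, 17, 22]] = 1
A[2, [4, 5, 7, 8, 9, 19]] = 1
A[3, [3, 10, 11, 13]] = 1
A[4, [14]] = 1
A[5, [5, 7, 10, 15, 17, 20, 22]] = 1
A[6, [11, 16, 19, 21]] = 1
A[7, [5, 7, 8, 20, 21, 22, 23]] = 1

# Sanity check on the training data: the residuals should be numerically zero.
residual = Y_train @ A.T - X_train   # shape (6999, 8)
print(np.abs(residual).max())        # expect a value close to 0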

I tried to use the idea of a physics-informed neural network and include these equations in my loss function, so I wrote the following code:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

Loss function

def custom_loss(X_loss):
    # X_loss is the full X_train here, so the row indices only line up with
    # y_pred when a single batch contains all training examples in order.
    def loss(y_true, y_pred):
        q1 = y_pred[:, 1] + y_pred[:, 4] - X_loss[:, 0]
        q2 = y_pred[:, 2] + y_pred[:, 3] + y_pred[:, 7] + y_pred[:, 17] + y_pred[:, 22] - X_loss[:, 1]
        q3 = y_pred[:, 4] + y_pred[:, 5] + y_pred[:, 7] + y_pred[:, 8] + y_pred[:, 9] + y_pred[:, 19] - X_loss[:, 2]
        q4 = y_pred[:, 3] + y_pred[:, 10] + y_pred[:, 11] + y_pred[:, 13] - X_loss[:, 3]
        q5 = y_pred[:, 14] - X_loss[:, 4]
        q6 = y_pred[:, 5] + y_pred[:, 7] + y_pred[:, 10] + y_pred[:, 15] + y_pred[:, 17] + y_pred[:, 20] + y_pred[:, 22] - X_loss[:, 5]
        q7 = y_pred[:, 11] + y_pred[:, 16] + y_pred[:, 19] + y_pred[:, 21] - X_loss[:, 6]
        q8 = y_pred[:, 5] + y_pred[:, 7] + y_pred[:, 8] + y_pred[:, 20] + y_pred[:, 21] + y_pred[:, 22] + y_pred[:, 23] - X_loss[:, 7]

        # Data-fit term (MSE) plus one squared-residual penalty per constraint
        loss_t = tf.reduce_mean(tf.square(y_true - y_pred)) + \
                 tf.reduce_mean(tf.square(q1)) + \
                 tf.reduce_mean(tf.square(q2)) + tf.reduce_mean(tf.square(q3)) + \
                 tf.reduce_mean(tf.square(q4)) + tf.reduce_mean(tf.square(q5)) + \
                 tf.reduce_mean(tf.square(q6)) + tf.reduce_mean(tf.square(q7)) + \
                 tf.reduce_mean(tf.square(q8))
        return loss_t
    return loss

Model

model = Sequential([
    tf.keras.Input(shape=(n2,)),
    Dense(16, activation='relu'),
    Dense(64, activation='relu'),
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
    Dense(25, activation='relu')
], name="Model")

Compile the model with the custom loss function

model.compile(loss=custom_loss(X_train), optimizer=tf.keras.optimizers.Adam(0.001))

Train the model

model.fit(X_train, Y_train, batch_size=6999, epochs=5000)

Of course, I still need to rework the code so that it can handle batch sizes other than the number of training examples, but even with batch_size equal to the full training set, my first model (before adding the equations to the loss function) performs much better than the second one (after adding them). In fact, the results from the second model are quite bad. I expected the model's performance to improve by forcing the outputs to satisfy those equations. For example, regarding equation number 5, I always want the 15th element of the output to equal the 5th element of the input. Do you think the problem is with my implementation (I am working on rewriting it another way), or does the idea simply not work here because the system of equations does not have a unique solution (there are more unknowns than equations)? Is there any other way I can force the output of the model to satisfy those equations?
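
(For reference, one batch-safe variant I am experimenting with is to append the 8 input columns to the targets, so the loss always sees the X rows that match the current batch. This is only a sketch: the Y_train_aug name, the equal weighting of the penalty term, and the batch size of 64 are my own assumptions.)

import numpy as np
import tensorflow as tf

# Stack the inputs onto the targets so every batch carries its own X rows
# inside y_true. Y_train_aug has shape (6999, 25 + 8).
Y_train_aug = np.concatenate([Y_train, X_train], axis=1).astype("float32")

# A is the same (8, 25) 0/1 constraint matrix used in the check above.
A_tf = tf.constant(A, dtype=tf.float32)

def constrained_loss(y_true, y_pred):
    y_obs = y_true[:, :25]   # the real targets for this batch
    x_obs = y_true[:, 25:]   # the matching input rows for this batch
    data_term = tf.reduce_mean(tf.square(y_obs - y_pred))
    # Constraint residuals for this batch: y_pred @ A^T should equal x_obs
    residual = tf.matmul(y_pred, A_tf, transpose_b=True) - x_obs
    penalty = tf.reduce_mean(tf.square(residual))
    return data_term + penalty

model.compile(loss=constrained_loss, optimizer=tf.keras.optimizers.Adam(0.001))
model.fit(X_train, Y_train_aug, batch_size=64, epochs=5000)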

Hi Nasim_Deljouyi,

This kind of issue is sometimes resolved by adding a memory component to a neural network. You can find an example of this here. You may also search for "adding a memory component to a neural network" and see if you can find some inspiration. I know they have been working on this at the Allen Institute in Seattle. Good luck!

Thank you so much @reinoudbosch

Best,
Nasim