Loss values do not change

Hi Learners. I have developed a simple deep-learning model for a simple training example generated from y = 2*x - 1. However, most of the time when I run the model, the loss does not change and stays stuck at a specific value. This happens in 6 or 7 out of 10 runs, even when I change the number of layers and neurons, the number of epochs, and the learning rate. Does anyone know the reason behind it?

My entire code is as follows:

import numpy as np
x = np.array([[-1.0, 0.0 , 1.0, 2.0, 3.0, 4.0, 5.0]])
y = np.array([[-3.0, -1.0, 1.0, 3.0, 5.0, 7.0, 9.0]])

x = np.reshape(x,(7,1))
y = np.reshape(y,(7,1))

from tensorflow import keras
from keras import Model
from keras.models import Sequential
from keras.layers import Input, Dense

input_1 = Input(shape=(1,))
dense_1 = Dense(4, activation='relu')(input_1)
dense_2 = Dense(2, activation='relu')(dense_1)
dense_3 = Dense(1, activation='relu')(dense_2)
model_2 = Model(inputs= input_1, outputs = dense_3)

model_2.compile(loss = keras.losses.MeanSquaredError(),
optimizer = keras.optimizers.Adam(0.001))

model_2.fit(x,y, epochs=500)
print(model_2.predict(np.array([[10.0]])))

One issue is that an NN with three hidden layers may be totally puzzled by how simple the problem you’ve asked it to solve is.

The cost functions for NNs have local minima, and you may not always find the lowest-cost solution.
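To make the "stuck loss" symptom concrete: with a ReLU on the output layer, if the output unit's pre-activation happens to initialize negative over the whole input range, the model predicts a constant 0, the gradient through the ReLU is 0 everywhere, and the loss can never move. A small numpy sketch, with hypothetical weight values chosen to trigger the dead unit:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

x = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x - 1.0

# Hypothetical unlucky initialization: every pre-activation is negative.
w, b = -0.5, -1.0
z = w * x + b
pred = relu(z)

# ReLU's gradient is 0 wherever the pre-activation is negative, so no
# sample contributes any gradient signal and the weights never update.
grad_mask = (z > 0).astype(float)
print(pred)       # all zeros -> predictions frozen
print(grad_mask)  # all zeros -> loss stuck at its initial value
```

That matches the "6 or 7 times out of 10" behavior: whether the loss moves at all depends on the random initialization of the output unit.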

All you need for this example is linear regression, no hidden layers, and no ReLU.
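For what it's worth, plain least squares recovers the line exactly. A minimal numpy sketch of the no-hidden-layer solution:

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0, 9.0])

# Least-squares fit of y = w*x + b via the normal equations.
A = np.column_stack([x, np.ones_like(x)])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(w, b)  # recovers w = 2.0, b = -1.0 (up to floating-point error)
```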

Perhaps try a more complicated data set (maybe a nice parabola), and only use one hidden layer. Give that a try. A parabola is non-linear, and that’s what hidden layers are good at solving.
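A rough numpy check of why a parabola is the better test case: the best-fitting straight line through y = x² still leaves a large error, so a non-linearity (a hidden layer) is genuinely needed there.

```python
import numpy as np

# A non-linear data set: y = x^2 on a small grid.
x = np.linspace(-3.0, 3.0, 25)
y = x ** 2

# Best least-squares line through the parabola.
A = np.column_stack([x, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
line_mse = np.mean((A @ coef - y) ** 2)
print(line_mse)  # far from zero -> no line fits this data
```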


For straightforward relationships, even the parabola, it’s unlikely you’ll need anything close to 500 epochs, either. Or even 50. My recommendation would be to dial that way back. Early experiments could use 5 or 10. They will complete quickly, then, if the model seems to be learning, you can add more and see what extra benefit they provide. You can also read about implementing your own callbacks that can stop training when accuracy reaches a certain threshold.
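As an illustration of the stop-early idea (without Keras), here is a plain-numpy gradient-descent loop on the same data that quits as soon as the loss crosses a threshold; the learning rate and threshold are arbitrary choices for the sketch, and a Keras callback would do the equivalent by setting `self.model.stop_training = True`.

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0, 9.0])

# Full-batch gradient descent on y = w*x + b with an early-stop check.
w, b, lr = 0.0, 0.0, 0.05
for epoch in range(500):
    pred = w * x + b
    loss = np.mean((pred - y) ** 2)
    if loss < 1e-4:  # early-stop threshold
        break
    grad_w = 2.0 * np.mean((pred - y) * x)
    grad_b = 2.0 * np.mean(pred - y)
    w -= lr * grad_w
    b -= lr * grad_b
print(epoch, w, b)  # stops long before the 500-epoch cap
```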

Yes, you are right; I didn't notice that I was using 'relu' instead of 'linear', and fixing that solved the problem. I also removed the hidden layers. However, I still get different results each time I run. I know this is due to the stochastic nature of the algorithm, but out of 10 runs, only 2 ended up with the result I expected within 1500 epochs; the other 8 finished with a high loss value, like over 15.

This is an easy problem and I know what output to expect, but for more complex problems I would not be able to tell the outputs apart, even after running many, many times. I would greatly appreciate any advice on this.

Also the following is the updated code:

import numpy as np
x = np.array([[-1.0, 0.0 , 1.0, 2.0, 3.0, 4.0, 5.0]])
y = np.array([[-3.0, -1.0, 1.0, 3.0, 5.0, 7.0, 9.0]])

x = np.reshape(x,(7,1))
y = np.reshape(y,(7,1))

from tensorflow import keras
from keras import Model
from keras.models import Sequential
from keras.layers import Input, Dense

input_1 = Input(shape=(1,))

dense_1 = Dense(1, activation='linear')(input_1)

dense_2 = Dense(2, activation='relu')(dense_1)

dense_3 = Dense(1, activation='linear')(input_1)
model_2 = Model(inputs= input_1, outputs = dense_3)

model_2.compile(loss = keras.losses.MeanSquaredError(),
optimizer = keras.optimizers.Adam(0.001))

model_2.fit(x,y, epochs=1500)
print(model_2.predict(np.array([[10.0]])))

Thanks for your response. Actually, the problem I had was that the loss value was not changing at all, from the first epoch to the last. I changed the activation and removed the hidden layers, but the model also needed more epochs to learn. I increased the number of epochs to 1500, and still, most of the time training finishes with a high loss value. I don't know why; when I used to code machine-learning algorithms entirely by myself using only numpy, I did not face such issues.

I'm looking at your code right now. I'll have more in a few minutes.


So a couple things I notice, without trying to get it to run yet. First, you import Sequential, but don’t use it to define the model. Second, you have input_1 as the input to dense_3, so the other Dense layers aren’t connected to the output of the model.

Yes, I commented out the dense layers. And does it matter if I import Sequential but don't use it? I actually wanted to use it to rewrite the model with the Sequential API.

Your data set is still a simple linear function.

Thanks a lot. I changed the optimizer to ‘sgd’ and it worked.

from tensorflow import keras
from keras import Model
from keras.models import Sequential
from keras.layers import Input, Dense

import numpy as np


x = np.array([-1.0, 0.0 , 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0, 9.0])

model = Sequential(
    [ 
        Dense(units = 1, input_shape=[1]),
    ])

model.compile(loss = 'mean_squared_error',
              optimizer = 'sgd',
              metrics=["mse"])

model.fit(x,y, epochs=15)
print(model.predict(np.array([[7.0]])))

That should be enough to get a proof of concept running, though as @TMosh points out, it's not a good use of the horsepower of a neural net.

Really appreciate your help @ai_curious and @TMosh.

Best,
Nasim