Model.fit() related question

I understand that the initial starting weights and biases of neurons in the same layer should be randomized. However, when I set the initial weights and biases to be the same and ran model.fit() afterwards, I still got different weights after training. Does model.fit() automatically change the initial weights, or do I misunderstand the concept of gradient descent?

Hi there,

model.fit() triggers the training process, meaning the model parameters (the weights of your neural net) are adjusted and improved according to your defined optimization problem and the hyperparameters you set (like learning rate, dropout rate, etc.). Gradient descent can be the algorithm used to find your optimum. (You might find this article interesting on that note: https://hubs.la/Q01d6Bbc0)

To answer your question: calling the fitting function triggers the training, and during training the weights are updated and improved. After training, you should also test your model on an independent, new data set.
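As a minimal sketch of that workflow (toy model, toy data, and placeholder hyperparameter values, just for illustration):

import tensorflow as tf

# Toy model, just to illustrate the workflow
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Hyperparameters such as the learning rate are fixed before training
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss="mse")

x = tf.random.normal((32, 4))
y = tf.random.normal((32, 1))

# fit() runs the optimization loop; the weights change during this call
model.fit(x, y, epochs=5, verbose=0)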

Best regards
Christian

PS: An example where you can see this exactly (not with a neural net, but with a Gaussian process fit) is here, under "Fitting the model": Google Colab

It depends on where you added the code that forces the initial weights to all be the same. TensorFlow/Keras automatically initializes the weights to random values when the model is created.
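A quick way to see this (using a plain Dense layer as an example) is to print the weights right after the model is created, before any call to fit():

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(2),
])

# The kernel already holds random values (Glorot uniform by default);
# the bias vector starts at zero.
print(model.layers[0].get_weights())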

In addition:
@Hoang_Bui, I am not sure whether you are asking only about:

  • when the weights are changed
  • or also about the repeatability of your experiment (if so, feel free to provide a little more information)

This thread might also interest you, since it is likewise about initialisation. Feel free to check it out:

Best regards
Christian

What did you do to try to achieve that?

The following is what I would do:

import numpy as np
import tensorflow as tf
import random as python_random

def set_seed(seed=100):
    # Seed all three random number generators that can affect
    # weight initialization and data shuffling
    np.random.seed(seed)
    python_random.seed(seed)
    tf.random.set_seed(seed)

and I call set_seed() every time right before I construct a model.
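For example (the model below is just a placeholder):

set_seed(100)  # fix all RNGs before the layers (and their weights) are built
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
# Calling set_seed(100) again and rebuilding the model gives
# identical initial weights.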

Raymond

Hi, I might not have explained it clearly. From my observation in one of the labs, if we print the initial weights and biases before training the model, we get some arbitrary numbers for every neuron. So what I did was this:

  1. I set W (the np arrays of the neurons in a single layer) to all zeros, as well as the biases for those neurons. For this, I simply used set_weights() to set both W and the biases to zero.
  2. I then ran model.compile() and model.fit().

What you would expect from training the model with manual gradient descent is that all of the weights and biases for those neurons would end up identical, since they start from the same starting point. However, I got different weights and biases for each neuron, and I am not sure whether TensorFlow's automatic training overrides the initial weights and biases that I set.
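A minimal sketch of those two steps (toy shapes, and a sigmoid activation so the symmetry effect is visible; this is illustrative, not my actual lab code) would be:

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(3, activation="sigmoid"),
    tf.keras.layers.Dense(1),
])

# Step 1: force W and b of every layer to all zeros
for layer in model.layers:
    layer.set_weights([np.zeros(w.shape) for w in layer.get_weights()])

# Step 2: compile and fit
model.compile(optimizer="sgd", loss="mse")
x = np.random.rand(16, 2).astype("float32")
y = np.random.rand(16, 1).astype("float32")
model.fit(x, y, epochs=3, verbose=0)

# With plain SGD and identical zero starting points, the hidden neurons
# receive identical gradients at every step, so the columns of W should
# stay identical to one another after training.
print(model.layers[0].get_weights())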

Can you share your notebook with me in a private direct message (click my profile, then click "Message")? I could take a look at how you ran the experiment and see if I can find the reason.

Raymond