Model.fit() related question

I understand that the initial starting weights and biases of neurons in the same layer should be randomized. However, when I set the initial weights and biases to be the same and ran model.fit() afterwards, I still got different weights after training. Does model.fit() automatically change the initial weights, or do I misunderstand the concept of gradient descent?

Hi there,

model.fit() triggers the training process, meaning the model parameters (the weights of your neural net) are adjusted and improved according to your defined optimization problem and the hyperparameters you set (like learning rate, dropout rate, etc.). Gradient descent can be the algorithm used to find your optimum. (You might find this article interesting on that note: https://hubs.la/Q01d6Bbc0)

To answer your question: calling the fitting function triggers the training, and during training the weights are updated and improved. After training, you should also test your model on an independent, new data set.
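As a minimal sketch of that workflow (toy model, toy data, and placeholder hyperparameter values, just for illustration):

import tensorflow as tf

# Toy model, just to illustrate the workflow
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Hyperparameters such as the learning rate are fixed before training
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss="mse")

x = tf.random.normal((32, 4))
y = tf.random.normal((32, 1))

# fit() runs the optimization loop; the weights change during this call
model.fit(x, y, epochs=5, verbose=0)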

Best regards
Christian

PS: An example where you can see this exactly (not with a neural net, but with a Gaussian process fit) is here, under "Fitting the model": Google Colab

It depends on where you added the code that forces the initial weights to all be the same. TensorFlow/Keras automatically initializes the weights to random values when the model is created.
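A quick way to see this (using a plain Dense layer as an example) is to print the weights right after the model is created, before any call to fit():

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(2),
])

# The kernel already holds random values (Glorot uniform by default);
# the bias vector starts at zero.
print(model.layers[0].get_weights())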

In addition:
@Hoang_Bui, I am not sure whether you are asking only about:

  • when the weights are changed
  • or also about the repeatability of your experiment (if so, feel free to provide a little more information)

This thread might also interest you, since it is likewise about initialisation. Feel free to check it out:

Best regards
Christian

What did you do to try to achieve that?

The following is what I would do:

import numpy as np
import tensorflow as tf
import random as python_random

def set_seed(seed=100):
    # Seed all three random number generators that can affect
    # weight initialization and data shuffling
    np.random.seed(seed)
    python_random.seed(seed)
    tf.random.set_seed(seed)

and I call set_seed() every time right before I construct a model.
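For example (the model below is just a placeholder):

set_seed(100)  # fix all RNGs before the layers (and their weights) are built
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
# Calling set_seed(100) again and rebuilding the model gives
# identical initial weights.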

Raymond

Hi, I might not have explained it clearly. From my observation in one of the labs, if we print the initial weights and biases before training the model, we get some arbitrary numbers for every neuron. So what I did was this:

  1. I set W (the np arrays of the neurons in a single layer) to all zeros, as well as the biases for those neurons. For this, I simply used set_weights() to set both W and the biases to zero.
  2. I then ran model.compile() and model.fit().

What you would expect from training the model with manual gradient descent is that all of the weights and biases for those neurons would end up identical, since they start from the same starting point. However, I got different weights and biases for each neuron, and I am not sure whether TensorFlow's automatic training overrides the initial weights and biases that I set.
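A minimal sketch of those two steps (toy shapes, and a sigmoid activation so the symmetry effect is visible; this is illustrative, not my actual lab code) would be:

import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(3, activation="sigmoid"),
    tf.keras.layers.Dense(1),
])

# Step 1: force W and b of every layer to all zeros
for layer in model.layers:
    layer.set_weights([np.zeros(w.shape) for w in layer.get_weights()])

# Step 2: compile and fit
model.compile(optimizer="sgd", loss="mse")
x = np.random.rand(16, 2).astype("float32")
y = np.random.rand(16, 1).astype("float32")
model.fit(x, y, epochs=3, verbose=0)

# With plain SGD and identical zero starting points, the hidden neurons
# receive identical gradients at every step, so the columns of W should
# stay identical to one another after training.
print(model.layers[0].get_weights())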

Can you share your notebook with me in a private direct message (click my profile, then click "Message")? I could take a look at how you ran the experiment and see if I can find the reason.

Raymond