Location of initial random parameters relative to the loop for NNs

naveadjensen · November 6, 2022, 2:23am

Through other questions I’ve asked here, I’ve learned that it is a good idea to use this:

tf.random.set_seed(1234) # to make the initial parameters/weights reproducible.

before running through a NN (Sequential, .compile, .fit, and .predict). This makes sense. But I have a question about the situation in section “7 - Iterate to find optimal regularization value” in the class 2, week 3, assignment.

In that section, the seed is set outside of the for loop over the lambdas. If you move the
tf.random.set_seed(1234)
call inside the loop, the probabilities for the NN trained with the first lambda are the same either way (which makes sense), but the probabilities for the NNs after the first lambda differ. To make my question concrete, I rewrote the loop twice here, with a simplified NN architecture and new training inputs. The first loop sets the seed outside of the loop, as in the assignment; the second sets it inside the loop. Both loops capture the probabilities, and I compare them afterwards. My question follows the code.

###########################################
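# Imports needed to run this outside the assignment notebook:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Make new inputs/training data, for faster run time:

X_train = np.array([[1,2.2,3.2,5],[5,.9,1.2,9.5],[4,3,6,9],[-.3,-2.5,1.1,5.1],[-.6,-2.6,10.1,5.1]])
y_train = np.array([1,1,0,1,2])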

# Script/loop number 1:

tf.random.set_seed(1234)      # Outside of loop
lambdas = [0.01, 0.05]             # Only 2 lambdas now
models=[None] * len(lambdas)
probs_list_random_out_ofloop = []               # Added to capture the probabilities
for i in range(len(lambdas)):
    lambda_ = lambdas[i]
    models[i] =  Sequential( [
            Dense(8, activation = 'relu', kernel_regularizer=tf.keras.regularizers.l2(lambda_)),
            Dense(4, activation = 'relu', kernel_regularizer=tf.keras.regularizers.l2(lambda_)),
            Dense(3, activation = 'linear') ] )
    models[i].compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.Adam(0.01),)
    models[i].fit(X_train,y_train, epochs=20 )
    probs = tf.nn.sigmoid(models[i].predict(X_train)).numpy()
    probs_list_random_out_ofloop.append(probs)            # Capture probs here

# Script/loop number 2:

lambdas = [0.01, 0.05]                                     # Only 2 lambdas now
models=[None] * len(lambdas)
probs_list_random_inloop = []               # Added to capture the probabilities
for i in range(len(lambdas)):
    tf.random.set_seed(1234)      # Inside of loop
    lambda_ = lambdas[i]
    models[i] =  Sequential( [
            Dense(8, activation = 'relu', kernel_regularizer=tf.keras.regularizers.l2(lambda_)),
            Dense(4, activation = 'relu', kernel_regularizer=tf.keras.regularizers.l2(lambda_)),
            Dense(3, activation = 'linear') ] )
    models[i].compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.Adam(0.01),)
    models[i].fit(X_train,y_train, epochs=20 )
    probs = tf.nn.sigmoid(models[i].predict(X_train)).numpy()
    probs_list_random_inloop.append(probs)            # Capture probs here

print('first iteration prob comparison:\n',probs_list_random_out_ofloop[0] == probs_list_random_inloop[0])
print('second iteration prob comparison:\n',probs_list_random_out_ofloop[1] == probs_list_random_inloop[1])

###########################################

The second NN, the one trained with the second lambda, produces different probabilities depending on where the seed is set.

Question: Does it matter where/when we set the seed? To me it makes sense to set it before each model is built and fit (within the loop), so that every model starts from the same initial parameters. Any thoughts or explanation of this?

Thanks.
PS: Apologies for pasting the code directly as text; I don’t know how to format code in a question.

rmwkwok · November 6, 2022, 3:14am

Hello Navead @naveadjensen!!

That’s a very good question!

I recommend setting the random seed inside the for loop, because that way every model begins with the same initial parameters, and the only thing that differs between models is the lambda value. Consequently, any difference among the models is caused by lambda alone, which aligns with the objective of the experiment.
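To see why only the first model matches, here is a minimal sketch that uses Python's standard random module as a stand-in for TensorFlow's generator. The helper init_weights is hypothetical and only mimics a layer drawing its initial weights; it is not what Keras does internally, but the seed placement behaves analogously.

```python
import random

def init_weights(n=4):
    """Hypothetical stand-in for a layer initializer: draw n values from the global generator."""
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

lambdas = [0.01, 0.05]

# Seed once, outside the loop: each "model" consumes the next part of the random stream.
random.seed(1234)
init_seed_outside = [init_weights() for _ in lambdas]

# Re-seed inside the loop: every "model" starts from the same point in the stream.
init_seed_inside = []
for _ in lambdas:
    random.seed(1234)
    init_seed_inside.append(init_weights())

# First lambda: the starting weights match regardless of where the seed is set.
print(init_seed_outside[0] == init_seed_inside[0])
# Second lambda: the starting weights differ between the two placements.
print(init_seed_outside[1] == init_seed_inside[1])
# With re-seeding, both models share the same starting weights.
print(init_seed_inside[0] == init_seed_inside[1])
```

With the seed outside the loop, the second model simply continues the random stream where the first one left off, so it starts from different weights and the lambda comparison is no longer like for like.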

Cheers,
Raymond

PS: I edited your post to format the code. To see what I changed, open your post in edit mode.
