Through other questions I’ve asked here, I’ve learned that it is a good idea to use this:

`tf.random.set_seed(1234)  # seeds the random initialization of the parameters/weights`

before running through a NN (Sequential, .compile, .fit, and .predict). This makes sense. But I have a question about the situation in section “7 - Iterate to find optimal regularization value” of the class 2, week 3 assignment.

In that section, the seed is set outside of the for loop over the lambdas. If you move the

tf.random.set_seed(1234)

to inside of the loop, then the probabilities for the NN using the first lambda are the same whether you set the seed inside or outside of the loop (which makes sense), but the probabilities for the NNs beyond the first lambda differ. To make my question more concrete, I rewrote the loop twice below, with a simplified NN architecture and new training inputs. The first loop sets the seed outside of the loop, as in the assignment; the second sets it inside of the loop. Both loops capture the probabilities, and I compare them at the end. My question is below that.

###########################################

**# Make new inputs/training data, for faster run time:**

```
import numpy as np

X_train = np.array([[1, 2.2, 3.2, 5], [5, 0.9, 1.2, 9.5], [4, 3, 6, 9], [-0.3, -2.5, 1.1, 5.1], [-0.6, -2.6, 10.1, 5.1]])
y_train = np.array([1, 1, 0, 1, 2])
```

**# Script/loop number 1:**

```
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

tf.random.set_seed(1234)  # Outside of loop
lambdas = [0.01, 0.05]  # Only 2 lambdas now
models = [None] * len(lambdas)
probs_list_random_out_ofloop = []  # Added to capture the probabilities
for i in range(len(lambdas)):
    lambda_ = lambdas[i]
    models[i] = Sequential([
        Dense(8, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(lambda_)),
        Dense(4, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(lambda_)),
        Dense(3, activation='linear'),
    ])
    models[i].compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.Adam(0.01),
    )
    models[i].fit(X_train, y_train, epochs=20)
    probs = tf.nn.sigmoid(models[i].predict(X_train)).numpy()
    probs_list_random_out_ofloop.append(probs)  # Capture probs here
```

**# Script/loop number 2:**

```
lambdas = [0.01, 0.05]  # Only 2 lambdas now
models = [None] * len(lambdas)
probs_list_random_inloop = []  # Added to capture the probabilities
for i in range(len(lambdas)):
    tf.random.set_seed(1234)  # Inside of loop
    lambda_ = lambdas[i]
    models[i] = Sequential([
        Dense(8, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(lambda_)),
        Dense(4, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(lambda_)),
        Dense(3, activation='linear'),
    ])
    models[i].compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.Adam(0.01),
    )
    models[i].fit(X_train, y_train, epochs=20)
    probs = tf.nn.sigmoid(models[i].predict(X_train)).numpy()
    probs_list_random_inloop.append(probs)  # Capture probs here

# After both loops have run, compare the captured probabilities element by element
print('first iteration prob comparison:\n', probs_list_random_out_ofloop[0] == probs_list_random_inloop[0])
print('second iteration prob comparison:\n', probs_list_random_out_ofloop[1] == probs_list_random_inloop[1])
```
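As a side note, the element-wise `==` prints a whole matrix of booleans, which is hard to read at a glance. Here is a rough sketch of a more compact comparison. The helper name `summarize_match` and the small arrays are made up purely for illustration; they are not output from the models above.

```python
import numpy as np

def summarize_match(a, b, atol=1e-8):
    """Hypothetical helper: summarize how two probability arrays differ."""
    a, b = np.asarray(a), np.asarray(b)
    return {
        "identical": bool(np.array_equal(a, b)),        # exact element-wise equality
        "close": bool(np.allclose(a, b, atol=atol)),    # equality within a tolerance
        "max_abs_diff": float(np.max(np.abs(a - b))),   # largest single difference
    }

# Made-up stand-ins for two captured probability arrays (NOT real model output)
probs_a = np.array([[0.5, 0.25], [0.75, 0.125]])
probs_b = probs_a.copy()   # plays the role of "same seed, same result"
probs_c = probs_a + 0.01   # plays the role of "different initialization"

print(summarize_match(probs_a, probs_b))
print(summarize_match(probs_a, probs_c))
```

With the real lists, the same helper could be called as `summarize_match(probs_list_random_out_ofloop[1], probs_list_random_inloop[1])`.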

###########################################

The second NN, trained with the second lambda, gives a different answer (different probabilities) depending on where the seed is set.
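My guess at the mechanism, sketched with Python's built-in `random` module as an analogy rather than TensorFlow itself: I am assuming that `tf.random.set_seed` restarts a random stream in a similar way, and that each new model draws its initial weights from that stream. The helper `draw_initial_weights` is hypothetical and only stands in for a layer's random initialization.

```python
import random

def draw_initial_weights(n=3):
    # Hypothetical stand-in for a layer's random weight initialization
    return [random.random() for _ in range(n)]

# Seed once, outside the loop: the second "model" continues the random stream
random.seed(1234)
seed_outside = [draw_initial_weights() for _ in range(2)]

# Reseed inside the loop: every "model" restarts the same stream
seed_inside = []
for _ in range(2):
    random.seed(1234)
    seed_inside.append(draw_initial_weights())

print(seed_outside[0] == seed_inside[0])  # True: first models start identically
print(seed_outside[1] == seed_inside[1])  # False: second models start differently
print(seed_inside[0] == seed_inside[1])   # True: reseeding repeats the same start
```

If the analogy holds, then with the seed inside the loop every lambda starts from the same initial weights, while with the seed outside the loop each lambda starts from different (but still reproducible) weights.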

**Question:** Does it matter where/when we set the seed? To me it makes sense to set it before each model is built and fit (within the loop), so that the parameters are initialized the same way for every lambda. Any thoughts or explanation of this?

Thanks.

ps - apologies for not knowing how to put code into the question in a different format, and instead just pasting the text directly in.