I was playing around with initial weights in C2_W2_Multiclass_TF, and I expected that setting all the weights to 0 before training would result in all the units being trained to the same parameters, as explained in the DeepLearning.AI "Initializing Neural Networks" notes: "Initializing all the weights with zeros leads the neurons to learn the same features during training."
But that didn't happen! After setting all the weights to 0 and then training again, I got different parameters for each unit, and the model found approximately the same parameters as the original training run (the one before the weights were zeroed).
How did it do that? What is happening?! Please help me if you can; this is driving me completely nuts.
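To pin down the expectation I had: with identical (zero) initial weights, every unit in a layer computes the same activation and therefore receives the same gradient, so plain gradient descent should never be able to tell the units apart. Here's a minimal NumPy sketch of that symmetry argument, on a made-up toy layer (not the lab's model):

import numpy as np

# Toy illustration: two units that start with identical (zero) weights
# receive identical gradients under plain gradient descent, so they
# never diverge from each other.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))           # 8 samples, 3 features (made up)
W = np.zeros((3, 2))                  # both units start at zero
for _ in range(100):
    a = x @ W                         # activations, shape (8, 2)
    grad = x.T @ (a - 1.0) / len(x)   # gradient of a simple squared loss
    W -= 0.1 * grad                   # identical columns get identical updates
print(np.allclose(W[:, 0], W[:, 1]))  # True: the two units stay the same

That's the behavior I expected from the lab's model, but didn't get.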
Here's what I ran in the lab:

import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential(
    [
        Dense(2, activation='relu', name="L1"),
        Dense(4, activation='linear', name="L2"),
    ]
)
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(0.01),
)
# The model won't allow .set_weights() until it has been built,
# which happens the first time fit() is run
model.fit(
    X_train, y_train,
    epochs=200
)
# Look at the resulting weights
model.get_weights()
[array([[ 1.22, 0.6 ],
[ 0.92, -1.7 ]], dtype=float32),
array([1.59, 1.5 ], dtype=float32),
array([[-2.01, -3.07, 1.3 , 0.33],
[-2.83, 1.09, -1.89, 0.69]], dtype=float32),
array([ 3.18, 0.21, -1.31, -2.61], dtype=float32)]
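To make "the same features" concrete, a quick check like this (a sketch, using the trained model from above) compares the two hidden units' incoming weights, i.e. the columns of L1's kernel and the matching bias entries:

# Sketch: if zero init really forced symmetry, the two hidden units'
# parameters should be identical after training.
W1, b1 = model.get_layer("L1").get_weights()
same = np.allclose(W1[:, 0], W1[:, 1]) and np.allclose(b1[0], b1[1])
print("hidden units identical?", same)

On the weights printed above this prints False, which is what I expected for a normal (randomly initialized) run.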
# Set all the weights to 0 for all units in all layers
model.set_weights([
    np.array([[0.0, 0.0],
              [0.0, 0.0]]),
    np.array([0.0, 0.0]),
    np.array([[0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]]),
    np.array([0.0, 0.0, 0.0, 0.0])
])
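(For reference, an equivalent shape-agnostic way to do the same zeroing, so the shapes don't have to be hard-coded:)

# Zero every weight array, whatever its shape.
model.set_weights([np.zeros_like(w) for w in model.get_weights()])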
# Re-train
model.fit(
    X_train, y_train,
    epochs=100
)
# Print the new weights
model.get_weights()
# Resulting weights
[array([[ 1.28, 0.45],
[ 0.71, -1.69]], dtype=float32),
array([1.48, 1.51], dtype=float32),
array([[-2.06, -2.33, 1.11, 0.42],
[-1.9 , 1.3 , -1.94, 0.59]], dtype=float32),
array([ 2.47, -0.52, -0.46, -2.14], dtype=float32)]
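And to quantify "approximately the same parameters", I'd compare the two runs roughly like this (a sketch; weights_run1 and weights_run2 are hypothetical names for the two get_weights() lists printed above, which I didn't actually save to variables in the lab):

# Sketch: largest element-wise difference between corresponding
# weight arrays from the two training runs.
for w1, w2 in zip(weights_run1, weights_run2):
    print(np.max(np.abs(w1 - w2)))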