Hello @colakhalil! It looks like you put a lot of effort into explaining your point, and I appreciate that. I put together a quick experiment (the code was generated by ChatGPT): three hidden layers with weights initialized to zero, and an output layer with randomly initialized weights. Here are the weights at initialization (Iteration 0) and after each of 10 training iterations.
Iteration 0
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 1
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 2
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 3
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 4
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 5
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 6
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 7
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 8
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 9
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Iteration 10
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[0.01062255]
[0.02110083]
[0.03210288]]
Below is the code if you want to experiment. I also ran it for 1000 iterations and still saw no symmetry breaking.
import tensorflow as tf
import numpy as np

# Generate example data
X = np.linspace(-2 * np.pi, 2 * np.pi, 100)
Y = np.sin(X)

# Create a sequential model: three zero-initialized hidden layers, random output layer
model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation='relu', kernel_initializer=tf.keras.initializers.Zeros(), input_shape=(1,), name='hidden1'),
    tf.keras.layers.Dense(3, activation='relu', kernel_initializer=tf.keras.initializers.Zeros(), name='hidden2'),
    tf.keras.layers.Dense(3, activation='relu', kernel_initializer=tf.keras.initializers.Zeros(), name='hidden3'),
    tf.keras.layers.Dense(1, kernel_initializer=tf.keras.initializers.RandomUniform(minval=-0.1, maxval=0.1), name='output')
])

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Prepare the data
X_train = X.reshape(-1, 1)
Y_train = Y.reshape(-1, 1)

# Function to print the kernel (weight matrix) of each layer in a readable format
def print_weights(model, iteration):
    print(f"\nIteration {iteration + 1}")
    for layer in model.layers:
        weights = layer.get_weights()
        if weights:
            print(f"{layer.name} weights:")
            print(weights[0])  # weights[0] is the kernel, weights[1] is the bias
            print()

# Display initial weights (printed as "Iteration 0")
print("Initial Weights")
print_weights(model, -1)

# Train the model for 10 iterations of 10 epochs each
for iteration in range(10):
    model.fit(X_train, Y_train, epochs=10, verbose=0)
    print_weights(model, iteration)
Output from the last three iterations of the 1000-iteration run:
Iteration 998
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[-0.07681055]
[ 0.02625201]
[ 0.07456041]]
Iteration 999
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[-0.07681055]
[ 0.02625201]
[ 0.07456041]]
Iteration 1000
hidden1 weights:
[[0. 0. 0.]]
hidden2 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
hidden3 weights:
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
output weights:
[[-0.07681055]
[ 0.02625201]
[ 0.07456041]]
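For anyone wondering why none of the printed weights move at all, here is a rough NumPy sketch of one backward pass through the same 1-3-3-3-1 network. It is not what Keras runs internally, just hand-written backprop under a few assumptions of mine: biases start at zero (the Keras default for `Dense`), the loss is the mean of squared errors over the 100 samples, and the ReLU derivative at exactly 0 is taken to be 0, which I believe matches TensorFlow. The `gradients` helper and the variable names are made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same data as in the experiment above
X = np.linspace(-2 * np.pi, 2 * np.pi, 100).reshape(-1, 1)
Y = np.sin(X)

# Three zero-initialized hidden kernels, one random output kernel.
# Assumption: all biases start at zero (Keras default for Dense).
kernels = [np.zeros((1, 3)), np.zeros((3, 3)), np.zeros((3, 3)),
           rng.uniform(-0.1, 0.1, size=(3, 1))]
biases = [np.zeros(3), np.zeros(3), np.zeros(3), np.zeros(1)]


def gradients(kernels, biases, X, Y):
    """Hand-written backprop for a ReLU MLP with a linear output and MSE loss."""
    last = len(kernels) - 1

    # Forward pass: keep pre-activations and activations of every layer
    activations, pre_acts = [X], []
    for i, (W, b) in enumerate(zip(kernels, biases)):
        z = activations[-1] @ W + b
        pre_acts.append(z)
        activations.append(z if i == last else np.maximum(z, 0.0))

    # Backward pass: delta is dLoss/d(pre-activation) of the current layer
    delta = 2.0 * (activations[-1] - Y) / len(X)
    kernel_grads = [None] * len(kernels)
    bias_grads = [None] * len(kernels)
    for i in reversed(range(len(kernels))):
        kernel_grads[i] = activations[i].T @ delta
        bias_grads[i] = delta.sum(axis=0)
        if i > 0:
            # Assumption: ReLU derivative is 0 when the pre-activation is exactly 0
            delta = (delta @ kernels[i].T) * (pre_acts[i - 1] > 0)
    return kernel_grads, bias_grads


kernel_grads, bias_grads = gradients(kernels, biases, X, Y)
for name, gW, gb in zip(["hidden1", "hidden2", "hidden3", "output"],
                        kernel_grads, bias_grads):
    print(f"{name}: max |kernel grad| = {np.abs(gW).max()}, "
          f"max |bias grad| = {np.abs(gb).max()}")
```

Under these assumptions every hidden activation is exactly zero, so the output kernel gets a zero gradient (it is multiplied by the last hidden activations), and the ReLU mask blocks everything flowing further back. The only parameter that can receive a nonzero gradient is the output bias, which `print_weights` does not display. With a zero gradient the Adam update is also zero, which would be consistent with the output weights staying at their random initial values in both runs.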