How can a hidden layer optimize input features combination?

In the lecture, I learned that a hidden layer can optimize the initial input features and produce a set of more relevant features for the model. I also learned that every unit in a hidden layer takes all the input features, or all the activations of the previous layer, as its input. But in most of the initial examples, all the units in a layer use the same activation function with no other differences, e.g.,

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential(
    [
        tf.keras.Input(shape=(2,)),
        Dense(3, activation="sigmoid", name="layer1"),
        Dense(1, activation="sigmoid", name="layer2"),
    ]
)

Based on the situation above, I would expect all the units in a hidden layer to produce similar or even identical results. So what is the purpose of having multiple units? How can a hidden layer optimize the combination of input features?

I know that's not actually the case: units in a hidden layer do produce different results. But why, given that they share the same activation function and the same inputs, with no other apparent differences?

I hope mentors or classmates can kindly explain this for me. Thanks a lot!

Hello William @spaceking ,

Please check out this discussion, in which you will find another link to an example of how neurons behave differently given the same set of input features. There is also a short sketch below that illustrates the idea. Let me know if you have a different concern or any other questions.
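
In short, the symmetry you describe is broken by random weight initialization: Keras gives every unit in a layer its own randomly drawn starting weights, so even with identical inputs and the same sigmoid activation, each unit computes a different weighted sum and therefore a different output, and gradient descent then pushes them to learn different features. Here is a minimal sketch (my own illustration, not the example from the linked discussion) using the same model shape from your question; it assumes TensorFlow 2.x with the Keras API:

import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential(
    [
        tf.keras.Input(shape=(2,)),
        Dense(3, activation="sigmoid", name="layer1"),
        Dense(1, activation="sigmoid", name="layer2"),
    ]
)

# Each column of the kernel matrix holds one unit's weights, and they are
# randomly initialized, so the three units start out different.
kernel, bias = model.get_layer("layer1").get_weights()
print(kernel)  # shape (2, 3): three distinct weight vectors, one per unit

# Feed one sample through the hidden layer: the three activations differ
# because the units multiply the same input by different weights.
x = np.array([[1.0, 2.0]], dtype=np.float32)
hidden = tf.keras.Model(model.input, model.get_layer("layer1").output)
print(hidden(x).numpy())  # three distinct values, e.g. [[0.31 0.74 0.52]]

If all units were initialized to the same weights instead, they would stay identical forever, because they would also receive identical gradient updates. That is exactly why symmetric initialization is avoided.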

Cheers,
Raymond

Thanks, Raymond. Your posts are really helpful.

You are welcome William, and happy learning!