Feature engineering in the input layer?

What about employing feature engineering right at the beginning, namely in the input layer?
Is feature engineering not needed in the input layer, or is this exactly what the hidden layers are doing automatically?
Another question: are the neurons in the hidden layers mathematical derivations of the inputs, just like engineered features, for example using the length and width of a house to engineer its area?

Hi @Vahdet_Vural ,

Regarding your first question on Feature Engineering:

You can definitely use feature engineering in the input layer. In fact, feature engineering is an important step in the design of the model.

When doing feature engineering, you select and transform your input features in a way that helps the model learn more effectively.

Feature Engineering can include techniques like normalizing the data, creating new features by combining existing ones, or removing irrelevant or redundant features.
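As a minimal sketch of those three techniques (the house data and column layout here are made up for illustration):

```python
import numpy as np

# Hypothetical house data: columns are [length_m, width_m, age_years]
X = np.array([
    [10.0, 8.0,  5.0],
    [12.0, 6.0, 20.0],
    [ 9.0, 9.0,  1.0],
])

# 1) Create a new feature by combining existing ones: area = length * width
area = (X[:, 0] * X[:, 1]).reshape(-1, 1)

# 2) Stack the engineered feature onto the original features
X_fe = np.hstack([X, area])

# 3) Normalize each column to zero mean / unit variance (z-score)
X_norm = (X_fe - X_fe.mean(axis=0)) / X_fe.std(axis=0)

print(X_norm.shape)  # (3, 4)
```

The normalized array is what you would then feed into the input layer.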



Regarding your second question:

The neurons in each layer start as the result of a linear transformation (W * a + b) followed by a non-linear transformation (the selected activation function, such as ReLU or tanh). At this initial stage, the W of each layer is randomly initialized (there are several algorithms for initializing the Ws and bs). In the very first hidden layer, the formula is W * X + b, so this first hidden layer takes the input, which includes all the features, including any engineered ones.
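A quick sketch of that first hidden-layer computation, assuming NumPy, a column-per-example convention for X, and illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy input: 4 features (engineered ones included), batch of 3 examples,
# stored as shape (features, examples) to match W * X + b above.
X = rng.normal(size=(4, 3))

# Randomly initialize W and b for a hidden layer of 5 neurons
W = rng.normal(size=(5, 4)) * 0.01
b = np.zeros((5, 1))

# Linear transformation followed by a non-linear activation (ReLU here)
Z = W @ X + b          # shape (5, 3)
A = np.maximum(0, Z)   # ReLU: negative values become 0
```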

Once training starts, backpropagation, which begins with a cost function and works layer by layer from the output back to the input, using derivatives to compute the gradients, will update the values of W and b in each layer, among other things.
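As a rough sketch of one such update for a single linear layer with a squared-error cost (the learning rate and data here are illustrative, not from any particular course assignment):

```python
import numpy as np

rng = np.random.default_rng(1)

# One-layer sketch: Y_hat = W @ X + b, cost = 0.5 * mean squared error
X = rng.normal(size=(4, 8))       # 4 features, 8 examples
Y = rng.normal(size=(1, 8))       # targets
W = rng.normal(size=(1, 4)) * 0.01
b = np.zeros((1, 1))
lr = 0.1                          # learning rate (illustrative)
m = X.shape[1]

Y_hat = W @ X + b
cost_before = 0.5 * np.mean((Y_hat - Y) ** 2)

# Gradients of the cost with respect to W and b
dZ = Y_hat - Y
dW = dZ @ X.T / m
db = dZ.sum(axis=1, keepdims=True) / m

# Gradient-descent update
W -= lr * dW
b -= lr * db

cost_after = 0.5 * np.mean((W @ X + b - Y) ** 2)
```

One step of this update moves W and b in the direction that reduces the cost.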

Does it make sense?



Hey Juan,
my sincere thanks. You have helped make it clearer for me.


In addition to the good answers: This thread might be of interest with respect to feature engineering:


One of the big motivations for neural networks is that you don’t necessarily need to employ feature engineering, as the hidden layers will find properties of the input data for you. When you do feature engineering, you are putting your input data through non-linearities so you can find a hyperplane that linearly separates your data. This is essentially what a neural network does through the use of non-linear activation functions. In essence, the neural network relieves the developer of the feature engineering part of the problem.

Hello @Hokie81!

It’s me again :wink: I completely agree with you that NN is by itself doing feature engineering and it is also what makes NN so powerful!

On the other hand, if I know I can engineer some features myself, for example by multiplication, and that it will help the model, I won’t hesitate to do it, because there are also limitations: a NN is not the best at multiplying. We know a Dense layer only computes a weighted sum of its input features.

Now, I am not saying a NN can’t handle multiplication; with enough neurons, a NN may still be able to approximate multiplied features through weighted sums and activations. However, we can save some training time if we just give the NN our own engineered features.
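To make that concrete, here is a small sketch (synthetic data, illustrative only) comparing a pure weighted sum of raw length/width against the same weighted sum with an engineered length*width feature, when the target actually is the area:

```python
import numpy as np

rng = np.random.default_rng(2)

length = rng.uniform(5, 15, size=200)
width = rng.uniform(5, 15, size=200)
y = length * width  # target: area, a multiplicative feature

# A Dense layer is a weighted sum: fit a linear model on the raw inputs
X_raw = np.column_stack([length, width, np.ones_like(length)])
w_raw, *_ = np.linalg.lstsq(X_raw, y, rcond=None)
err_raw = np.mean((X_raw @ w_raw - y) ** 2)

# Add the engineered feature: now a weighted sum can fit the target exactly
X_eng = np.column_stack([length, width, length * width, np.ones_like(length)])
w_eng, *_ = np.linalg.lstsq(X_eng, y, rcond=None)
err_eng = np.mean((X_eng @ w_eng - y) ** 2)
```

With the engineered feature the error drops to essentially zero, while the raw weighted sum is left with a large residual, which is exactly the gap the hidden layers would otherwise have to spend capacity closing.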