Consequences of unbalanced input variable types

For the sake of simplicity, let's say we have the following input variables (X):

  • 5 continuous variables normalized to be in the range between 0 and 1
  • 10 binary variables, coded either 0 or 1

What are the consequences of such an imbalance?

  • The values of the 5 normalized continuous variables are much more fine grained, taking values such as 0.2, 0.5, or 0.7 and only exceptionally reaching the maximal value of 1
  • On the other hand, the 10 binary variables, each outputting either the maximum value of 1 or the minimum value of 0, might strongly distort or drown out the fine-grained signal brought by the 5 normalized continuous variables (see the sketch after this list)
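
Here is a minimal sketch of the setup, just to compare the spread of the two blocks of features. The distributions (uniform continuous values, balanced 0/1 binaries) are assumptions purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# 5 continuous features scaled to [0, 1] (assumed uniform, for illustration only)
continuous = rng.uniform(0.0, 1.0, size=(n, 5))

# 10 binary features (assumed balanced 0/1, for illustration only)
binary = rng.integers(0, 2, size=(n, 10)).astype(float)

X = np.hstack([continuous, binary])

# Per-feature spread: a balanced binary column has std ~0.5, a uniform [0, 1]
# column ~0.29, so the two types are not wildly out of scale, but the binary
# block dominates in sheer number of columns (10 vs 5).
print("continuous std:", continuous.std(axis=0).round(2))
print("binary std:    ", binary.std(axis=0).round(2))
```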

Any solutions or thoughts about this?


The results depend on the problem at hand. There's nothing inherently wrong with having more continuous or categorical variables.
Please read this link to get a better idea of how to choose the features for your model.


Thank you for this very interesting link @balaji.ambresh.

My concern was not really about their number but about a potential imbalance in their type. I tend to apply the same principles as in classical statistics: the inclusion of independent variables (features/predictors for us) should be theoretically grounded to avoid any spurious relationships, so I guess I should be fine on that front.

After some thought, my concern might not be justified: each feature enters the network through its own weights, and the complexity of connections in later layers should be able to compensate for such a relative imbalance.
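
For instance, here is a minimal sketch of that point (the layer sizes and the use of PyTorch are my own assumptions, not anything prescribed): in the first linear layer, every input feature gets its own column of weights, so training is free to shrink or enlarge the weights on the binary columns relative to the continuous ones.

```python
import torch
import torch.nn as nn

# Hypothetical tiny network: 5 continuous + 10 binary inputs, hidden size 8.
model = nn.Sequential(
    nn.Linear(15, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
)

x = torch.rand(32, 5)                      # batch of continuous features in [0, 1]
b = torch.randint(0, 2, (32, 10)).float()  # batch of binary features
inputs = torch.cat([x, b], dim=1)

out = model(inputs)
print(out.shape)  # torch.Size([32, 1])

# Each of the 15 columns of the first layer's weight matrix corresponds to one
# input feature, which is why a relative scale difference between feature types
# can be absorbed by the learned weights during training.
print(model[0].weight.shape)  # torch.Size([8, 15])
```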