Here you can find an explanation which I can recommend:

Source: Feature Crosses: Encoding Nonlinearity | Machine Learning | Google Developers

In this example you see a nonlinear problem which cannot easily be separated by a linear classifier: by crafting a new feature x_3 = x_1 x_2 from the shown dimensions x_1 and x_2 (a technique called feature engineering), you obtain a feature that captures the nonlinear structure.

In a simplified view for this specific example, assuming symmetry about the axes, the following applies to our new feature:

- negative * negative turns into positive
- positive * positive turns into positive
- negative * positive turns into negative
- positive * negative turns into negative

**All blue dots will be positive.**

**All orange dots will be negative.**

With x_3 you build a feature cross by multiplying two existing features. This lets the model learn a new weight for x_3, encoding the nonlinear information directly in the features, and thereby makes the problem solvable with a linear classifier.
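To make this concrete, here is a minimal sketch in plain Python (the data generation is hypothetical, mimicking the quadrant pattern from the example): blue points (+1) lie in quadrants where x_1 and x_2 share a sign, orange points (-1) where they differ. A linear classifier that only looks at the sign of the crossed feature x_3 = x_1 * x_2 then separates the classes perfectly, while no straight line in the original (x_1, x_2) plane could.

```python
import random

random.seed(0)

# Hypothetical XOR-style data: blue (+1) where x1 and x2 have the same sign,
# orange (-1) where the signs differ -- matching the quadrant pattern above.
points = []
for _ in range(200):
    x1 = random.uniform(-1, 1)
    x2 = random.uniform(-1, 1)
    label = 1 if (x1 > 0) == (x2 > 0) else -1
    points.append((x1, x2, label))

# Feature cross: x3 = x1 * x2.
# A linear classifier using only x3 (weight 1, bias 0) predicts sign(x3):
correct = sum(
    1 for (x1, x2, y) in points
    if (1 if x1 * x2 > 0 else -1) == y
)
print(correct / len(points))
```

With the crossed feature, the decision rule is simply "x_3 > 0", which is linear in the new feature space even though the original boundary was not.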

Here you can find another example of how to transform data, getting rid of non-linearity in the data space: Can we start with the circle equation as decision boundary? - #12 by Christian_Simonis

Best regards

Christian