How does a deep learning algorithm figure out the features by itself, as Andrew said in the lecture?

Andrew’s lecture is meant to give an intuition for how the internals of a neural network operate.

In practice, the activations of the hidden layer don’t have any real physical meaning - they’re not really “affordability”, “awareness” and “quality”. They’re simply whatever activations will lead to minimizing the total cost.

So the NN’s hidden layer learns new features (by using non-linear combinations of the input features), but we don’t really know what they mean. They’re just mathematical entities that summarize the characteristics of the input and lead to minimizing the cost on the output.
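To make "non-linear combinations of the input features" concrete, here is a minimal sketch of what one hidden layer computes. The four input values, the three hidden units, and every weight and bias below are made-up numbers for illustration only; in a real network the weights come out of training, and sigmoid is just one possible activation function.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_layer(x, W, b):
    """Each hidden unit is a sigmoid applied to a weighted sum of all inputs."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(W, b)
    ]

# Hypothetical input example with four features.
x = [0.5, 1.0, -0.3, 2.0]

# Hypothetical weights: one row per hidden unit, one column per input feature.
W = [
    [0.2, -0.4, 0.1, 0.7],
    [-1.0, 0.3, 0.5, 0.0],
    [0.6, 0.6, -0.2, -0.9],
]
b = [0.1, 0.0, -0.2]

# Three learned "features": just numbers between 0 and 1, with no names attached.
activations = hidden_layer(x, W, b)
print(activations)
```

Nothing in this computation labels the three outputs; any meaning we attach to them is our own interpretation after the fact.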

Does that mean it finds the best combination by trial and error?

No, not at all.

Training is via a mathematical process called backpropagation.

It’s discussed later in the course.

OK, I’ve heard about it somewhere. What I meant to ask is: does the network layer try all possible combinations to choose the best feature?

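No. The weight values are learned from a training set by gradient descent: each step nudges every weight in the direction that reduces the cost, using the gradients that backpropagation computes. That’s no different from linear or logistic regression.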
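As a rough illustration of "learned, not searched", here is a small sketch of gradient descent with backpropagation on a network with one hidden layer, written in plain Python. The toy dataset, the layer size, the learning rate, the step count and the helper names (`init_params`, `forward`, `mean_loss`, `train`) are all assumptions made for this example, not anything taken from the course.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_params(n_in, n_hidden, seed=0):
    """Small random starting weights; biases start at zero."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    return W1, b1, W2, b2

def forward(x, W1, b1, W2, b2):
    """Return the hidden activations and the predicted probability."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    p = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, p

def mean_loss(data, W1, b1, W2, b2):
    """Average binary cross-entropy over the training set."""
    total = 0.0
    for x, y in data:
        _, p = forward(x, W1, b1, W2, b2)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)

def train(data, params, lr=0.1, steps=2000):
    """Full-batch gradient descent; gradients come from backpropagation."""
    W1, b1, W2, b2 = params
    W1 = [row[:] for row in W1]          # copy so the caller's params are untouched
    b1, W2 = b1[:], W2[:]
    n_hidden, n_in, m = len(W1), len(W1[0]), len(data)
    for _ in range(steps):
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [0.0] * n_hidden
        gb2 = 0.0
        for x, y in data:
            h, p = forward(x, W1, b1, W2, b2)
            d_out = p - y                # gradient at the output pre-activation
            gb2 += d_out
            for j in range(n_hidden):
                gW2[j] += d_out * h[j]
                # Chain rule: push the output error back through hidden unit j.
                d_hid = d_out * W2[j] * h[j] * (1 - h[j])
                gb1[j] += d_hid
                for i in range(n_in):
                    gW1[j][i] += d_hid * x[i]
        # One small step downhill for every weight at once.
        for j in range(n_hidden):
            W2[j] -= lr * gW2[j] / m
            b1[j] -= lr * gb1[j] / m
            for i in range(n_in):
                W1[j][i] -= lr * gW1[j][i] / m
        b2 -= lr * gb2 / m
    return W1, b1, W2, b2

# Hypothetical toy training set: (inputs, label).
data = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 0)]

start = init_params(n_in=2, n_hidden=3, seed=0)
loss_before = mean_loss(data, *start)
trained = train(data, start, lr=0.1, steps=2000)
loss_after = mean_loss(data, *trained)
print(loss_before, loss_after)
```

The loop never enumerates candidate features. It computes how the cost changes with respect to each weight and moves all of them slightly in the direction that lowers it, which is the same idea as in linear or logistic regression, just with one more layer to differentiate through.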