Can a decision tree be used for multiclass classification problems?

If yes, can you suggest some references to help me better understand the algorithm?

Suppose that I have 3 classes: dog, cat, horse.
I am not able to generalize what is explained in the lectures. How can I calculate entropy with more than 2 classes?

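Generally, for each sample, you calculate the negative log probability of the true class that the sample belongs to, then you add up these negative log probabilities over all samples (dividing by the number of samples gives the average). In a decision tree node, the probability of a class is its proportion among the samples in that node, so the average equals the entropy -sum_k p_k log(p_k), where the sum simply runs over 3 classes (dog, cat, horse) instead of 2.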
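To make this concrete, here is a minimal Python sketch for the dog/cat/horse case. The label counts and the function names (`entropy`, `summed_neg_log_prob`) are made up for illustration, and the class probabilities are assumed to be the class proportions in one tree node. It shows that averaging the per-sample negative log probabilities gives the same number as the usual entropy formula.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of the class labels in one node: -sum_k p_k * log2(p_k)."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def summed_neg_log_prob(labels):
    """Sum over samples of -log2(probability of that sample's true class).

    The probability of a class is taken to be its proportion in the node
    (an assumption made for this illustration).
    """
    n = len(labels)
    counts = Counter(labels)
    return sum(-math.log2(counts[y] / n) for y in labels)

# Hypothetical node: 4 dogs, 2 cats, 2 horses
labels = ["dog"] * 4 + ["cat"] * 2 + ["horse"] * 2

print(entropy(labels))                            # entropy of the node
print(summed_neg_log_prob(labels) / len(labels))  # same value, per-sample route
```

Nothing in either function depends on the number of classes, so the same code covers 2 classes, 3 classes, or more.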

For references, try searching for “cross-entropy”, optionally with additional keywords such as “3 classes example”.