Can decision trees be used for multiclass classification problems?

If yes, can you suggest some references to help me better understand the algorithm?

Suppose that I have 3 classes: dog, cat, horse.
I am not able to generalize what is explained in the lectures. How can I calculate entropy with more than 2 classes?

Thank you.

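Generally, for each sample, you compute the negative log of the probability assigned to the true class of that sample, then you add up these negative log probabilities over all samples. Nothing in this calculation depends on the number of classes, so it works the same way for 3 classes as for 2.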
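As a rough sketch of that calculation for the dog/cat/horse case: the example below assumes the probability of each class is simply its frequency among the samples in a tree node, and it divides the sum by the number of samples to get a per-sample average. Under that assumption the result equals the usual entropy of the node, -sum(p_k * log2(p_k)). The function name `node_entropy` and the sample labels are made up for illustration.

```python
import math
from collections import Counter

def node_entropy(labels):
    """Average negative log2 probability of each sample's true class.

    Assumption: the probability of a class is its relative frequency
    in `labels` (e.g. the samples that reach one node of the tree).
    """
    n = len(labels)
    counts = Counter(labels)
    probs = {cls: count / n for cls, count in counts.items()}
    # One negative log probability per sample, then add them all up.
    total = sum(-math.log2(probs[y]) for y in labels)
    return total / n

# Hypothetical node with the three classes equally represented.
mixed = ["dog", "dog", "cat", "cat", "horse", "horse"]
print(node_entropy(mixed))  # close to log2(3), about 1.585

# Hypothetical node containing a single class: entropy is 0.
pure = ["dog", "dog", "dog", "dog"]
print(node_entropy(pure))
```

With three equally likely classes the value is log2(3), the maximum for 3 classes, and it drops to 0 when a node contains only one class.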

For more detail, search for “cross-entropy”, optionally with additional keywords such as “3 classes example”.