Suppose I am building a cat/dog classifier using decision trees. If my model misclassifies all examples of cats as dogs and all examples of dogs as cats, then the entropy will be zero, i.e. at its minimum. My model will consider this the best case, because low entropy means the data is pure. How will my model know that it is making such a blunder?

I disagree. Entropy is a measure of the purity (or chaos) of the class labels in a data set, not of the predictions. In a decision tree, entropy is computed on the true labels of the training examples that reach a node, and each leaf predicts the majority class of those examples, so a pure leaf of cats is labelled "cat" and cannot systematically swap the classes.
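
To make this concrete, here is a minimal sketch of the entropy calculation (the helper name `entropy` and the toy label lists are my own, for illustration). Note that the function takes only labels as input; no predictions appear anywhere:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a collection of class labels."""
    n = len(labels)
    # Sum over the observed class proportions p: -p * log2(p)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

pure = ["cat"] * 10                 # a pure node: only one class
mixed = ["cat"] * 5 + ["dog"] * 5   # a maximally mixed node

print(entropy(pure))   # minimum entropy: 0.0
print(entropy(mixed))  # maximum entropy for two classes: 1.0
```

Swapping the predicted labels would not change either number, because the computation never sees a prediction; it only sees the label distribution inside the node.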

Perhaps I misunderstand your point. Can you provide a counterexample?