Just want to make sure: are the logistic loss function and binary cross-entropy derived from maximum likelihood estimation? And is cross-entropy with more than two categories the loss used with softmax? Is the softmax loss derived from maximum likelihood too?

In logistic regression you classify either a single class or two classes. With a single class, you decide whether it occurred: if the predicted probability of the class is larger than the threshold you choose, you say that it happened (a yes/no decision).

If you classify more than two classes, all possible cases can be included, including a null class. You then take the class with the highest probability.
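
A minimal sketch of the two decision rules described above, assuming the model already outputs probabilities. The function names and the default threshold of 0.5 are hypothetical choices for illustration, not part of any library.

```python
import numpy as np

def predict_binary(p, threshold=0.5):
    # Binary case: say "occurred" (1) when the probability exceeds the
    # chosen threshold, otherwise "did not occur" (0).
    return int(p > threshold)

def predict_multiclass(probs):
    # Multi-class case: pick the index of the class with the highest
    # probability (one of the entries may stand for a "null" class).
    return int(np.argmax(probs))
```

For example, `predict_binary(0.7)` picks the positive class, while `predict_multiclass([0.1, 0.7, 0.2])` picks the class at index 1.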

I appreciate your answer, but I just want to make sure: is the cross-entropy function derived from maximum likelihood estimation?

Well, what can be shown for sure is that "maximizing the likelihood" is equivalent to "minimizing the cross-entropy", even in the softmax case. The equivalence is very generic, so the answer to your question is yes. But I personally prefer not to say that one of them is derived from the other, since that may lead to going in circles over the axiomatic formulation of probability and statistics.
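
The equivalence can be checked numerically. The sketch below assumes i.i.d. samples with integer class labels, and all names and the toy data are made up for illustration. It compares the negative log-likelihood with the cross-entropy against one-hot targets for softmax outputs, and does the same for the Bernoulli likelihood and binary cross-entropy with a sigmoid output.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def neg_log_likelihood(probs, labels):
    # Likelihood of i.i.d. categorical data is the product of the
    # probabilities assigned to the observed labels; take -log of it.
    picked = probs[np.arange(len(labels)), labels]
    return -np.sum(np.log(picked))

def cross_entropy(probs, labels):
    # Cross-entropy between one-hot targets and predicted distributions.
    onehot = np.eye(probs.shape[1])[labels]
    return -np.sum(onehot * np.log(probs))

rng = np.random.default_rng(0)

# Multi-class (softmax) case: toy logits for 5 samples and 3 classes.
logits = rng.normal(size=(5, 3))
labels = np.array([0, 2, 1, 1, 0])
probs = softmax(logits)
nll = neg_log_likelihood(probs, labels)
ce = cross_entropy(probs, labels)

# Binary (sigmoid) case: Bernoulli likelihood vs. binary cross-entropy.
z = rng.normal(size=5)
p = 1.0 / (1.0 + np.exp(-z))
y = np.array([1, 0, 1, 1, 0])
bernoulli_likelihood = np.prod(p**y * (1 - p) ** (1 - y))
binary_nll = -np.log(bernoulli_likelihood)
bce = -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

print(np.isclose(nll, ce), np.isclose(binary_nll, bce))
```

Since the two quantities coincide for any parameters, the parameters that maximize the likelihood are the same ones that minimize the cross-entropy.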