We have this function, and we want to maximise its likelihood to win the prize (this is how the instructor put it, hypothetically; I am not asking anyone to give me a prize).

The function is called the maximum log-likelihood function. Also, could you help me understand the meaning and usage of negative log-likelihood?

What I think is that probabilities lie in the closed interval $[0, 1]$ (sets and relations), and are often strictly inside it, in the open interval $(0, 1)$.

There are also well-known identities of the log function: $\ln(1) = 0$, $\ln(e) = 1$, and $\ln(0)$ is undefined ($\ln(x) \to -\infty$ as $x \to 0^+$). This means $\ln(x) < 0$ for $0 < x < 1$ for sure. An error can't be negative (common sense says so, and mathematically the log of a probability can go all the way down to $-\infty$), so it needs to be a positive real number in $\mathbb{R}^+$. This is why we multiply by $-1$, and the result is called negative log loss.
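To check the sign argument numerically, here is a minimal sketch. The probabilities in `probs` are made-up values chosen only for illustration; they do not come from any real model.

```python
import math

# Hypothetical probabilities a model might assign to the correct outcome.
probs = [0.99, 0.5, 0.1, 0.001]

# ln(p) for each probability; every value is negative because 0 < p < 1.
log_values = [math.log(p) for p in probs]

# Multiplying by -1 flips the sign, giving a positive "loss" per value.
neg_log_values = [-v for v in log_values]

for p, lp, nlp in zip(probs, log_values, neg_log_values):
    print(f"p={p:<6} ln(p)={lp:9.4f}  -ln(p)={nlp:9.4f}")
```

The closer `p` is to 1, the closer `-ln(p)` is to 0; the closer `p` is to 0, the larger `-ln(p)` becomes.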

So we maximise the log-likelihood (hence the name MLL) and, after multiplying by $-1$, minimise the log loss (hence NLL).
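A small sketch of why the two views give the same answer, assuming a simple Bernoulli (coin-flip) model. The data in `flips`, the grid of `candidates`, and the function names are all hypothetical and only for illustration.

```python
import math

# Hypothetical data: 10 coin flips, 1 = heads, 0 = tails (7 heads).
flips = [1, 1, 0, 1, 1, 1, 0, 1, 0, 1]

def log_likelihood(theta, data):
    """Sum of ln P(x | theta) over independent Bernoulli observations."""
    return sum(math.log(theta if x == 1 else 1.0 - theta) for x in data)

def negative_log_likelihood(theta, data):
    """The same quantity multiplied by -1, so it is treated as a loss."""
    return -log_likelihood(theta, data)

# Candidate parameters strictly inside (0, 1), so ln is always defined.
candidates = [i / 100 for i in range(1, 100)]

# Maximising the log-likelihood ...
best_by_max_ll = max(candidates, key=lambda t: log_likelihood(t, flips))
# ... picks the same parameter as minimising the negative log-likelihood.
best_by_min_nll = min(candidates, key=lambda t: negative_log_likelihood(t, flips))

print(best_by_max_ll, best_by_min_nll)
```

Both searches land on the same `theta`, the observed fraction of heads (0.7 for this data), so the choice between "maximise likelihood" and "minimise loss" only changes the sign convention, not the optimum.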

Now a follow-up question: just by multiplying by $-1$, why does its name change from likelihood to loss?