Help! What is the base of the log in the logistic regression activation function?

When learning math, if we write log(y), we assume by default that it means log_e(y) — that is, the base of the log is e. But I heard that it is different in deep learning. Can anyone explain what the base of the log is in deep learning?
Thanks a lot.

Hi @Carrie_Young,

I believe log base e is used in DLS. I’m afraid I can’t be of much use here, as I don’t remember where, but Andrew mentions it somewhere in one of the lecture videos.

@paulinpaloalto, @kenb, can you please confirm ?



Hi, @Carrie_Young !

As a quick note, log_e (i.e., ln) is normally used. Check the tf documentation for further explanation.


Confirmed, Mubsi. The so-called “natural logarithm” with base e = 2.718… is used exclusively in DLS (if memory serves!). Base-2 logarithms have their uses in information theory, which could be considered part of the machine learning universe, but not here.


Yes, the notation in the ML/DL world is different from that in the math world, which looks confusing at first. In math, log typically means log base 10 and ln is used for the natural log, but in the ML/DL world log always means the natural logarithm.

The reason the natural log is preferred becomes clear once you get into the algorithms: we use logs in the loss function, and the key point is that we need to take derivatives of the loss function to implement back propagation, which is the fundamental technique on which learning is built. The beauty of using base e is that the derivatives come out nice and clean. If you used log_10, you would pick up an extra constant factor of 1/ln(10) every time you take a derivative. The fundamental mathematical properties are otherwise the same, so base-10 logs would give you no advantage and would just make a mess with all the extra constant factors flying around.
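To make the “extra constant factor” concrete, here is a small NumPy sketch (my own illustration, not from the course materials): it compares the numerical derivative of the binary cross-entropy loss written with the natural log against the same loss written with log base 10. The base-10 version differs by exactly a factor of 1/ln(10).

```python
import numpy as np

# Binary cross-entropy with the natural log: L(y, a) = -(y*ln(a) + (1-y)*ln(1-a)).
# For y = 1 its derivative w.r.t. a is simply -1/a -- no stray constants.
def bce_natural(y, a):
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

# Same loss written with log base 10: every derivative picks up a 1/ln(10) factor.
def bce_base10(y, a):
    return -(y * np.log10(a) + (1 - y) * np.log10(1 - a))

# Central-difference numerical derivatives at a sample point
y, a, h = 1.0, 0.8, 1e-6
d_nat = (bce_natural(y, a + h) - bce_natural(y, a - h)) / (2 * h)
d_10 = (bce_base10(y, a + h) - bce_base10(y, a - h)) / (2 * h)

print(d_nat)              # close to -1/a = -1.25
print(d_10 * np.log(10))  # multiplying back by ln(10) recovers the same value
```

So the two formulations carry the same information, but base e keeps the gradient expressions free of the 1/ln(10) clutter.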

I’m not sure historically where this notational difference arose, but one theory is that 10 or 15 years ago a lot of ML work was done in MATLAB, where the function names are log for the natural log and log10 for the base-10 log. E.g. Prof Ng’s original Stanford Machine Learning course, which I think was first published in 2011 or maybe 2012, used MATLAB as the implementation language, so it would have been natural (pun intended :nerd_face:) to use the same function name that MATLAB uses. Of course, these days we’re using Python, and it’s the same there: np.log is the natural log and np.log10 is the base-10 log.
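A quick sanity check of the NumPy naming convention mentioned above:

```python
import numpy as np

# np.log is the natural logarithm (base e), matching MATLAB's log:
print(np.log(np.e))     # 1.0

# Other bases get explicit names:
print(np.log10(100.0))  # 2.0  (base 10)
print(np.log2(8.0))     # 3.0  (base 2, the one used in information theory)
```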