When learning math, if we write log(y), we assume by default that it means loge(y); that is, the base of log is e. But I heard that it is different in deep learning. Can anyone explain what the base of log is in deep learning?

Thanks a lot.

Hi @Carrie_Young,

I believe log base e is used in DLS. I'm afraid I can't be of much use here, as I don't remember where, but Andrew mentions it somewhere in one of the lecture videos.

@paulinpaloalto, @kenb, can you please confirm ?

Thanks,

Mubsi

Hi, @Carrie_Young !

As a quick note, log base e (i.e., ln) is normally used. Check the tf documentation for further explanation.

Confirmed, Mubsi. The so-called "natural logarithm" with base e = 2.718... is used exclusively in the DLS (if memory serves!). Base 2 logarithms have their uses in information theory, which could be part of the machine learning universe, but not here.

Yes, the notation in the ML/DL world is different than in the math world, which looks confusing at first. In math, `log` means log base 10 and they use `ln` for natural log. But in the ML/DL world, `log` always means natural logarithm. The reason that natural log is preferred is clear once you get into the algorithms: we're using logs for the loss function, and the key point is that we need to take the derivatives of the loss functions to implement back propagation, which is the fundamental technique on which learning is built. The beauty of using base e is that the derivatives are nice and clean. If you use log_{10}, then you'd get an extra constant factor every time you take the derivative. And the fundamental mathematical properties are the same, meaning that you would get no advantage from using base 10 logs; it would just make a mess with all the extra constant factors flying around.

I'm not sure historically where this notational difference arose, but one theory would be that 10 or 15 years ago a lot of the ML work was done using MATLAB, and in MATLAB the function names are `log` for natural log and `log10` for base 10 logs. E.g. Prof Ng's original Stanford Machine Learning course, which I think was first published in 2011 or maybe 2012, used MATLAB as the implementation language, so it would have been natural (pun intended) to use the same function name that MATLAB used. Of course these days we're using Python and it's the same there: `np.log` is natural log and `np.log10` is base 10 log.