Calculation of partial derivative of the cost function for logistic regression

I’m not familiar with the MLS material. This does get mentioned in DLS C1 Week 2, although you have to follow an offered link to a Discourse thread that covers it.

As Wendy says, Prof Ng is focussed on new material, so it’s not likely that he would revise the lectures to mention this in MLS. But perhaps they could just add a Reading Item to the appropriate week of MLS similar to this one in DLS C1 W2:

Hi @ai_is_cool,

Well spotted! However, in the machine learning community it is common practice to assume that \log x = \ln x = \log_e x.

Hi @conscell,

I think it would be useful to make it clear in Prof. Ng’s course that this is a convention in ML circles before presenting it for the first time.

@ai_is_cool I fear I am much less ‘pure maths’ enabled than the other guys, but I highly suspect this comes from Claude Shannon’s classic works on Information Theory (from which the ‘entropy loss’ concept arrives). I am still learning, but his and Warren Weaver’s volume is, I think, worth revisiting.

Hi @nevermind,

Thanks for your reply however it’s not clear to me how your reply addresses my issue.

Also, I don’t recall Warren Weaver being mentioned by Prof. Ng so far in Course 1.

I usually like to provide site-agnostic links to texts, but in this case I can’t find one. It is Shannon’s ‘The Mathematical Theory of Communication’. I am not sure of Warren Weaver’s role, and I am not surprised by Prof. Ng’s lack of mention of this text.

A modern economics class mentions supply/demand, but it doesn’t require you to read about the ‘butcher, brewer and baker’ of Adam Smith (or Marx on ‘labour-power’, for that matter). It is kind of ‘assumed’, though I have read both.

In any case, I thought you’d really like this text. As far as I know, it gives an outline of where ‘cross-entropy’ (i.e. log loss) ‘comes from’, and there is plenty of mathematical detail there, so I felt it should help.

Thanks but I am still having difficulty understanding how your contributions to this thread address my logarithm issue in Prof Ng’s video lesson.

Hi @ai_is_cool,

Some parts of the mathematics are not explicitly mentioned by Prof. Ng, probably because he assumes people would understand the differentiation of the natural logarithm and the common logarithm,

where the derivative of ln(x) = 1/x,
and the derivative of log_{10}(x) = 1/(x ln(10)).
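
A quick numerical check of both formulas (a minimal sketch using NumPy and central differences; the value of x is just illustrative):

```python
import numpy as np

x, h = 2.5, 1e-6

# Central-difference approximations of the two derivatives
d_ln  = (np.log(x + h)   - np.log(x - h))   / (2 * h)
d_log = (np.log10(x + h) - np.log10(x - h)) / (2 * h)

print(d_ln,  1 / x)                 # d/dx ln(x)      = 1/x
print(d_log, 1 / (x * np.log(10)))  # d/dx log_10(x)  = 1/(x ln 10)
```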

I don’t remember whether he mentions it in MLS or not, but we will surely forward your feedback to the staff. Please remember that deep learning and machine learning assume some basic math understanding on the learner’s part. I am surely not defending it by stating this, as I can understand it can confuse some. Whenever I had such an issue, I used to look for more explanation and answers even outside the course.

I really appreciate your deep insight in looking at machine learning from a mathematical perspective. Discussions are a good way of improving our knowledge.

Hope my handwriting and explanation in the pic aren’t confusing :crazy_face:

Regards
DP

Thanks, DP, for your explanation and for working it out by hand.

However, I think the point I was trying to make is that the common logarithm log(x) is base-10, which is the logarithm Prof. Ng uses in his video lesson. However, others here have said that what he actually means is the natural logarithm ln(x), which is confusing for a beginner seeing his expression for the first time without an explanation that the natural logarithm is meant here rather than the common logarithm.

As I said earlier, some parts of the mathematics Prof. Ng probably assumes people will understand: that when he is using ln(x), the derivative works out the same as for log(x), given the understanding of the relationship between the natural logarithm and the common logarithm.

The probable reason behind using the natural logarithm is that its mathematical properties give simpler and more elegant solutions when dealing with exponential relationships and derivative calculations, especially in the context of optimization algorithms.

So base e aligns better with the natural logarithm and the inherent exponential relationships.
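
For instance, with the sigmoid f = 1/(1 + e^{-z}) and the natural-log loss, the derivative with respect to z collapses to f - y, with no stray 1/ln(10) factor. A minimal symbolic sketch (using SymPy; the symbol names are just illustrative):

```python
import sympy as sp

z, y = sp.symbols('z y')
f = 1 / (1 + sp.exp(-z))                           # sigmoid
loss = -(y * sp.log(f) + (1 - y) * sp.log(1 - f))  # natural-log (cross-entropy) loss

dloss = sp.diff(loss, z)
print(sp.simplify(dloss - (f - y)))                # prints 0, i.e. dloss/dz == f - y
```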

Remember, as I said earlier, this part of the derivative calculation is explained in more detail in DLS, which I am sure will tickle your mind more in case you take the Deep Learning Specialization.

Your suggestions are appreciated and will be passed on to the concerned staff to at least add some basic explanation about this to the course.

Regards
DP

Hi DP,

I’m not really understanding what you are saying.

The partial derivatives are not the same when using log(x) versus ln(x), if by log(x) you mean log_{10}(x).

You are again missing the point: Prof. Ng is using the natural logarithm instead of the common logarithm, as the derivatives of both tend to be similar.

Last thing I’m going to say on this matter:

Andrew uses log() to indicate the natural log.
All of machine learning for classification uses natural logs.
Machine learning notation does not use ln() for the natural log.
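
For example, NumPy’s np.log is the natural logarithm, and a typical logistic-loss computation uses it directly (a minimal sketch; the arrays are just illustrative):

```python
import numpy as np

# np.log is the natural log (base e), matching the log(...) used in the lectures.
y = np.array([1, 0, 1, 1])          # labels
f = np.array([0.9, 0.2, 0.7, 0.6])  # predicted probabilities

loss = -np.mean(y * np.log(f) + (1 - y) * np.log(1 - f))
print(loss)
```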

Insisting that Andrew is wrong will not change these facts.

Hi @TMosh,

Thank you for your reply in this matter.

I’m sorry to hear that you think I am “…insisting that Andrew is wrong…”.

Nothing could be further from the truth. I just think Andrew could have made it a bit clearer in his video presentations that when he uses the terminology of the common logarithm log(x), he actually means the natural logarithm ln(x).

It can be a little confusing for people like me coming into machine learning for the first time and seeing log(x) expressions and thinking “…ok, so this is a base-10 logarithm…” when in fact Andrew means a base-e logarithm.

I just think it would enhance the educational value of his presentations if he made a comment that when equations in machine learning use the common logarithm nomenclature log(x), he actually means the natural logarithm.

Best wishes,

Stephen.

Hi @Deepti_Prasad,

Please explain what the point is that I am “missing” again.

Thanks for your reply.

The partial derivatives are similar but not the same, owing to the factor \frac{1}{\ln(10)}.
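
To see that factor numerically, here is a minimal sketch assuming NumPy (the values are just illustrative):

```python
import numpy as np

def dloss_ln(f, y):
    # d/df of -( y*ln(f) + (1 - y)*ln(1 - f) )
    return -(y / f) + (1 - y) / (1 - f)

def dloss_log10(f, y):
    # d/df of -( y*log10(f) + (1 - y)*log10(1 - f) )
    return (-(y / f) + (1 - y) / (1 - f)) / np.log(10)

f, y = 0.7, 1.0
print(dloss_ln(f, y))                  # natural-log version
print(dloss_log10(f, y) * np.log(10))  # identical after scaling by ln(10)
```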

I explained the same in the handwritten explanation in my previous response: why we could use ln(x) or log(x) for the partial derivative.

Also, the derivative of log(x) = 1/(x ln(10)), and not 1/ln(10).

Sorry but I’m still a little confused over the point you are trying to make.

Perhaps you can try re-phrasing your point?

@ai_is_cool

ln(x) is the natural logarithm, meaning it has a base of e; this is denoted by ln(x) = log_e(x).

log(x) usually refers to the common logarithm, meaning it has a base of 10; this is denoted by log(x) = log_{10}(x).

Conversion between ln and log

  1. To convert “ln(x)” to “log(x)”: Use the change of base formula:
    ln(x) = log(x) / log(e)

  2. To convert “log(x)” to “ln(x)”: Use the change of base formula:
    log(x) = ln(x) / ln(10)
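
These identities are easy to verify numerically (a minimal sketch with NumPy, where np.log is the natural log and np.log10 the common log):

```python
import numpy as np

x = 42.0

print(np.log(x),   np.log10(x) / np.log10(np.e))  # ln(x)  = log(x) / log(e)
print(np.log10(x), np.log(x) / np.log(10))        # log(x) = ln(x) / ln(10)
```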

When to use ln(x) instead of log(x)

  1. Usually, when there is a natural connection with the exponential function e^x, using ln(x) often simplifies calculations and leads to cleaner results.
  2. If a problem clearly mentions a base of e, then use ln(x).

Remember also that log(x) is not always base 10; in general the base could be any positive number other than 1, as in my handwritten note where I write it as a, and in @conscell’s comment where it is written as e, i.e. log_e(x).

All,
I think we can be done with this topic now. I submitted a request to staff to add a note in the first lab where log is used and they’ve already made the change. There is now a nice bullet point in Optional Lab: Logistic Loss that explains that the notational convention is that log means the natural log.
