Information Gain confusion

Hello! I hope you are doing well.
I have some confusion about the information gain in the Week 4 assignment.

  1. max_info_gain is initialized to 0. I am confused about why.
  2. The hint section says that if the information gain is greater than the maximum information gain, we should set the maximum information gain equal to the information gain. But we saw in the Exercise 3 (compute_information_gain) part that the information gain for all three features is greater than 0. How, then, does the code decide to choose feature 2 as the best split? Feature 0 and Feature 1 also have an information gain greater than 0 (but less than feature 2’s value).
    Kindly guide me. I will be extremely thankful to you.

Regards,
Saif Ur Rehman.

Hello Saif Ur Rehman,

Thank you for your question! I suggest we shift our attention to the following few lines of code and try to understand a very common programming trick for finding the maximum (or minimum) value without using a max (or min) function.

candidates = [1, 3, 9, 4, 6, 2]

max_value = -1

for value in candidates:
    if value > max_value:
        max_value = value

Would you be able to work out what this script does by “running” it in your mind, and then answer the following questions?

  1. What will be the final max_value?
  2. If we change the line max_value = -1 into max_value = 100, what will the final max_value become then?
  3. Is there any other good value to initialize max_value besides -1?
  4. What would you do if you want to find the minimum value instead of the maximum value?

If you find that you fully understand these questions and your original question in this thread, that’s fine and you don’t need to send me the answers; just let me know that you understand so we can close this topic.

However, if you still have the questions from your original post, please share your answers to Q1-3 above (excluding Q4), so that we can continue the discussion based on your understanding, which I think would be the best starting point.

Raymond

Hello Raymond! Thanks for such a detailed conversation. My answers are:

  1. max_value will be 9
  2. If max_value = 100, then the final value will be 100 because the condition value > max_value is never met
  3. I got your point. Setting to -1 covers all positive numbers.
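
For anyone reading along, the three answers above can be checked with a small sketch. The helper name find_max is made up for illustration and is not part of the assignment; float("-inf") is shown as one possible initial value that would also work if the candidates could be negative.

```python
def find_max(candidates, initial):
    """Return the running maximum, starting the comparison from `initial`."""
    max_value = initial
    for value in candidates:
        # Update only when a strictly larger value is seen.
        if value > max_value:
            max_value = value
    return max_value


candidates = [1, 3, 9, 4, 6, 2]

result_neg1 = find_max(candidates, -1)             # Q1: start below every candidate
result_100 = find_max(candidates, 100)             # Q2: start above every candidate
result_inf = find_max(candidates, float("-inf"))   # Q3: a start that is below any number

print(result_neg1, result_100, result_inf)
```

Starting from 100 never triggers an update, so the initial value is returned unchanged, which is why the choice of initial value matters.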

Thanks for helping me with this. I now understand the answer to the 2nd question in my post. But can’t we set max_info_gain = -1 instead of 0? I think we can, right?

Hello Saif Ur Rehman,

I can’t open the lab at this moment to verify, but it is probably best to keep it initialized to zero for this particular exercise, because we care only about positive gain and want to exclude zero gain. If we set it to -1, then even a zero gain will satisfy the comparison, which might not be desired. Only by setting it to zero do we make sure that only a positive gain counts.

You might want to try -1 and examine the code to see what happens. It’s important that the code works as you expect, even though the grader might not pass it.
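
Here is a minimal sketch of that experiment. It is not the lab's code: pick_best_feature is a hypothetical helper, the gain values are made up, and returning -1 to mean "no feature selected" is an assumption made for this illustration.

```python
def pick_best_feature(gains, initial_max_gain):
    """Return the index of the feature with the largest gain above the initial threshold."""
    best_feature = -1                  # assumption: -1 means "no feature selected"
    max_info_gain = initial_max_gain
    for feature, info_gain in enumerate(gains):
        # Strict comparison: a gain equal to the current maximum does not replace it.
        if info_gain > max_info_gain:
            max_info_gain = info_gain
            best_feature = feature
    return best_feature


mixed_gains = [0.03, 0.12, 0.28]   # made-up values; all positive, feature 2 is largest
zero_gains = [0.0, 0.0, 0.0]       # made-up case where no split gives any gain

best_mixed = pick_best_feature(mixed_gains, 0)        # feature 2 wins
best_zero_init0 = pick_best_feature(zero_gains, 0)    # nothing beats 0, stays -1
best_zero_init_neg1 = pick_best_feature(zero_gains, -1)  # feature 0 is picked despite zero gain

print(best_mixed, best_zero_init0, best_zero_init_neg1)
```

With all gains positive, features 0 and 1 each become the running best for a moment, but feature 2 overwrites them because its gain is larger. The two initial values only differ when every gain is zero.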

Raymond

I got it. Thanks a million, Raymond.

You are welcome Saif!

Raymond