Hi, here is how you can debug it yourself.

  1. Click “File” > “Open” > “” > go to the function “compute_information_gain_test”. There you will see the following test, which expects an information gain of 0.311278:
    node_indexes = list(range(4))
    result = target(X, y, node_indexes, 0)
    assert np.isclose(result, 0.311278, atol=1e-6), f"Wrong information gain. Expected {0.311278} got: {result}"
  2. There you will also see the test data. Note that X contains 5 samples, but `node_indexes = list(range(4))` means only the first 4 are passed to the function:
    X = np.array([[1, 0], 
         [1, 0], 
         [1, 0], 
         [0, 0], 
         [0, 1]])
    y = np.array([[0, 1, 0, 1, 0]]).T
  3. Review the code provided in the exercise:
    # Split dataset
    left_indices, right_indices = split_dataset(X, node_indices, feature)
    # Some useful variables
    X_node, y_node = X[node_indices], y[node_indices]
    X_left, y_left = X[left_indices], y[left_indices]
    X_right, y_right = X[right_indices], y[right_indices]
    # You need to return the following variables correctly
    information_gain = 0
  4. Open a new code cell, copy in the test data and the provided code, and run the provided code against that data.

  5. Referring to the exercise description, you need 3 entropy values (node, left branch, right branch) to calculate the information gain. Calculate them using your compute_entropy function. If you are not sure compute_entropy is correct, verify it by re-calculating the values by hand with the formula under section 4.1.

  6. With those entropy values, calculate the information gain two ways: with your code from the exercise, and by hand with the equation under section 4.3. Check both against the expected value of 0.311278. Because of the bug, the two results will likely differ, and comparing them step by step should give you hints on where the bug lies. Once you have fixed it, copy the working code back into your exercise function and run the tests again.

  7. Once you pass all the tests, remove the code cell you created for this debugging work.
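
For reference, the hand calculation in steps 5 and 6 can be sketched as a standalone script. The helper implementations below are assumptions for illustration only (a standard binary-entropy formula and a split on feature value 1, which is what the test data implies); your own compute_entropy and split_dataset may differ, and the point is to compare their outputs against these hand-computed values:

```python
import numpy as np

# Test data copied from compute_information_gain_test
X = np.array([[1, 0],
              [1, 0],
              [1, 0],
              [0, 0],
              [0, 1]])
y = np.array([[0, 1, 0, 1, 0]]).T
node_indices = list(range(4))  # only the first 4 samples are used
feature = 0

def compute_entropy(y):
    # Binary entropy, as in section 4.1: H(p) = -p*log2(p) - (1-p)*log2(1-p)
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    if p == 0 or p == 1:
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def split_dataset(X, node_indices, feature):
    # Assumed convention: samples with feature value 1 go left, 0 go right
    left = [i for i in node_indices if X[i, feature] == 1]
    right = [i for i in node_indices if X[i, feature] == 0]
    return left, right

left_indices, right_indices = split_dataset(X, node_indices, feature)
y_node, y_left, y_right = y[node_indices], y[left_indices], y[right_indices]

# The 3 entropy values from step 5
H_node = compute_entropy(y_node)    # labels [0,1,0,1]
H_left = compute_entropy(y_left)    # labels [0,1,0]
H_right = compute_entropy(y_right)  # labels [1]

# Information gain per section 4.3: node entropy minus weighted child entropy
w_left = len(left_indices) / len(node_indices)
w_right = len(right_indices) / len(node_indices)
information_gain = H_node - (w_left * H_left + w_right * H_right)
print(information_gain)
```

If your function's result disagrees with this value, print its intermediate entropies and weights and compare them term by term with the ones above.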

Good luck.