Hi, here is how you can debug it yourself.
- Click “File” > “Open” > “public_tests.py” and go to the function compute_information_gain_test. There you will find the following test, which expects an information gain of 0.311278:
```python
node_indexes = list(range(4))
result = target(X, y, node_indexes, 0)
assert np.isclose(result, 0.311278, atol=1e-6), f"Wrong information gain. Expected {0.311278} got: {result}"
```
- There you can see that the test data involves only 5 samples, of which node_indexes selects the first 4:
```python
X = np.array([[1, 0],
              [1, 0],
              [1, 0],
              [0, 0],
              [0, 1]])
y = np.array([[0, 1, 0, 1, 0]]).T
```
- Review the code provided in the exercise:
```python
# Split dataset
left_indices, right_indices = split_dataset(X, node_indices, feature)

# Some useful variables
X_node, y_node = X[node_indices], y[node_indices]
X_left, y_left = X[left_indices], y[left_indices]
X_right, y_right = X[right_indices], y[right_indices]

# You need to return the following variables correctly
information_gain = 0
```
- Open a new code cell, copy in the test data and the provided code, and run the provided code against the test data.
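If you are unsure what the split produces here, it can be reproduced by hand. This is only a sketch: it assumes the lab's split_dataset convention that samples whose feature value is 1 go to the left branch and value 0 to the right, so check it against your own split_dataset.

```python
import numpy as np

X = np.array([[1, 0],
              [1, 0],
              [1, 0],
              [0, 0],
              [0, 1]])
node_indices = list(range(4))
feature = 0

# Assumption: split_dataset sends feature value 1 to the left
# branch and feature value 0 to the right branch.
left_indices = [i for i in node_indices if X[i, feature] == 1]
right_indices = [i for i in node_indices if X[i, feature] == 0]

print(left_indices, right_indices)  # [0, 1, 2] [3]
```

So the node holds samples 0–3, the left branch samples 0–2, and the right branch sample 3 only.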
- As the exercise description explains, you need 3 entropy values (node, left branch, right branch) to calculate the information gain. Calculate them using your compute_entropy function. If you are not sure your compute_entropy is correct, verify it by recalculating the values by hand with the formula under section 4.1.
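For reference, here is a minimal standalone entropy helper using the standard binary entropy formula (it is not necessarily identical to your compute_entropy), applied to the node and the two branches of this test case, assuming the first four samples are split on feature 0 with value 1 going left:

```python
import numpy as np

def entropy(y):
    # Binary entropy: H(p) = -p*log2(p) - (1-p)*log2(1-p),
    # defined as 0 when p is 0 or 1.
    p = np.mean(y)
    if p == 0 or p == 1:
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

H_node = entropy(np.array([0, 1, 0, 1]))   # p = 1/2 -> 1.0
H_left = entropy(np.array([0, 1, 0]))      # p = 1/3 -> ~0.9183
H_right = entropy(np.array([1]))           # p = 1   -> 0.0
```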
- With those entropy values, calculate the information gain twice: once with the code you wrote in the exercise, and once by hand with the equation under section 4.3. Both should equal the expected value of 0.311278. Since your tests are failing, the two results will likely differ, and the discrepancy should give you hints on where the bug lies. Once you successfully debug it, copy the working code back to your exercise function and try the tests again.
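Putting it together, a hand calculation of the information gain for this test case (again assuming the value-1-goes-left split of the first four samples) reproduces the expected value:

```python
import numpy as np

def entropy(y):
    # Binary entropy, defined as 0 when all labels are equal
    p = np.mean(y)
    if p == 0 or p == 1:
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

y_node = np.array([0, 1, 0, 1])   # first 4 samples
y_left = np.array([0, 1, 0])      # samples with feature 0 == 1
y_right = np.array([1])           # samples with feature 0 == 0

# Information gain = H(node) - weighted average of branch entropies
w_left = len(y_left) / len(y_node)
w_right = len(y_right) / len(y_node)
info_gain = entropy(y_node) - (w_left * entropy(y_left) + w_right * entropy(y_right))

print(round(info_gain, 6))  # 0.311278
```

If your exercise code gives a different number, compare it against each intermediate value here (the two weights and the three entropies) to narrow down which step is wrong.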
- Once you pass all the tests, remove the code cell you created for this debugging work.
Good luck.
Raymond