# Information gain calculation problem

I can not understand what I have to do.

This error means that your implementation of the information gain calculation is not correct. What you have to do is go back and rework the function or exercise…


First of all, if I made a mistake, then how does my result match the expected example? Anyway, here is the code I have written. Please take a look and help me figure out a solution.

```python
# UNQ_C3
def compute_information_gain(X, y, node_indices, feature):
    """
    Compute the information gain of splitting the node on a given feature

    Args:
        X (ndarray):            Data matrix of shape (n_samples, n_features)
        y (array like):         list or ndarray with n_samples containing the target variable
        node_indices (ndarray): List containing the active indices, i.e. the samples being considered in this step
        feature (int):          Index of the feature to split on

    Returns:
        information_gain (float): Information gain computed
    """
    # Split dataset
    left_indices, right_indices = split_dataset(X, node_indices, feature)

    # print("left_indices:", left_indices, "right_indices:", right_indices, "X ->", X, "feature ->", feature)

    # Some useful variables
    X_node, y_node = X[node_indices], y[node_indices]
    X_left, y_left = X[left_indices], y[left_indices]
    X_right, y_right = X[right_indices], y[right_indices]

    # You need to return the following variables correctly
    information_gain = 0

    ### START CODE HERE ###

    # Moderator edit: code removed

    ### END CODE HERE ###

    return information_gain
```

Please don't post your code on the forum. The Code of Conduct does not allow it.
If a mentor needs to see your code, we'll contact you with instructions.

The other possibility is that your compute_entropy() or split_dataset() code doesn't work correctly.

Passing the test cases built into the notebook does not prove your code is perfect.
The dataset used by compute_information_gain_test() is different, so it may expose defects in your other functions.
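For reference, the textbook definition being tested is: the entropy of the parent node minus the weighted entropy of the two children. The sketch below is a generic illustration of that formula only, not the assignment's hidden implementation; the function name `information_gain_sketch` and the binary-label assumption are my own.

```python
import numpy as np

def information_gain_sketch(y_node, y_left, y_right):
    # Hypothetical illustration of the standard formula:
    #   gain = H(node) - (w_left * H(left) + w_right * H(right))
    # assuming binary labels in {0, 1}.
    def entropy(y):
        if len(y) == 0:
            return 0.0
        p = np.mean(y)               # fraction of positive samples
        if p == 0.0 or p == 1.0:     # pure node has zero entropy
            return 0.0
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

    w_left = len(y_left) / len(y_node)
    w_right = len(y_right) / len(y_node)
    return entropy(y_node) - (w_left * entropy(y_left) + w_right * entropy(y_right))
```

Checking it on tiny hand-verifiable cases (a perfect split of `[0, 0, 1, 1]` should give a gain of 1.0; an uninformative split should give 0.0) is a quick way to spot defects before running the hidden test.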


I have the same issue. If it has been resolved for you, how did you resolve it?

I faced a similar problem. The issue is that the compute_entropy function needs to cover all scenarios, such as when the dataset is completely pure.

The compute_information_gain function uses compute_entropy, and if compute_entropy doesn't have all the edge cases covered, then compute_information_gain will also fail.
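To make the edge cases concrete, here is a minimal sketch of a binary entropy function that handles them (a hypothetical helper, not the assignment's compute_entropy): an empty node and a completely pure node both need an explicit early return, because `np.log2(0)` is `-inf`.

```python
import numpy as np

def compute_entropy_sketch(y):
    # Binary entropy of a label array, with the edge cases covered.
    if len(y) == 0:              # empty node: define entropy as 0
        return 0.0
    p1 = np.mean(y)              # fraction of positive (label 1) samples
    if p1 == 0 or p1 == 1:       # pure node: no uncertainty, avoid log2(0)
        return 0.0
    return -p1 * np.log2(p1) - (1 - p1) * np.log2(1 - p1)
```

Without the two guard clauses, a pure or empty node produces `nan`/`-inf`, which then propagates into the information gain and fails the hidden test.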

Regards
Nagesh

Can't paste my code here :).
