Information gain calculation problem

I cannot understand what I am supposed to do.

This error means that your implementation of the information gain calculation is not right. What you have to do is go back and redo the function or exercise…


First of all, if I made a mistake, then how does the expected example match my result? In any case, here is the code I have written. Please take a look and help me figure out the solution.

# UNQ_C3
# GRADED FUNCTION: compute_information_gain

def compute_information_gain(X, y, node_indices, feature):
    """
    Compute the information gain of splitting the node on a given feature

    Args:
        X (ndarray):            Data matrix of shape (n_samples, n_features)
        y (array like):         List or ndarray with n_samples containing the target variable
        node_indices (ndarray): List containing the active indices, i.e. the samples being considered in this step
        feature (int):          Index of the feature to split on

    Returns:
        information_gain (float): Information gain computed
    """
    # Split dataset
    left_indices, right_indices = split_dataset(X, node_indices, feature)
#     print("left_indices:", left_indices, "right_indices:" , right_indices, "X ->", X, "feature->", feature)
    # Some useful variables
    X_node, y_node = X[node_indices], y[node_indices]
    X_left, y_left = X[left_indices], y[left_indices]
    X_right, y_right = X[right_indices], y[right_indices]
    # You need to return the following variables correctly
    information_gain = 0
    ### START CODE HERE ###

   # Moderator edit: code removed
    ### END CODE HERE ###  
    return information_gain

Please don’t post your code on the forum. The Code of Conduct does not allow it.
If a mentor needs to see your code, we’ll contact you with instructions.

The other possibility is that your compute_entropy() or split_dataset() code doesn’t work correctly.

Passing the test cases built into the notebook does not prove your code is perfect.
The data set used by compute_information_gain_test() is different, so it may expose defects in your other functions.
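For reference, the textbook formula being implemented here is: gain = H(node) − (w_left · H(left) + w_right · H(right)), where H is the binary entropy and the weights are the fraction of node samples that go to each child. A minimal standalone sketch of that formula follows; the names entropy and information_gain are illustrative, not the notebook’s graded functions, and the labels are assumed to be 0/1.

```python
import numpy as np

def entropy(y):
    # Binary entropy of a 0/1 label array; returns 0 for empty or pure nodes.
    if len(y) == 0:
        return 0.0
    p = float(np.mean(y))  # fraction of positive labels
    if p == 0.0 or p == 1.0:
        return 0.0  # a pure node carries no uncertainty
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def information_gain(y_node, y_left, y_right):
    # Gain = parent entropy minus the size-weighted entropy of the children.
    w_left = len(y_left) / len(y_node)
    w_right = len(y_right) / len(y_node)
    return entropy(y_node) - (w_left * entropy(y_left) + w_right * entropy(y_right))
```

Note that a pure node (all zeros or all ones) must return an entropy of exactly 0; without that guard, np.log2(0) produces -inf and the weighted sum becomes NaN.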


I have the same issue. If it has been resolved for you, how did you resolve it?

Hi @Hamza_AHLAOU

I faced a similar problem. The issue is that the compute_entropy function needs to cover all scenarios, such as when the dataset is completely pure.

The compute_information_gain function uses compute_entropy, and if compute_entropy doesn’t have all edge cases covered, then compute_information_gain will also fail.
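One way to catch those edge cases before the grader does is a small set of sanity checks. The sketch below is hypothetical: it assumes your compute_entropy takes a 0/1 label array and returns a float, which mirrors the notebook’s function but is not copied from it.

```python
import numpy as np

def check_entropy(compute_entropy):
    # Hedged sanity checks for a binary compute_entropy(y) implementation:
    # pure nodes and empty nodes must give 0, a 50/50 split must give 1 bit.
    assert compute_entropy(np.array([1, 1, 1])) == 0.0       # completely pure (all ones)
    assert compute_entropy(np.array([0, 0])) == 0.0          # completely pure (all zeros)
    assert compute_entropy(np.array([])) == 0.0              # empty node
    assert abs(compute_entropy(np.array([1, 0])) - 1.0) < 1e-9  # 50/50 split -> 1 bit
```

If any assertion fails, that edge case is the one your compute_entropy is missing, and it will surface again inside compute_information_gain.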


Can’t paste my code here :).
