I was just curious whether it would be better to build the decision tree with a while
loop, since the tree is described as being built recursively, to get a more general function rather than calling the split step three times in the implementation.
I would appreciate it if someone could help me out with my pseudo-code:
def build_dt(X, y, root_node_indices, max_depth=2):
    # Take care of the root node outside the while loop (initialization step; depth = 0, perhaps?)
    best_feature_root = get_best_split(X, y, root_node_indices)
    left_indices, right_indices = split_dataset(X, root_node_indices, best_feature_root)
    # Select rows (samples), assuming X has shape (n_samples, n_features)
    X_left, y_left = X[left_indices], y[left_indices]
    X_right, y_right = X[right_indices], y[right_indices]
    current_depth = 0
    while current_depth <= max_depth:
        best_feature_left = get_best_split(X_left, y_left, left_indices)
        best_feature_right = get_best_split(X_right, y_right, right_indices)
        # Only the left child of the left branch and the right child of the right branch are kept
        left_indices, _ = split_dataset(X_left, left_indices, best_feature_left)
        _, right_indices = split_dataset(X_right, right_indices, best_feature_right)
        X_left, y_left = X[left_indices], y[left_indices]
        X_right, y_right = X[right_indices], y[right_indices]
        current_depth += 1
Is this a correct implementation, or should the root node logically also be handled inside the while loop if we consider max_depth=2?
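For comparison, here is a minimal sketch of how a loop-based version could look if the recursion is replaced by an explicit stack, so that the root node and every child go through the same loop body. It is only an illustration under assumptions: `gini`, `get_best_split`, and `split_dataset` below are hypothetical stand-ins (binary features, binary labels, plain Python lists instead of arrays), not the original helpers, and the returned list of `(depth, branch, feature, indices)` tuples is an invented representation of the tree.

```python
def gini(y, indices):
    """Gini impurity of the binary labels at the given sample indices."""
    if not indices:
        return 0.0
    p = sum(y[i] for i in indices) / len(indices)
    return 2 * p * (1 - p)


def split_dataset(X, node_indices, feature):
    """Hypothetical stand-in: samples with feature == 1 go left, the rest go right."""
    left = [i for i in node_indices if X[i][feature] == 1]
    right = [i for i in node_indices if X[i][feature] != 1]
    return left, right


def get_best_split(X, y, node_indices):
    """Hypothetical stand-in: feature with the lowest weighted Gini, or -1 if none helps."""
    best_feature, best_score = -1, gini(y, node_indices)
    for feature in range(len(X[0])):
        left, right = split_dataset(X, node_indices, feature)
        if not left or not right:
            continue  # this feature does not separate the samples at this node
        score = (len(left) * gini(y, left) + len(right) * gini(y, right)) / len(node_indices)
        if score < best_score:
            best_feature, best_score = feature, score
    return best_feature


def build_dt(X, y, root_node_indices, max_depth=2):
    """Build the tree with a while loop; the root is just the first stack entry."""
    tree = []
    stack = [(root_node_indices, "root", 0)]
    while stack:
        node_indices, branch, depth = stack.pop()
        feature = -1 if depth == max_depth else get_best_split(X, y, node_indices)
        if feature == -1:
            # Leaf: depth limit reached or no split reduces impurity
            tree.append((depth, branch, None, node_indices))
            continue
        tree.append((depth, branch, feature, node_indices))
        left, right = split_dataset(X, node_indices, feature)
        # Push right first so the left child is processed first
        stack.append((right, "right", depth + 1))
        stack.append((left, "left", depth + 1))
    return tree


# Tiny made-up dataset: 4 samples, 2 binary features
X = [[1, 1], [1, 0], [0, 1], [0, 0]]
y = [1, 0, 0, 0]
for node in build_dt(X, y, [0, 1, 2, 3], max_depth=2):
    print(node)
```

In this sketch the depth is stored per node on the stack rather than in a single counter, which is what lets both children of every node be visited instead of only the outermost left and right branches.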
Cheers,