Hello everyone,
I’m currently facing a challenge with the C3_W1_Assignment.
My main issue revolves around this particular line of code, which is causing me some trouble:
treatment_model, best_hyperparam_treat = holdout_grid_search(RandomForestClassifier, X_treat_train, y_treat_train, X_treat_val, y_treat_val, hyperparams)
The problem is that the X_treat_train dataset no longer contains the TRTMT column, as it was removed by the treatment_dataset_split function.
However, the c_statistic function (the estimator_score), which is a part of my holdout_grid_search function, requires the TRTMT column to identify the best model.
Here are the parameters needed for the function c_statistic:
c_statistic(pred_rr, y, w, random_seed=0)
where w is an array of true treatments.
The conundrum deepens because, even if X_treat_train included the TRTMT column, filled entirely with ones, it wouldn’t be feasible to split it into two arrays within the c_statistic function (one with TRTMT==1 and another with TRTMT==0), since we only have TRTMT==1 data.
Consequently, the c_for_benefit_score function, which is a component of c_statistic, fails to function as intended.
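To make the problem concrete, here is a minimal sketch with made-up numbers. It is not the assignment's code: split_by_treatment is a hypothetical helper written only for illustration, and it assumes that c_statistic separates the predictions by treatment arm in roughly this way. The names pred_rr and w follow the c_statistic signature above.

```python
# Toy illustration only: the values below are invented, and
# split_by_treatment is a hypothetical helper, not part of the assignment.

def split_by_treatment(pred_rr, w):
    """Split predictions into treated (w == 1) and control (w == 0) groups."""
    treated = [p for p, t in zip(pred_rr, w) if t == 1]
    control = [p for p, t in zip(pred_rr, w) if t == 0]
    return treated, control

# Full dataset: both arms are present, so both groups are non-empty.
pred_rr_full = [0.10, -0.05, 0.20, 0.00]
w_full = [1, 0, 1, 0]
treated_full, control_full = split_by_treatment(pred_rr_full, w_full)

# Treatment-only subset (the situation with X_treat_train): w is all ones.
pred_rr_treat = [0.10, 0.20]
w_treat = [1, 1]
treated_only, control_only = split_by_treatment(pred_rr_treat, w_treat)

print(len(treated_full), len(control_full))   # both groups populated
print(len(treated_only), len(control_only))   # control group is empty
```

With the treatment-only subset the control group comes back empty, so there are no control patients to compare the treated patients against.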
This has left me quite perplexed, and I would greatly appreciate any guidance or assistance on this matter.
Best regards,
Alex