Week 2 Assignment Section 4 Confidence Interval 95% Clarification Please

Hello Course Support Staff,

I have completed the Week 2 assignment, but I don’t understand a couple of items in Section 4 on confidence intervals:

  1. Given that “A 95% confidence interval for an estimate ŝ of a parameter s is an interval I=(a,b) such that 95% of the time when the experiment is run, the true value s is contained in I”, why is util.print_confidence_intervals(class_labels, statistics) printing the Mean AUC with a CI from 5%-95% (“Mean AUC (CI 5%-95%)
    Cardiomegaly 0.93 (0.90-0.96)”)?

Given the definition above, I would have expected the bootstrap estimate of the population AUC to be reported with a range (a,b) that corresponds to a 95% confidence level.

  2. Where exactly in the code is the range (a,b) calculated for a 95% confidence interval? I can see the 95% and 5% in the code, but from reading the function below in util.py I cannot tell how (min, max) amounts to a 95% confidence interval:
def print_confidence_intervals(class_labels, statistics):
    df = pd.DataFrame(columns=["Mean AUC (CI 5%-95%)"])
    for i in range(len(class_labels)):
        mean = statistics.mean(axis=1)[i]
        max_ = np.quantile(statistics, .95, axis=1)[i]
        min_ = np.quantile(statistics, .05, axis=1)[i]
        df.loc[class_labels[i]] = ["%.2f (%.2f-%.2f)" % (mean, min_, max_)]
    return df

I would be grateful for some clarity. Thanks,

-Ananth Krishnan

I don’t know the answer to question 1, but for question 2 that is what those calls to numpy quantile are doing, right? Here’s the docpage for that. You are taking the distribution as input and computing where the 5th percentile and the 95th percentile are.
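
As a rough sketch of what those two calls compute, assume `statistics` is a 2-D array with one row per class and one column per bootstrap sample (that shape is my assumption, and the values below are made up for illustration, not taken from the assignment):

```python
import numpy as np

# Hypothetical bootstrap results: 2 classes (rows) x 101 bootstrap samples (columns).
statistics = np.vstack([
    np.linspace(0.0, 1.0, 101),  # class 0: 0.00, 0.01, ..., 1.00
    np.linspace(0.5, 1.0, 101),  # class 1: 0.500, 0.505, ..., 1.000
])

# axis=1 computes the quantile across the bootstrap samples, once per class.
lower = np.quantile(statistics, 0.05, axis=1)  # 5th percentile of each row
upper = np.quantile(statistics, 0.95, axis=1)  # 95th percentile of each row

for i in range(statistics.shape[0]):
    print("class %d: %.3f-%.3f" % (i, lower[i], upper[i]))
```

So `min_` and `max_` in the function are just the 5th and 95th percentiles of each class's bootstrap values, indexed by `[i]` to pick out one class.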

Hello Paul - Thanks for the doc-page link. I had read it before posting my question, but I am still not clear on how the print_confidence_intervals function yields the range (a,b) for the bootstrap estimate of the population parameter at a 95% confidence level, as described in Section 4. Thanks.

Perhaps it is showing a 90% confidence interval, not 95%? The range from the 5th to the 95th percentile covers the middle 90% of the bootstrap values, so maybe that is the explanation.
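
One way to check this is a minimal sketch like the one below. It uses made-up numbers rather than the assignment's data, and `percentile_interval` is a hypothetical helper, not something from util.py. It compares the 5%-95% quantile range with the 2.5%-97.5% range that a central 95% percentile interval would use:

```python
import numpy as np

def percentile_interval(samples, confidence=0.95):
    """Hypothetical helper: central percentile interval of bootstrap samples."""
    alpha = 1.0 - confidence
    lower = np.quantile(samples, alpha / 2)
    upper = np.quantile(samples, 1 - alpha / 2)
    return lower, upper

# Made-up stand-in for bootstrap AUC values of one class.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.93, scale=0.02, size=10_000)

# What the quoted util.py code computes: 5th and 95th percentiles.
lo90, hi90 = np.quantile(samples, 0.05), np.quantile(samples, 0.95)
# A central 95% interval would use the 2.5th and 97.5th percentiles instead.
lo95, hi95 = percentile_interval(samples, confidence=0.95)

# Fraction of bootstrap values falling inside each range.
frac90 = np.mean((samples >= lo90) & (samples <= hi90))
frac95 = np.mean((samples >= lo95) & (samples <= hi95))

print("5%%-95%% range covers %.1f%% of samples" % (100 * frac90))
print("2.5%%-97.5%% range covers %.1f%% of samples" % (100 * frac95))
```

If that reading is right, changing the quantiles in the function to .025 and .975 would give a range that matches the 95% wording in Section 4.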