Week 2 Assignment Section 4 Confidence Interval 95% Clarification Please

HobbyLearner · December 13, 2023, 11:30pm

Hello Course Support Staff,

I have completed Week 2’s assignment, but I am not understanding a couple of items related to section 4 on confidence intervals:

1. Given that “A 95% confidence interval for an estimate ŝ of a parameter s is an interval I=(a,b) such that 95% of the time when the experiment is run, the true value s is contained in I”, why is util.print_confidence_intervals(class_labels, statistics) printing the Mean AUC with a CI from 5%-95% (“Mean AUC (CI 5%-95%)”, for example “Cardiomegaly 0.93 (0.90-0.96)”)?

Given the definition above, I would have expected to see the estimated population AUC after bootstrap with a range (a,b) at a confidence level of 95%.

2. Where exactly in the code is the range (a,b) being calculated for a 95% confidence interval? I can see the 95% and 5% in the lines of code, but from reading the code below in util.py I cannot see how (min_, max_) is a 95% confidence interval:

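import numpy as np
import pandas as pd

def print_confidence_intervals(class_labels, statistics):
    # statistics: one row per class, one column per bootstrap sample (inferred from the axis=1 calls)
    df = pd.DataFrame(columns=["Mean AUC (CI 5%-95%)"])
    for i in range(len(class_labels)):
        mean = statistics.mean(axis=1)[i]
        max_ = np.quantile(statistics, .95, axis=1)[i]  # 95th percentile for class i
        min_ = np.quantile(statistics, .05, axis=1)[i]  # 5th percentile for class i
        df.loc[class_labels[i]] = ["%.2f (%.2f-%.2f)" % (mean, min_, max_)]
    return df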

Will be grateful for some clarity please. Thanks,

-Ananth Krishnan

paulinpaloalto · December 13, 2023, 11:57pm

I don’t know the answer to question 1, but for question 2 that is what those calls to numpy quantile are doing, right? Here’s the docpage for that. You are taking the distribution as input and computing where the 5th percentile and the 95th percentile are.
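As a rough illustration of what those two calls return, here is a minimal sketch on a tiny made-up array. The values and the layout (one row per class, one column per bootstrap sample) are assumptions for the example, not the assignment's data:

```python
import numpy as np

# Hypothetical stand-in for `statistics`: 2 classes (rows) x 5 bootstrap AUCs (columns).
statistics = np.array([
    [0.90, 0.92, 0.93, 0.94, 0.96],
    [0.70, 0.75, 0.80, 0.85, 0.90],
])

# axis=1 computes the quantile across the bootstrap samples of each row,
# so each call returns one value per class.
q05 = np.quantile(statistics, 0.05, axis=1)
q95 = np.quantile(statistics, 0.95, axis=1)

for i in range(statistics.shape[0]):
    print(f"class {i}: 5th percentile = {q05[i]:.3f}, 95th percentile = {q95[i]:.3f}")
```

So min_ and max_ in the function are simply the 5th and 95th percentiles of the bootstrap AUC values for class i.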

HobbyLearner · December 14, 2023, 12:08am

Hello Paul, thanks for the doc-page link. I had read it before posting my question, but I am still not clear on how the print_confidence_intervals function provides the range (a,b) for the bootstrap estimate of the population parameter at a 95% confidence level, as described in the text of section 4. Thanks.

Perhaps it is showing a 90% confidence interval, not 95%? Maybe that is the explanation?
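One way to check that guess is a small sketch on synthetic numbers. The bootstrap AUC values below are randomly generated stand-ins and the helper fraction_inside is hypothetical, not part of util.py. It counts what fraction of the bootstrap values fall between two quantiles:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for one class's bootstrap AUCs (not the assignment data).
boot_aucs = rng.normal(loc=0.93, scale=0.02, size=1000)

def fraction_inside(samples, q_low, q_high):
    """Fraction of samples lying between the two given quantiles (inclusive)."""
    low, high = np.quantile(samples, [q_low, q_high])
    return np.mean((samples >= low) & (samples <= high))

cover_5_95 = fraction_inside(boot_aucs, 0.05, 0.95)
cover_2p5_97p5 = fraction_inside(boot_aucs, 0.025, 0.975)

print(f"5%-95% band holds {cover_5_95:.1%} of the bootstrap values")
print(f"2.5%-97.5% band holds {cover_2p5_97p5:.1%} of the bootstrap values")
```

On this toy data the 5%-95% band holds about 90% of the bootstrap values and the 2.5%-97.5% band about 95%, which is consistent with the printed column being a 90% percentile interval. Whether that is what the assignment intended is something only the course staff can confirm.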
