Hello Course Support Staff,
I have completed Week 2’s assignment, but I do not understand a couple of items in Section 4 on confidence intervals:
- Given that “A 95% confidence interval for an estimate ŝ of a parameter s is an interval I=(a,b) such that 95% of the time when the experiment is run, the true value s is contained in I”, why does util.print_confidence_intervals(class_labels, statistics) print the mean AUC with a CI labeled 5%-95%? For example: “Mean AUC (CI 5%-95%)
Cardiomegaly 0.93 (0.90-0.96)”.
Given the definition above, I would have expected to see the bootstrap estimate of the population AUC reported with a range (a,b) that is labeled as a 95% confidence interval, rather than a range labeled 5%-95%.
- Where exactly in the code is the range (a,b) calculated for a 95% confidence interval? I can see the .95 and .05 quantiles in the code, but from reading the function below in util.py I cannot work out how (min_, max_) corresponds to an interval with 95% confidence:
def print_confidence_intervals(class_labels, statistics):
    df = pd.DataFrame(columns=["Mean AUC (CI 5%-95%)"])
    for i in range(len(class_labels)):
        mean = statistics.mean(axis=1)[i]
        max_ = np.quantile(statistics, .95, axis=1)[i]
        min_ = np.quantile(statistics, .05, axis=1)[i]
        df.loc[class_labels[i]] = ["%.2f (%.2f-%.2f)" % (mean, min_, max_)]
    return df
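To make the second question concrete, here is a minimal sketch of how I read those lines. It assumes statistics is an array of shape (number of classes, number of bootstrap samples); that shape is my guess from the axis=1 calls. The data are synthetic stand-ins, not the assignment's AUCs, and percentile_interval and coverage are hypothetical helper names of my own, not functions from util.py.

```python
import numpy as np


def percentile_interval(samples, lower=0.05, upper=0.95):
    """Return (mean, low, high) per row of a (n_classes, n_samples) array."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=1)
    low = np.quantile(samples, lower, axis=1)
    high = np.quantile(samples, upper, axis=1)
    return mean, low, high


def coverage(samples, low, high):
    """Fraction of each row's values that fall inside [low, high]."""
    samples = np.asarray(samples, dtype=float)
    inside = (samples >= low[:, None]) & (samples <= high[:, None])
    return inside.mean(axis=1)


# Synthetic stand-in for bootstrap AUCs: 2 classes x 1000 resamples (made-up numbers).
rng = np.random.default_rng(0)
fake_statistics = rng.normal(loc=[[0.93], [0.85]], scale=0.02, size=(2, 1000))

mean, low, high = percentile_interval(fake_statistics)
# Share of bootstrap values between the 5% and 95% quantiles, per class.
print(coverage(fake_statistics, low, high))
```

In this sketch the range between the .05 and .95 quantiles holds about 90% of the bootstrap values, so it looks to me like a 5%-95% percentile range of the bootstrap distribution rather than an interval with 95% confidence, which is the source of my confusion.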
I would be grateful for some clarification. Thanks,
-Ananth Krishnan