Understanding calculation of p-value

Week 4- Video (p-value)

Please see the attached snapshot. I don't understand how the orange area is calculated to be 0.0332. Can someone explain the steps to calculate it? The snapshot shows a right-tailed test; I need the same help for the left-tailed and two-tailed tests.

Thanks in advance.

It’s been some time, so I asked ChatGPT

What is the probability of getting a sample mean above 68.442 if the true population mean is 66.7 and the values sampled follow a normal distribution around 66.7 with std. dev. 3? We collected a sample of 10 values.

ChatGPT answered:

We need to calculate the probability that the sample mean \bar{X} is greater than 68.442, given:

  • Population mean: \mu = 66.7
  • Population standard deviation: \sigma = 3
  • Sample size: n = 10

(We need to find out how the sample mean is distributed. We already know it is normally distributed around 66.7; what remains is its standard deviation.)

Step 1: Compute the Standard Error of the Mean

Since we are dealing with the sample mean, the standard deviation of the sample mean (Standard Error, SE) is:

SE = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{10}} \approx \frac{3}{3.162} \approx 0.9487

Step 2: Compute the Z-score

The Z-score for \bar{X} = 68.442 is:

Z = \frac{\bar{X} - \mu}{SE} = \frac{68.442 - 66.7}{0.9487} = \frac{1.742}{0.9487} \approx 1.836

(the Z-score expresses how many standard deviations a value lies from the mean; here the value is the sample mean and the standard deviation is the SE. Any quantity can be standardized this way, but looking the result up in a standard normal table is valid only when the quantity is normally distributed)

Step 3: Find the Probability

Using a standard normal table or calculator:

P(Z < 1.836) \approx 0.9668

Since we need \bar{X} > 68.442, we take the complement:

P(Z > 1.836) = 1 - 0.9668 = 0.0332

Final Answer:

P(\bar{X} > 68.442) \approx 0.0332

So, the probability is 3.32%. This means that if the true population mean is 66.7, there is only a 3.32% chance that a random sample of 10 values would yield a sample mean above 68.442. This is the orange area of 0.0332 in the snapshot.

That sounds right.
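
For anyone who wants to check the arithmetic without a table, here is a minimal Python sketch using only the standard library. The helper name `normal_sf` and the variable names are my own choices for illustration; the numbers are the ones from the question.

```python
import math

def normal_sf(z: float) -> float:
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

mu = 66.7        # population mean under the null hypothesis
sigma = 3.0      # population standard deviation
n = 10           # sample size
x_bar = 68.442   # observed sample mean

se = sigma / math.sqrt(n)   # standard error of the mean
z = (x_bar - mu) / se       # z-score of the observed sample mean
p_right = normal_sf(z)      # right-tailed p-value, P(Z > z)

print(f"SE = {se:.4f}, z = {z:.3f}, p = {p_right:.4f}")
```

Run as written, this should give SE of about 0.9487, z of about 1.836 and a right-tail probability of about 0.0332, which is the orange area in the snapshot.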

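Here p = P(Z < z), the value read from the table. For a left-tailed test the p-value is p itself instead of 1-p; for a two-tailed test it is 2*(1-p) when z is positive (in general, 2*min(p, 1-p)).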
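
The three cases can be put side by side in a short sketch. `normal_cdf` and `p_value` are hypothetical helper names, not from the course material, and the z-score is the rounded value from the example above.

```python
import math

def normal_cdf(z: float) -> float:
    """P(Z < z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def p_value(z: float, tail: str) -> float:
    """p-value of a z-test; tail is 'right', 'left' or 'two'."""
    p = normal_cdf(z)
    if tail == "right":
        return 1.0 - p               # area to the right of z
    if tail == "left":
        return p                     # area to the left of z
    if tail == "two":
        return 2.0 * min(p, 1.0 - p) # both tails, symmetric around 0
    raise ValueError(f"unknown tail: {tail!r}")

z = 1.836  # z-score of the sample mean 68.442 (rounded)
for tail in ("right", "left", "two"):
    print(tail, round(p_value(z, tail), 4))
```

Using min(p, 1-p) for the two-tailed case makes the same function work when the sample mean falls below the hypothesized mean and z is negative.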

Dear @shaiqbashir,

Now that you have the correct way to calculate it, here is a reminder of the underlying concept of the p-value, which might not be intuitive at the beginning. Once your brain is trained on this concept, it becomes much easier.

So you know that the average height is 66.7; let's say this is well known. The problem is that one day you decide to make some observations, and you find that their average is 68.442.

Question: is this normal or not? And here you need the p-value because:

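The p-value tells you how “normal” or expected your observation (68.442) is if your initial assumption (the null hypothesis of 66.7) is true. More precisely, it is the probability of getting a result at least as extreme as the one you observed when the null hypothesis holds.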

So, a high p-value means the data is normal: it looks like what you’d expect under the null hypothesis of 66.7. That gives you no strong reason to reject the null hypothesis. In fact, it suggests your initial belief could be perfectly fine (even though it doesn’t prove it’s true).

Now, here is what a p-value means visually, assuming the random variable (the height here) follows a normal distribution.

p-values are linked to the tails. Why? Because the tails represent the extreme values that are far from the average.

So after drawing the normal curve based on your null hypothesis (average is 66.7), you can see where your observed average of 68.442 stands:

  • your data point (>68.442) is close to the center of the distribution → This data is pretty normal under the null hypothesis → Large p-value
  • your data point (>68.442) is in the tails (extreme/unusual) → This data is rare → Small p-value

Actually, instead of judging by eye, you calculate the p-value and set a significance level, typically 5%.

  • If the p-value is more than 5%, it is large, so getting 68.442 as an observation is normal under an assumption of 66.7. There is no strong reason to reject the null hypothesis (66.7): we keep 66.7.
  • If the p-value is less than 5%, it is small, so getting 68.442 as an observation is extreme/unusual under an assumption of 66.7. So we reject the null hypothesis (66.7).
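
The decision rule in these two bullets could be sketched as follows. `decide` and `alpha` are names made up for illustration; the first p-value is the right-tailed 0.0332 from the snapshot, and the second is twice that, as it would be for a two-tailed test.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with the significance level alpha."""
    if p_value < alpha:
        return "reject H0"          # observation is unusual under H0
    return "fail to reject H0"      # observation is compatible with H0

print(decide(0.0332))  # right-tailed p-value from the example
print(decide(0.0664))  # two-tailed p-value for the same data
```

Note that with the same data the two-tailed p-value is above 5%, so the conclusion depends on which kind of test you are running.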

Bonus: why is the default hypothesis (66.7) called “null” in statistics?
The word “null” comes from Latin nullus, meaning “none” or “zero.”
And that fits perfectly, because the null hypothesis usually says there’s no effect, no difference, or no relationship (for example: no effect of a drug, no change over time…). It’s the baseline, a kind of “zero-effect” model we try to disprove. The idea is that “nothing interesting is happening.”

Hope it helps… Keep learning.

Learning new things is good for your health, both mental and even physical. This is scientifically proven!

Kader.
