Seemingly incorrect analytical solution to Problem 4 of Birthday Lab

Hi guys. I’ve been wrapping my head around the analytical solution to Problem 4 in the Birthday Problem Lab. I wrote a simple Python script to verify it and the analytical solution seems incorrect.

First of all, the lab says " given two classrooms with n students, what is the value of n such that the probability of having a match is greater than or equal to 0.5 for any two students in each classroom?" Does it mean there are n students in each classroom or the total number in the two classroom is n? My understanding is the former.

For simplicity let’s assume there are 2 people in each classroom and there are 5 days in a year instead of 365. The following script shows that the correct probability is 0.584, while the analytical solution in the lab gives 0.5904.

I also have doubts regarding the fourth problem but with a different view.

Let’s say there are two classrooms A and B, each with n students. As the prompt doesn’t specify that students in classroom A can’t have a match among them, I don’t understand why we have to calculate (1-1/365)^n.

My approach:
In order to let students in B classroom to have a match with students from A, supposing the n students take n slots in 365 days, each student from B has probability of (n/365) to have a match with A students. We can have the probability that every B student has a match with A students, which is (n/365)^n. Therefore, the final answer to “find a match between a student in one classroom and a student in the other” should be 1-(n/365)^n.

Please let me know if I go wrong anywhere.

Hi Jason, I guess you mean you don’t understand why we have to calculate (1-1/365)^(n^2) in your last sentence instead of (1-1/365)^n. I don’t think the solution is correct but I can explain my understanding below.

First of all, my assumption is that there are n students in each classroom. The idea of the solution of the lab is that for each person in group A, the probability that no student in group B (there are n students in it) matches with him/her is (1 - 1/365) ^ n ( this n is the number of students in B). Please note that this is exactly the idea of Problem 1.

Now, because there are n students in group A, the solution claims that we have to repeat the above process n times to produce the probability of no match between A and B. The solution suggests we have to calculate (1 - 1/365) ^ n * (1 - 1/365) ^ n * (1 - 1/365) ^ n … * (1 - 1/365) ^ n (n multipliers) = (1 - 1/365)^(n^2). (This is where I believe it goes wrong, because the calculation for student 1 in group A is not independent from the calculation for student 2)

The solution continues to state that applying the complement rule we get the probability of at least one match between A and B is 1 - (1 - 1/365)^(n^2).

My approach:
In order to let students in B classroom to have a match with students from A, supposing the n students take n slots in 365 days, each student from B has probability of (n/365) to have a match with A students. We can have the probability that every B student has a match with A students, which is (n/365)^n. Therefore, the final answer to “find a match between a student in one classroom and a student in the other” should be 1-(n/365)^n.

I guess you assume that any two students in group A don’t have the same birthday and this also applies to B. This assumption makes sense and clarifies Problem 4, although there seems to be an error in your subsequent calculation (correct me if I’m wrong).
My calculation is: the probability that student 1 in group B doesn’t match with anyone in group A is (1 - n/365), the probability that student 2 in group B doesn’t match with anyone in group A is also (1 - n/365), and so on. So the probability that there is no match between A and B would be (1 - n/365)^n. Applying the complement rule the probability that there is at least one match between A and B is 1 - (1 - n/365)^n.

Hey pang,

Thanks for pointing out. You are right, and the answer to the problem based on my understanding should indeed be 1 - (1 - n/365)^n.

But now I actually have doubts on my own version, because I don’t think n students in A could just take n different dates in a year. In opposite to your argument, I think each student in A, or to be more accurate, the events that B students have a match with any of the students in A are actually independent from each other.

Let’s say the probability of students named B1…Bn having a match with A1 is P. The probability of students B1…Bn having a match with A2 should also be P, no matter A1 or A2 have the same date or not. The date of A1 will have no influence on the date of A2, as we don’t care if there are matches among A students themselves or not.

Please advise, peers and mentors.