How many people would you need in a group before you could be confident that at least one pair in the group share the same birthday?
One day, back in Smurfit Business School, our statistics lecturer challenged us to a bet. He predicted, confidently (smugly even), that at least two of us shared a birthday. He bet us each the princely sum of €1. I glanced around and counted close to 40 students in the room. Being the savant that I am, I also knew there are approximately 365 days in a year, and so I thought, you’re on! I mean, even allowing for some probability magic: 40 people, 365 days, this is free money!
I soon learned this was the famous birthday problem. Although I was beginning to feel cocky as we got halfway through my classmates’ birthdays, our teacher ultimately prevailed. It turns out that in a group of just 23 people the probability of a matching pair of birthdays is over 50%!
I hope this spreadsheet and the explanation below will help you understand why this is so.
- We need at least 2 people to have any chance of having a matching pair. This is trivial. Person A has a birthday on any day. The probability of Person B matching is 1/365.
- With 3 people, there are three possible matches: A matches B, A matches C or B matches C.
- With 4 people there are 6 possible pairs (count the edges in the little diagram shown here). You might spot a pattern by now. In mathematics these are known as combinations. After a while, counting manually becomes tedious but, thankfully, for any given number of people n we can use the combination formula (for pairs, n × (n − 1) / 2) to see how many possible combinations exist – jump to column B in the spreadsheet for a closer look.
- The probability of any one of these combinations being a matching pair is 1/365. Think of that like a bet: each individual combination is a bet with a 1/365 chance of winning. How many of these bets would we have to place to get at least one win?
- Here’s a neat little probability trick for answering an “at least” type question. Compute the probability of not winning at all, i.e. precisely zero wins, and subtract that value from 1.*
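- Column C in the spreadsheet uses the binomial distribution formula to compute the probability of a specific number of wins from a given number of bets, where each bet is independent and has an equal probability of success. (Strictly speaking, the pairs are not fully independent of one another, since they share people, so this is an approximation, but it is a very close one for group sizes like these.)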
- In our case we want to compute the probability of precisely zero wins and subtract this value from 1. This gives us the probability of at least one win.
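If you would rather follow the steps above in code than in a spreadsheet, here is a minimal Python sketch of the same idea. It is not the spreadsheet itself: the function names `pairs` and `p_at_least_one_match` are made up for illustration, and it assumes 365 equally likely birthdays (no leap years). For zero wins, the binomial formula collapses to (1 − p) raised to the number of bets, which is what the code uses.

```python
from math import comb

DAYS = 365  # assumption: no leap years, every birthday equally likely


def pairs(n):
    """Number of distinct pairs in a group of n people (like column B)."""
    return comb(n, 2)


def p_at_least_one_match(n, days=DAYS):
    """Approximate chance of at least one shared birthday among n people.

    Each pair is treated as an independent bet with a 1/days chance of
    winning. The binomial probability of exactly zero wins in N bets is
    (1 - p) ** N; subtracting it from 1 gives "at least one win".
    """
    p_win = 1 / days
    p_zero_wins = (1 - p_win) ** pairs(n)
    return 1 - p_zero_wins


if __name__ == "__main__":
    for n in (2, 3, 4, 22, 23, 40):
        print(n, pairs(n), round(p_at_least_one_match(n), 4))
```

Running it prints the group size, the number of pairs and the probability of at least one match, so you can watch the probability cross 0.5 between 22 and 23 people.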
In the results, we can see that 23 is the magic number where the probability of at least one match exceeds 0.5. Remember there were close to 40 in my class so my teacher knew at a glance that his probability of finding at least one pair was close to 0.9 … and there were enough suckers in the room to cover his lunch!
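As a cross-check on those figures, here is a small Python sketch of the exact calculation, which skips the pair-by-pair bets and instead multiplies out the chance that each new person misses every birthday already taken. This is a different route from the spreadsheet, not a copy of it. The function name `p_exact` is hypothetical, and it assumes 365 equally likely birthdays.

```python
def p_exact(n, days=365):
    """Exact chance of at least one shared birthday among n people.

    Assumes `days` equally likely birthdays (no leap years).
    """
    p_no_match = 1.0
    for k in range(n):
        # The (k + 1)-th person must avoid the k birthdays already taken.
        p_no_match *= (days - k) / days
    return 1 - p_no_match


if __name__ == "__main__":
    for n in (22, 23, 40):
        print(n, round(p_exact(n), 4))
```

The exact values land very close to the pair-counting ones: still just over 0.5 at 23 people and close to 0.9 at 40.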
* This little problem inversion trick can be generalized further to any occasion when we are faced with a difficult question. If you’re struggling, try inverting the question. Having difficulty predicting fraud? Maybe try predicting “not fraud”! It sounds trivial, silly even, but inverting a problem can get you out of a mental rut. For a famous example, see how statistician Abraham Wald used this technique to help the Allies win WW2.