The Birthday Paradox: an interesting probability problem involving “statistically independent” events

Following up on my previous blog posting entitled “Statistical Independence,” consider the so-called “Birthday Paradox”. The Birthday Paradox pertains to the probability that in a set of randomly chosen people, some pair of them will have the same birthday. Counter-intuitively, in a group of 23 randomly chosen people, there is slightly more than a 50% probability that some pair of them will both have been born on the same day.

To compute the probability that two people in a group of n people have the same birthday, we disregard variations in the distribution, such as leap years, twins, seasonal or weekday variations, and assume that the 365 possible birthdays are equally likely.[1] We also assume that birth dates are statistically independent events. Consequently, the probability of two randomly chosen people not sharing the same birthday is 364/365. According to the combinatorial equation, the number of unique pairs in a group of n people is n!/[2!(n-2)!] = n(n-1)/2. If we treat these pairs as though they were independent of one another (strictly speaking they are not, since the pairs overlap, but the approximation is close for groups of the sizes considered here), then the probability that no pair in a group of n people shares the same birthday is approximately p(n) = (364/365)^[n(n-1)/2]. The event of at least two of the n persons having the same birthday is complementary to all n birthdays being different. Therefore, its probability is approximately p’(n) = 1 – (364/365)^[n(n-1)/2].
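As a quick check on the formula above, here is a minimal Python sketch that evaluates p’(n) = 1 – (364/365)^[n(n-1)/2] and compares it with the exact value under the same equal-likelihood assumption, 1 – (365/365)(364/365)···((365-n+1)/365). The function names are my own and purely illustrative.

```python
from math import prod

DAYS = 365  # equally likely birthdays, as assumed in the text


def p_shared_pairwise(n: int) -> float:
    """Pairwise formula from the text: 1 - (364/365)^[n(n-1)/2]."""
    pairs = n * (n - 1) // 2  # number of unique pairs among n people
    return 1 - ((DAYS - 1) / DAYS) ** pairs


def p_shared_exact(n: int) -> float:
    """Exact probability under the same assumptions:
    1 - (365/365)(364/365)...((365-n+1)/365)."""
    return 1 - prod((DAYS - k) / DAYS for k in range(n))


# Print both values side by side for a few group sizes.
for n in (10, 23, 33, 50):
    print(n, round(p_shared_pairwise(n), 4), round(p_shared_exact(n), 4))
```

For n = 23, both versions come out slightly above 50%, which is the counter-intuitive result described above.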

Given the assumptions listed in the previous paragraph, suppose that we are interested in determining how many randomly chosen people are needed in order for there to be a 50% probability that at least two persons share the same birthday. In other words, we are interested in finding the value of n which causes p(n) to equal 0.50. Therefore, 0.50 = (364/365)^[n(n-1)/2]; taking natural logs of both sides and rearranging, we obtain (ln 0.50)/ln(364/365) = n(n-1)/2 ≈ 252.652. Multiplying both sides by 2, we obtain 505.304 = n(n-1); the positive root of this quadratic equation is n ≈ 22.98, so n is approximately equal to 23.[2]
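The algebra above can be reproduced numerically. The following is a minimal Python sketch, assuming the same pairwise formula; the function name people_needed is hypothetical and only for illustration.

```python
from math import ceil, log, sqrt


def people_needed(target: float, days: int = 365) -> float:
    """Solve 1 - ((days-1)/days)^[n(n-1)/2] = target for real-valued n."""
    # Taking logs: n(n-1) = 2 ln(1 - target) / ln((days-1)/days)
    c = 2 * log(1 - target) / log((days - 1) / days)
    # Positive root of the quadratic n^2 - n - c = 0
    return (1 + sqrt(1 + 4 * c)) / 2


n_star = people_needed(0.50)
print(round(n_star, 2), ceil(n_star))  # real-valued solution, then rounded up to whole people
```

Rounding the real-valued solution up to the next whole person gives the group size of 23 cited above.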

The following graph illustrates how the probability that a pair of people share the same birthday varies as the number of people in the sample increases:

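[Graph: probability that at least one pair shares the same birthday, plotted against the number of people in the sample]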

[1] It is worthwhile noting that real-life birthday distributions are not uniform since not all dates are equally likely. For example, in the northern hemisphere, many children are born in the summer, especially during the months of August and September. In the United States, many children are conceived around the holidays of Christmas and New Year’s Day. Also, because hospitals rarely schedule C-sections and induced labor on the weekend, more Americans are born on Mondays and Tuesdays than on weekends; where many of the people share a birth year (e.g., a class in a school), this creates a tendency toward particular dates. Both of these factors tend to increase the chance of identical birth dates, since a denser subset has more possible pairs (in the extreme case when everyone was born on three days of the week, there would obviously be many identical birthdays!).

[2] Note that since 33 students are enrolled in Finance 4335 this semester, this implies that the probability that two Finance 4335 students share the same birthday is roughly p’(33) = 1 – (364/365)^[33(32)/2] ≈ 76.5%.

Statistical Independence

During yesterday’s Finance 4335 class meeting, I introduced the concept of statistical independence. During tomorrow’s class meeting, much of our class discussion will focus on the implications of statistical independence for probability distributions such as the binomial and normal distributions which we will rely upon throughout the semester.

Whenever risks are statistically independent of each other, this implies that they are uncorrelated; i.e., random variations in one variable are not meaningfully related to random variations in another. For example, auto accident risks are largely uncorrelated random variables; just because I happen to get into a car accident, this does not make it any more likely that you will suffer a similar fate (that is unless we happen to run into each other!). Another example of statistical independence is a sequence of coin tosses. Just because a coin toss comes up “heads,” this does not make it any more likely that subsequent coin tosses will also come up “heads.”

Computationally, the joint probability that we both get into car accidents or heads comes up on two consecutive tosses of a coin is equal to the product of the two event probabilities. Suppose your probability of getting into an auto accident during 2017 is 1%, whereas my probability is 2%. Then the likelihood that we both get into auto accidents during 2017 is .01 x .02 = .0002, or .02% (1/50th of 1 percent). Similarly, when tossing a “fair” coin, the probability of observing two “heads” in a row is .5 x .5 = 25%. The probability rule which emerges from these examples can be generalized as follows:

Suppose X_i and X_j are statistically independent events with probabilities p_i and p_j, respectively. Then the joint probability that both X_i and X_j occur is equal to p_i × p_j.
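To make the product rule concrete, here is a minimal Python sketch that multiplies the individual probabilities of independent events, using the auto accident and coin toss numbers from the examples above. The function name joint_probability is my own, and the rule only applies if the events really are independent.

```python
from math import prod


def joint_probability(*probs: float) -> float:
    """Probability that all of several independent events occur."""
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a valid probability: {p}")
    # Independence is what justifies multiplying the individual probabilities.
    return prod(probs)


print(joint_probability(0.01, 0.02))  # both of us have an auto accident
print(joint_probability(0.5, 0.5))    # two heads in a row with a fair coin
```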

Grade Listings on Canvas…

Although we are only 3 class meetings into the Fall 2017 semester, I have begun posting Finance 4335 grades to Canvas. If you log in to Canvas and look at your grades, you’ll see entries for “Attendance”, “Quizzes”, and “Problem Sets”. The grades listed there are current through the first two quizzes and the first problem set, and reflect the average values in each category.

As I explain in my “Class participation in Finance 4335” blog posting, your class participation grade consists of three components: 1) class attendance, 2) participation in class (by being actively engaged; e.g., asking questions, offering insights, what have you) and 3) participation outside of class (by contributing via social media (Twitter/blog)). Between now and the last day of class, only the attendance component of the Finance 4335 participation grade will be reported on Canvas. Since all grades are reported according to a [0,100] scale, the attendance score indicates the percent of classes that you have attended as of a given point in time.

Going forward, I will try my best to keep the course grade-book up-to-date as the semester progresses. Depending upon your subsequent attendance and performance on quizzes and problem sets, these items will probably change over time. Furthermore, new grade categories will eventually be included; specifically, your two midterm and final exam grades. Once all the data have been collected (i.e., some time after the final exam for Finance 4335 has been graded), I will assign a final participation grade as well as a final course numeric grade. The equation for determining the final course numeric grade and the schedule for assigning final course letter grades (based upon final course numeric grades) appear in the Grade Determination section of the course syllabus.

Problem Set 1

Hello Class,

Overall, well done on the problem set. The majority of you have a very firm grasp of the subject. For those of you who struggled, I would once again suggest studying how to take derivatives and the implications of derivatives very carefully should you wish to do well in the course. Additionally, even if you know the material well, be sure to check your work carefully for intermediate calculation errors, and make sure that you have read the question correctly so that your answer actually addresses it.

Best Wishes,

Alexander

Two upcoming extra credit opportunities for Finance 4335

Here are a couple of upcoming extra credit opportunities for Finance 4335. You may earn extra credit by attending and reporting on 1) “The Triumph of the Entrepreneurial Spirit” (scheduled for 4-5:15 in Foster 240 on Thursday, September 14), and 2) “Economic Inequality: Popular Misconceptions and Important Facts” (scheduled for 4-5:15 in Foster 240 on Thursday, October 19). In order to receive extra credit for either or both of these talks, you must submit (via email sent to risk@garven.com) a 1-2 page executive summary of what you learn from each talk. The executive summary for the September 14th talk is due by no later than 5 p.m. on Monday, September 18th and the executive summary for the October 19th talk is due by no later than 5 p.m. on Monday, October 23rd. In both cases, the extra credit will replace your lowest quiz grade in Finance 4335 (assuming the extra credit grade is higher).

Catastrophe Bonds Fall as Hurricane Harvey Bears Down on Texas

Good article from Bloomberg on how catastrophe (AKA “cat”) bonds are a unique asset class for investors and how such bonds disrupt traditional reinsurance markets. For a broader perspective on these topics, also see the August 2016 WSJ article entitled “The Insurance Industry Has Been Turned Upside Down by Catastrophe Bonds” and my blog posting entitled “Cat Bonds”.

Cat bonds represent a form of securitization in which risk is transferred to investors rather than insurers or reinsurers. Typically, an insurer or reinsurer will issue a cat bond to investors such as life insurers, hedge funds and pension funds. The bonds are structured similarly to traditional bonds, with an important exception: if a pre-specified event such as a hurricane occurs prior to the maturity of the bonds, then investors risk losing accrued interest and/or the principal value of the bonds. This is why these bonds are falling in price – investors expect that the payment triggers tied to storms like #Harvey will reduce the payments received by holders of these bonds.

Bonds tied to weather risks tumbled the most in seven months as Hurricane Harvey advances on Texas’s Gulf Coast.

Place Your Bets: When Will the U.S. Hit the Debt Ceiling?

This is an excellent article on how asset prices impound political risks, and the role of so-called “prediction markets” in assessing political event probabilities (in this case, the likelihood of the U.S. defaulting on its debt).

Prediction markets add a crowdsourced opinion to the chaos of Washington.

Finance 4335