Probability

Learn how to model and predict the world around you – from card games to medicine.


What is Probability?

Have you ever wanted to predict an event, like who will win the election, or how likely you can flip a coin and get heads three times in a row?

Probability helps you answer such questions. While many events can’t be predicted with complete certainty, when done correctly, probability methods enable us to predict the likelihood of events with enough reliability to inform our decision making.

Probability values always range from 0 to 1, where 0 is ‘never going to happen’ and 1 is ‘you bet, it will definitely happen’. In the middle, we have 0.5, which is an even chance of the event happening, or not happening.

When it comes to predicting events, probability is just a guide. For example, strictly speaking, if you flip a coin 10 times you should expect to get heads on 5 of those flips. Go ahead and try it though, because in reality it doesn’t always work out this way. You might get 7 heads, or even 10.

Probability is used to make predictions in fields like weather forecasting, medicine, insurance, and more.

Introduction to frequentist probability

Have you ever wondered how likely it is that something will happen? Maybe you wondered how likely it was you could flip a coin and get heads 5 times in a row? It’s possible you were even trying to win a bet. Probability can help you with that.

The two main schools of probability thought are frequentist probability, and Bayesian probability. Here we will focus on the frequentist way of thinking.

Frequentist methods use large samples and statistical methods to generate probability distributions, and from those, make predictions about the probability of an event. For example, to find the probability of getting heads when flipping a coin, a frequentist would flip that coin 100,000 times and count how many times it landed on heads.
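
As a rough sketch of that frequentist recipe in Python (assuming, purely for illustration, a fair coin simulated with the standard random module):

```python
import random

random.seed(42)      # make the demonstration repeatable
n_flips = 100_000    # a frequentist wants a large number of trials

# Simulate the flips: 1 represents heads, 0 represents tails
heads = sum(random.randint(0, 1) for _ in range(n_flips))

# The frequentist estimate of P(heads) is the observed relative frequency
print(heads / n_flips)   # prints a value very close to 0.5
```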

When you want to find a general trend in data that you’ve already collected, frequentist statistics typically produces more reliable results than Bayesian statistics. Frequentist statistics is also typically preferable when you need to test multiple hypotheses at once – for example, when determining whether there is enough evidence to support one of two competing hypotheses.

Introduction to bayesian probability

Bayesian probability can be used to estimate the likelihood of an event based on your current knowledge and prior beliefs. As an example, if you know that people with red hair are more likely to be left-handed, then you can use Bayes’ Theorem to estimate the likelihood of left-handedness given the observed data. Bayesian probability provides a framework for understanding how chance affects outcomes in complex systems.

As an example, when flipping a coin, a Bayesian would say that we know that there are two possible outcomes, so the chance of getting heads is 50% – we don’t need to flip the coin 100,000 times.

Bayesian probability actually goes much deeper than that, though: it involves forming prior assumptions based on known observations or estimates, and then updating those probability estimates as new evidence comes in.

For example, Bayesian probability tells you the chance that you have a disease, given that you tested positive for it. This takes into account evidence, such as the base rate of the disease in the population, as well as the test’s accuracy. Bayesian statistics is useful for things like rare diseases, where finding large samples for frequentist methods is just not possible.
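
To make that concrete, here is a minimal sketch of the calculation, using made-up numbers – a 1% base rate, a 95% true-positive rate, and a 5% false-positive rate are assumptions for illustration only:

```python
# Hypothetical inputs (illustrative only)
p_disease = 0.01              # base rate of the disease in the population
p_pos_given_disease = 0.95    # probability of a positive test if diseased
p_pos_given_healthy = 0.05    # probability of a positive test if healthy

# Total probability of testing positive (law of total probability)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' Theorem: P(disease | positive test)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))   # ~0.161, far lower than the test accuracy suggests
```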

Unions, Complements, and Intersections

Intersections are when two events occur together. They are typically denoted by the intersection symbol, with the common way of expressing an intersection’s probability shown below:

P(A ∩ B)

But sometimes you might see it written as P(A and B) or just P(AB). The intersection is the overlapping region of the two circles in a Venn diagram. An example is the probability that a card drawn from a deck is both red and a queen.

Unions are when either event A, or event B, or both occur. They are typically denoted by the union symbol, with the common way of expressing a union’s probability shown below:

P(A ∪ B)

The union is visualized as the entire area of the two circles in a Venn diagram, minus the area of the intersection (so the overlap isn’t counted twice). An example would be drawing a red card or a queen. The equation to find the union looks like this:

P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
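
Plugging the card example into that equation – 26 red cards, 4 queens, and the 2 red queens in a standard 52-card deck – a quick sketch looks like this:

```python
# Counts from a standard 52-card deck
p_red = 26 / 52          # P(A): draw a red card
p_queen = 4 / 52         # P(B): draw a queen
p_red_queen = 2 / 52     # P(A ∩ B): draw one of the two red queens

# P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
p_union = p_red + p_queen - p_red_queen
print(round(p_union, 3))   # 28/52, roughly 0.538
```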

Complements

Complements are all of the outcomes outside of A, and are typically signified by A’. Because the probabilities across the whole sample space add up to 1, the probability of A’ is equal to 1 – P(A).

To take a useful medical example, imagine a union between diabetes and hypertension – the probability that a person has neither of these is 1 – P(A ∪ B). In this way, by knowing the probability of an event, or events, you can easily calculate the probability of an event not occurring, too.

As an example, if A is the probability of you drawing a red card from a deck of cards, then A’ is the probability that you draw anything other than a red card.

Given there are 26 red cards in a 52-card deck, the probability you draw one is 26/52, which equals 0.5 – so P(A’) = 1 – 0.5 = 0.5.
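
In code, the complement rule for the red-card example looks like this; the diabetes and hypertension figures below are hypothetical, chosen only to show the same rule applied to a union:

```python
# Complement of drawing a red card: P(A') = 1 - P(A)
p_red = 26 / 52
print(1 - p_red)                 # 0.5

# Hypothetical medical figures (illustrative only):
# P(diabetes) = 0.10, P(hypertension) = 0.30, P(both) = 0.05
p_union = 0.10 + 0.30 - 0.05     # P(A ∪ B)
print(round(1 - p_union, 2))     # 0.65 – probability of having neither
```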

Introduction to Conditional Probability

Conditional probability is perhaps easiest to visualize first – you’ve come across a Venn diagram before, right? Well, let’s just wrap that in a bigger circle, which represents our sample space of all possible outcomes.

When it comes to conditional probability, we are narrowing our potential set of outcomes. Given that B has occurred, our sample space shrinks to B alone and excludes everything outside it. It also reduces the region in which A can occur, which is why only the small overlap between A and B remains in play.

If we want to know the probability of A occurring, given that B has already occurred, the formula looks like this, where the numerator is the intersection of events A and B – the chance they happen together:

P(A|B) = P(A ∩ B) / P(B)

P(A|B) means the probability of event A occurring, given that B has already occurred.

As an example, what is the probability that somebody is a Kinnu user given that you know they prefer cats? If 36 people prefer cats, and 12 of those are also Kinnu users, then P(Kinnu user|prefers cats) = 12/36 = 33.3%.
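
In other words, the calculation only needs the counts from the example – 36 people who prefer cats, 12 of whom are also Kinnu users:

```python
# Counts from the example above
cat_lovers = 36          # people who prefer cats
kinnu_and_cats = 12      # people who prefer cats AND use Kinnu

# P(Kinnu user | prefers cats) = P(both) / P(prefers cats)
p_kinnu_given_cats = kinnu_and_cats / cat_lovers
print(round(p_kinnu_given_cats, 3))   # ~0.333, i.e. 33.3%
```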

Conditional probability calculations

Conditional probability is the chance that one event will happen, given that another event has already occurred. It is calculated via the formula below.

P(A|B) = P(A ∩ B) / P(B)

Let’s say that 50% of Kinnu users study data science, and 30% study history. 15% study both history and data science. Yesterday you met a random Kinnu user on the street, and you wondered if they do the history pathway like you do. They mentioned that they take data science, but they didn’t mention anything about history. So what are the chances they do?

We know the probability they take history is 0.3 – let’s call that event B. And the probability they are learning data science is 0.5 – that’s event A. The probability they do both is 0.15 – which goes into the numerator of our formula above, as the intersection.

P(B|A) = P(A ∩ B) / P(A) = 0.15 / 0.5 = 0.3

So, it turns out there is a 30% chance that your new friend takes history as well. Next time you see them you can ask about it, but it’s not very likely that they do.
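
The same calculation, written as a small reusable sketch (the helper function name is just for illustration):

```python
def conditional_probability(p_a_and_b, p_a):
    """P(B | A) = P(A ∩ B) / P(A)."""
    return p_a_and_b / p_a

p_data_science = 0.5   # P(A): a Kinnu user studies data science
p_both = 0.15          # P(A ∩ B): they study both data science and history

print(conditional_probability(p_both, p_data_science))   # 0.3
```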

Mutually Exclusive Events

Sometimes also called ‘disjoint events’, mutually exclusive events just cannot happen together. For example, imagine that you flip a two-sided coin once. It’s impossible for a single flip to come up both heads and tails. Let’s take it one step further, though: instead of two possible events, you now have three. You have a handful of delicious, tasty, and colorful M&Ms. Yum.

For simplicity, let’s say we only have three colors left, somebody ate all the rest – not the Kinnu team, I swear! We know that there are 10 M&Ms total – 3 red, 3 green, and 4 blue.

That means our probabilities of picking each are 0.3, 0.3, and 0.4 – but what if we want to know our probability of picking a red OR a green M&M? When we have mutually exclusive events, it’s easy to calculate the probability of a problem like this. All we do is add up the two probabilities like so:

P(red ∪ green) = P(red) + P(green) = 0.3 + 0.3 = 0.6

The ∪ symbol that you see means union – either A or B occurs. Because mutually exclusive events can never occur together, their probabilities simply add.
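
As a quick check of that addition rule with the M&M counts (3 red, 3 green, and 4 blue out of 10):

```python
# M&M counts: 3 red, 3 green, 4 blue (10 in total)
p_red = 3 / 10
p_green = 3 / 10

# Mutually exclusive events: P(A ∪ B) = P(A) + P(B)
print(p_red + p_green)   # 0.6
```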

Frequentist Statistics

A simple example to understand frequentist statistics is to imagine that we didn’t know the probabilities associated with a coin flip. Maybe the coin is weighted more heavily on one side, which changes the odds? How would we uncover the underlying probability?

Frequentists rely purely on the ‘what is’ of data – meaning, what has already been observed. To a frequentist, probability is the observed frequency of different outcomes across a series of controlled trials. To turn those frequencies into estimates, frequentist statistics uses something called Maximum Likelihood Estimation (MLE).

P(heads) estimate = number of heads / total number of flips

A frequentist coin flip experiment involves hundreds, thousands, or even hundreds of thousands of coin flips – each time the result is recorded, and at the end of the experiment the total number of heads is divided by the total number of coin flips. That is our estimate of the probability of flipping heads, and only after a large number of trials can we estimate it with confidence.
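
A minimal simulation of such an experiment, assuming (purely for illustration) a coin secretly weighted to land heads 60% of the time:

```python
import random

random.seed(0)
true_p_heads = 0.6     # hidden bias, assumed only for this demonstration
n_flips = 100_000

# Record every flip: True means heads
flips = [random.random() < true_p_heads for _ in range(n_flips)]

# Maximum Likelihood Estimate of P(heads): total heads divided by total flips
p_hat = sum(flips) / n_flips
print(p_hat)           # very close to 0.6 after this many flips
```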

But frequentist probability has its limitations, because it requires that events be repeatable. Some things just aren’t repeatable, like elections, or a World Cup soccer match. Sure, elections come around regularly, and a World Cup is played every 4 years – but it’s never the same election or the exact same World Cup match. For events like these, a Bayesian approach is more reliable.
