13.3 Conditional Probabilities
We can calculate conditional probabilities, or the probability of one event given another event. This allows us to evaluate the strength of inductive arguments and to calculate how much new evidence should impact our previous beliefs.
13.3.1 Probability and Conditionals

Andrey Kolmogorov (1903-1987), a twentieth-century probability theorist
What is Conditional Probability?
The conditional probability of an event P(B|A) is the probability that an event B will occur given that another event A has occurred.
Our estimation of the likelihood of an event can change if we know that some other event has occurred. For example, the probability that a rolled die shows a 2 is ⅙ without any other information, but if someone looks at the die and tells you that it is an even number, the probability is now ⅓ that it is a 2. The notation P(B|A) indicates a conditional probability, meaning it indicates the probability of one event B under the condition that we know another event A has happened. The bar “|” can be read as “given”, so that P(B|A) is read as “the probability of B given that A has occurred”.
The conditional probability of an event B given an event A is defined by the following equation:
P(B|A) = P(A & B) ÷ P(A)
In other words, the probability of B given A is equal to the probability of A and B occurring together, divided by the probability of A occurring. (Since probabilities are at most 1, dividing by a probability can only increase, or leave unchanged, the value of the resulting number.) For example, suppose that the probability P(A) that Rachel will spend the most money campaigning for Congress is 50%, and the probability P(A & B) that she will both spend the most money and win the election is 20%. Suppose we then get the information that she does spend the most money campaigning for Congress. Now, the probability that she will win the election given that she spent the most money is P(B|A) = 20% ÷ 50% = 40%.
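The definition can be checked with a couple of lines of code. Here is a minimal Python sketch of the campaign example (added for illustration; the variable names are ours, not the text's):

```python
# Conditional probability: P(B|A) = P(A & B) / P(A)
def conditional(p_a_and_b, p_a):
    """Probability of B given A, per the definition above."""
    return p_a_and_b / p_a

p_a = 0.50        # P(A): Rachel spends the most money
p_a_and_b = 0.20  # P(A & B): she spends the most money AND wins

p_b_given_a = conditional(p_a_and_b, p_a)
print(p_b_given_a)  # 0.4
```

Dividing the joint probability (20%) by the condition's probability (50%) yields the 40% figure from the text.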
Calculating Conditional Probability
Let’s work through an example of how to calculate conditional probabilities. Suppose that a coin is flipped 3 times, and we want to know the probability that at least one head occurs, given that all 3 coin flips are the same.
Let ‘H’ stand for ‘heads’ and ‘T’ stand for ‘tails’. There are 8 possible combinations of flips of the coin: {HHH, HHT, HTH, THH, TTH, THT, HTT, TTT}.
Each individual outcome has probability ⅛. Let B be the event that at least one heads occurs. Then, P(B) = ⅞, since 7 of the 8 possible combinations of flips involve at least one heads. Let A be the event that all 3 coin flips are the same, either HHH or TTT. So, P(A) = ¼. What is the probability of both B and A? Only one of those eight possibilities is one in which both B and A could occur: HHH. So the probability of both occurring P(A & B) = ⅛.
Now, the probability that at least one head occurs, given that all three coin flips are the same, is P(B|A) = P(A & B) ÷ P(A) = ⅛ ÷ ¼ = ½. (In percentages, 12.5% ÷ 25% = 50%.)
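The coin-flip calculation can be verified by brute-force enumeration. Here is a short Python sketch (added for illustration) that lists all eight outcomes and counts the relevant events:

```python
from itertools import product

# Enumerate the 8 equally likely outcomes of 3 coin flips
outcomes = [''.join(flips) for flips in product('HT', repeat=3)]

b = [o for o in outcomes if 'H' in o]             # event B: at least one head
a = [o for o in outcomes if o in ('HHH', 'TTT')]  # event A: all three the same
a_and_b = [o for o in a if o in b]                # both events: only HHH

p_a = len(a) / len(outcomes)              # 2/8 = 1/4
p_b = len(b) / len(outcomes)              # 7/8
p_a_and_b = len(a_and_b) / len(outcomes)  # 1/8

p_b_given_a = p_a_and_b / p_a
print(p_b_given_a)  # 0.5
```

Counting outcomes directly reproduces P(B|A) = ⅛ ÷ ¼ = ½.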
Independent Events
Notice that the conditional probability of B given A (½) is not equal to the unconditional probability of B (⅞). This is because A provides extra information that changes the probability that B occurs. Once we know that A occurs, that is, that all three flips of the coin are the same, the probability of B drops from ⅞ to ½. This means that A and B are dependent events. If A and B were independent events, on the other hand, then the probability of B given A would be the same as the probability of B. For instance, the probability that a coin flip was heads, given the information that you won the lottery yesterday, would still be 50%, since the coin flip and the lottery win are independent events.
13.3.2 Probability of a Conclusion Given the Premises

Blaise Pascal (1623-1662), an early developer of probability theory
Probability of the Conclusion of a Valid Argument
Now that we know how to calculate the conditional probability of an event, we know how to calculate the probability that a conclusion is true, given that the premises are true.
For a valid deductive argument, the answer is quite simple. Since there is no possibility of all of the premises being true with the conclusion false, the probability of the conclusion given the premises is 100%. The probability of the conclusion on its own is then at least as high as the probability of all the premises being true together. For instance, if the probability of Premise 1 is 40%, and the probability of Premise 2 is 20%, and the conclusion necessarily follows from Premises 1 and 2, then, assuming the two premises describe independent events, their joint probability would be 40% x 20% = 8%, so the probability of the conclusion on its own would be at least 8%. The probability of the conclusion given the premises, on the other hand, would be 8% ÷ 8% = 100%.
On the other hand, suppose we are very sure of our premises: the probability of Premise 1 is 90%, and the probability of Premise 2 is 99%. The joint probability of the premises would be 90% x 99% = 89.1%, so the probability of the conclusion on its own would be at least 89.1%. But the probability of the conclusion given the premises would still be 89.1% ÷ 89.1% = 100%.
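The fact that a valid argument always yields a conditional probability of 100% can be checked numerically. Here is a minimal Python sketch using the 90% and 99% figures above (added for illustration):

```python
# For a valid argument, the conclusion is true whenever all premises are,
# so P(premises & conclusion) = P(premises), and the ratio is always 1.
p_premise1 = 0.90
p_premise2 = 0.99
p_premises = p_premise1 * p_premise2    # joint probability, assuming independence
p_premises_and_conclusion = p_premises  # validity: conclusion follows from premises

p_c_given_premises = p_premises_and_conclusion / p_premises
print(p_c_given_premises)  # 1.0
```

Whatever the premise probabilities are, the numerator and denominator coincide, so the ratio is always 1 (that is, 100%).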
Inductive Strength
Some arguments are inductive rather than deductive, however. The premises of an inductive argument do not guarantee that the conclusion is true. For instance, here is an inductive argument:
1. Most customers who order coffee here want their coffee black.
2. Angie is a customer who ordered coffee here.
C. Angie wants her coffee black.
The argument isn’t deductively valid, because Premise 1 says most rather than all. Still, this sort of argument is very useful in ordinary life, when we often deal with situations which aren’t entirely certain or which have exceptions. We now have the ability to calculate the inductive strength of the argument, the probability that the conclusion is true given that the premises are true.
A stronger inductive argument is one where the probability of the conclusion given the premises is higher. The inductive strength of an argument is calculated this way:
inductive strength = probability of the premises with the conclusion ÷ probability of the premises
Let’s say that we are 90% sure of Premise 1 and 95% sure of Premise 2. The probability of the conclusion given the premises is equal to the probability of the conjunction of the conclusion with the premises, divided by the probability of the premises. The probability of both premises being true is 90% x 95% = 85.5%. Suppose that the probability of the conjunction of the conclusion with the premises (the percentage of the time that both premises are true and the conclusion is also true) is 60%. Then the probability of the conclusion given the premises is 60% ÷ 85.5% ≈ 70.2%.
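A quick Python sketch of the coffee example (added for illustration; the 60% joint probability is the figure stipulated above):

```python
# Inductive strength = P(premises & conclusion) / P(premises)
p_premise1 = 0.90
p_premise2 = 0.95
p_premises = p_premise1 * p_premise2  # 0.855, assuming independence
p_premises_and_conclusion = 0.60      # stipulated in the example

strength = p_premises_and_conclusion / p_premises
print(round(strength, 5))  # 0.70175
```

So the argument confers roughly a 70% probability on the conclusion, given the premises.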
Notice that what matters for inductive strength is whether the premises increase the probability of the conclusion, not whether the premises or the conclusion are themselves probable. If the premises were true, how likely would that make the conclusion? To see this, compare these two very simple arguments:
1. Space aliens will nuke earth this year. (0.0009%)
C. The economy will collapse this year.
Probability of the conclusion and premises both happening: 0.0008%
1. Congress will raise taxes this year. (75%)
C. The economy will collapse this year.
Probability of conclusion and premises both happening: 20%
The premise of the second argument is clearly far more likely than the premise of the first argument. The first argument, however, has more inductive strength. If Congress were to raise taxes this year, the economy might collapse, but it probably won't. On the other hand, if aliens were to nuke the planet this year, it is hard to imagine how the economy could fail to collapse.
The probability that the economy will collapse given that Congress will raise taxes is 20% ÷ 75% ≈ 26.7%.
The probability that the economy will collapse given that aliens will nuke earth this year is 0.0008% ÷ 0.0009% ≈ 88.9%.
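The two computations can be set side by side in a short Python sketch (added for illustration; the function name is ours):

```python
# Inductive strength = P(premises & conclusion) / P(premises)
def inductive_strength(p_premises_and_conclusion, p_premises):
    return p_premises_and_conclusion / p_premises

# Percentages from the text, written as decimals
aliens = inductive_strength(0.000008, 0.000009)  # 0.0008% / 0.0009%
taxes = inductive_strength(0.20, 0.75)           # 20% / 75%

print(round(aliens, 3))  # 0.889
print(round(taxes, 3))   # 0.267
```

The alien argument, despite its wildly improbable premise, has far more inductive strength (about 89%) than the tax argument (about 27%).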
Probability of the Conclusion together with the Premises
Of course, the argument about aliens, while it has a lot of inductive strength, still isn’t a very good argument. An inductive argument is best when it is both strong and also there is a high probability that the premises are true. For instance:
1. Congress will raise taxes this year (75%)
C. There will be more tax revenue this year
Suppose we know that this is an inductively strong argument: there is a 90% probability of the conclusion given the premises. Then it is a very good inductive argument, because it is both strong and also there is a high probability that the premises are true. What is the probability that the conclusion and the premises are both true? That would be the value of p:
Inductive Strength = Probability of premises with conclusion ÷ probability of the premises
90% = p ÷ 75%
From this it follows that:
probability of the premises with conclusion = Inductive strength x Probability of premises
p = 90% x 75%
p = 67.5%
So, the probability that Congress will raise taxes and there will be more tax revenue is 67.5%. By contrast, while the probability of an economic collapse given an alien nuclear war is very high, the probability of an economic collapse with an alien nuclear war is very low.
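The rearrangement is easy to verify in code. Here is a small Python sketch of the tax-revenue example (added for illustration):

```python
# Rearranged: P(premises & conclusion) = inductive strength x P(premises)
strength = 0.90    # probability of the conclusion given the premises
p_premises = 0.75  # probability that Congress will raise taxes

p_both = strength * p_premises
print(round(p_both, 3))  # 0.675
```

Multiplying the strength by the premise probability gives 90% x 75% = 67.5% as the probability that the premises and conclusion are true together.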
13.3.3 Probabilities Changing Over Time

Thomas Bayes (1702-1761), who discovered Bayes’ Theorem
Bayes’ Theorem
Probabilities are not fixed, because our evidence constantly changes. At one time, an event might seem very likely, until new evidence comes in that makes it seem much less likely. Bayes’ theorem allows us to see how the probability of an event changes over time.
Bayes’ theorem gives the relationship between the probabilities of two events A and B, P(A) and P(B), and the conditional probabilities of A given B and B given A. It is stated in this way:
P(A|B) = P(B|A) x P(A) ÷ P(B)
The probability of A given B is equal to the probability of B given A, times the probability of A, divided by the probability of B.
An equivalent version of Bayes’ theorem is stated this way:
P(A|B) ÷ P(B|A) = P(A) ÷ P(B)
The probability of A given B, divided by the probability of B given A, is equal to the probability of A divided by the probability of B.
Example of Bayes’ Theorem
Suppose someone told you they had a nice conversation with someone on the train. Not knowing anything else about this conversation, assuming the proportion of men and women in the city is about 50/50, the probability that they were speaking to a woman is 50%.
Now suppose they also told you that this person had long hair. Suppose that in the culture of their city, women are more likely to have long hair than men: 75% of women have long hair, but only 40% of the general population have long hair, because only 5% of men have long hair. It is now more likely they were speaking to a woman, since women in this city are more likely to have long hair than men. Bayes' theorem can be used to calculate the probability that the person is a woman.
To see how this is done, let W represent the event that the conversation was held with a woman, and L denote the event that the conversation was held with a long-haired person. The probability that W occurs is P(W) = 50%, since half the population are women, and the probability that L occurs is P(L) = 40%, since only 40% of people in the city have long hair. The probability that L occurs given W is P(L|W) = 75%, since 75% of women in the city have long hair.
Our goal is to calculate the probability that the conversation was held with a woman, given the fact that the person had long hair, or, in our notation, P(W|L). Using the formula for Bayes’ theorem, we have:
P(W|L) = P(L|W) x P(W) ÷ P(L)
P(W|L) = 75% x 50% ÷ 40%
P(W|L) = 37.5% ÷ 40%
P(W|L) = 93.75%
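The long-hair example maps directly onto Bayes' theorem in code. Here is a minimal Python sketch (added for illustration; the variable names are ours):

```python
# Bayes' theorem: P(W|L) = P(L|W) * P(W) / P(L)
p_w = 0.50         # prior: the person is a woman
p_l = 0.40         # long hair in the general population
p_l_given_w = 0.75 # long hair among women

p_w_given_l = p_l_given_w * p_w / p_l
print(round(p_w_given_l, 4))  # 0.9375
```

The evidence of long hair raises the probability that the person is a woman from the 50% prior to 93.75%.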
Changing Probabilities
Earlier in the course, we discussed the importance of adjusting beliefs in the face of new evidence. It is important not to be too steadfast (holding fast to our previous beliefs regardless of changes in evidence), but also not too open-minded (basing our beliefs only on the most recent evidence we see). Bayes' theorem can help us understand the right degree to which we should adjust our beliefs in the face of new evidence.
Suppose you wake up in the morning believing it is 40% likely that there will be heavy traffic on the road on the way to school or work that delays you by more than 10 minutes, and 60% likely that you’ll have a smooth ride without a backup. This is based on your past experience: 40% of the days you’ve faced traffic, and 60% of the days you’ve not faced traffic. We’ll say the prior probability of traffic, before any new evidence comes in, is P(T) = 40%.
Suppose now you get some new evidence: you hear a lot of sirens from emergency vehicles outside of your window. The emergency vehicles might be headed for an accident on your route, which would mean you would face traffic. But they might be headed for someplace else. How should your belief change in light of the new evidence? That is, what is the new or “posterior” probability that there is traffic, given the new information that you’ve heard emergency sirens, P(T|S)? Bayes' theorem says:
P(T|S) = P(S|T) x P(T) ÷ P(S)
That is, the probability of traffic given sirens is equal to the probability of sirens given traffic (how likely it is, if you face traffic, that there would be emergency sirens), multiplied by the prior probability of traffic, divided by the probability of a siren.
Suppose that sirens only occur on about 5% of mornings, so P(S) = 5%. There are also many causes of traffic besides accidents and emergency vehicles, however: only 10% of the time that you face traffic are there emergency sirens, since other factors usually produce the traffic, so P(S|T) = 10%. We get:
P(T|S) = 10% x 40% ÷ 5%
P(T|S) = 4% ÷ 5%
P(T|S) = 80%
So, now that you know there are sirens, you can be 80% sure that you will face traffic on your drive: the likelihood of traffic has doubled from 40% to 80%.
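The traffic example can be reproduced with a short Python sketch (added for illustration; the numbers come from the text above):

```python
# Posterior probability: P(T|S) = P(S|T) * P(T) / P(S)
p_t = 0.40          # prior probability of traffic
p_s = 0.05          # sirens occur on 5% of mornings
p_s_given_t = 0.10  # sirens occur on 10% of traffic mornings

p_t_given_s = p_s_given_t * p_t / p_s
print(round(p_t_given_s, 2))  # 0.8
```

Hearing the sirens updates the prior probability of traffic from 40% to a posterior of 80%.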
Licenses and Attributions
Key Sources:
- Watson, Jeffrey (2019). Introduction to Logic. Licensed under: (CC BY-SA).
- Modified with additions from Boundless Statistics, reused under a CC BY-SA 4.0 license.
- Andrej Nikolajewitsch Kolmogorov by Konrad Jacobs shared under license CC BY-SA 2.0 DE.