Why “Correlation Isn’t Causation” (and How to Tell the Difference in Everyday Claims)
Two lines moving together can be a clue—or a trap. Use this toolkit to test causal stories for confounding, reverse causation, and bias.

Key Points
- 1. Define the claim precisely: correlation means X and Y moved together, while causation means changing X would change Y, all else equal.
- 2. Test the four explanations—X→Y, Y→X, confounding, or selection/measurement artifacts—before accepting any causal headline at face value.
- 3. Demand stronger support: clear temporality, plausible mechanism, robustness, and converging evidence across methods beat a single impressive-looking correlation.
A headline, a chart, and the story our brains want to tell
The joke is older than most of our news cycles, but it endures because it captures a stubborn habit: we see two things moving together and instinctively reach for a story about why. The human mind prefers causes to coincidences, narratives to noise. Statistics, meanwhile, does not care what we prefer.
“Correlation isn’t causation” has become a slogan precisely because it is so often needed. But the phrase is also misused—wielded as a conversation-stopper whenever evidence is inconvenient. Used well, it is neither a sneer nor a shrug. It’s a prompt to ask a sharper question: What would happen to Y if we changed X—while holding everything else constant?
Correlation isn’t a verdict. It’s a clue that demands interrogation.
— TheMurrow Editorial
Correlation vs. causation: what the slogan actually means
Correlation means that X and Y move together in the observed data: a pattern, not a reason. Causation is a different claim: changing X would change Y (at least on average), “all else equal.” Modern causal inference often frames that as an intervention or counterfactual: What would Y have been if X were different? The Rubin causal model formalizes this intuition in the language of potential outcomes—one outcome if a person receives the exposure (or treatment), another if they don’t. Only one of those outcomes is ever observed, which is why causal claims are hard work rather than casual commentary.
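The potential-outcomes idea fits in a short sketch. All numbers below are invented for illustration: each simulated unit carries both outcomes on paper, random assignment reveals only one, and the treated-versus-control gap recovers the built-in effect.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical population in potential-outcomes terms: every unit carries
# TWO outcomes, y0 (if untreated) and y1 (if treated), but only one is
# ever observed in real data.
units = []
for _ in range(10_000):
    y0 = random.gauss(10, 2)
    y1 = y0 + 3  # the true average causal effect is +3 by construction
    units.append((y0, y1))

# Random assignment reveals exactly one potential outcome per unit...
observed = [(random.random() < 0.5, y0, y1) for y0, y1 in units]

# ...and the treated-vs-control difference then recovers the true effect.
treated_mean = mean(y1 for t, y0, y1 in observed if t)
control_mean = mean(y0 for t, y0, y1 in observed if not t)
effect_estimate = treated_mean - control_mean
print(round(effect_estimate, 2))  # close to the built-in effect of 3
```

The point of the sketch is the bookkeeping: the "missing" outcome exists in the simulation but never in reality, which is exactly why randomization (or a careful substitute for it) is needed.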
Wikipedia’s entry on “correlation does not imply causation” traces the slogan’s persistence to a familiar logical error: the questionable-cause fallacy, where co-occurrence is treated as proof of a causal chain. Newsrooms see it every week, especially in health, lifestyle, education, and business reporting—domains where randomized experiments are difficult, expensive, or ethically impossible.
Yet the most important nuance is the one the slogan leaves out. “Not proof of causation” does not mean “evidence is meaningless.” A widely cited epidemiology discussion in BMJ argues that causal conclusions often rest on converging evidence—multiple imperfect studies pointing in the same direction—when randomized trials can’t be done. Correlations can be valuable signals. The discipline is learning what they can, and cannot, support.
The four reasons a correlation appears (and the one people forget)
1) X causes Y
2) Y causes X (reverse causation)
3) A third factor causes both X and Y (confounding)
4) Artifacts of selection, measurement, or aggregation create or distort the relationship (biases and paradoxes)
People tend to jump straight to the first explanation because it’s narratively satisfying. A cause gives you agency: avoid the ice cream, prevent the shark attack. Reality is less obliging.
The fourth category—selection and measurement artifacts—often surprises readers because it feels like cheating. How can the data show a relationship that isn’t “real”? The uncomfortable answer: the relationship can be manufactured by who gets counted, how variables are measured, and which slices of the population you’re looking at. A correlation can be statistically “real” and still misleading as a guide to cause.
The diagnostic payoff is straightforward and powerful. This four-part checklist does something crucial: it slows you down. It forces a second question after “Is it correlated?”—namely, “Which of the four explanations is most plausible, and what evidence would separate them?” That shift—from verdict to diagnosis—is the heart of statistical literacy.
Most bad causal claims aren’t lies; they’re stories told too early.
— TheMurrow Editorial
Confounding: the quiet force behind most misleading headlines
The ice-cream-and-sharks example works because it is clean. Ice cream sales rise in summer. Shark attacks rise in summer. The shared driver is season—more people swim, more people buy cold treats. Without adjusting for season, the data make dessert look dangerous.
Causal inference researchers often describe confounding using directed acyclic graphs (DAGs), which map variables and causal pathways. In that language, confounders create “backdoor paths” between X and Y. Proper adjustment—choosing the right variables to control for—aims to block those backdoors.
But “control for everything” is not a scientific principle. It is a common mistake.
Why “just adjust for more variables” can backfire
Adjusting for a mediator (a variable on the causal path from X to Y) can erase the very effect under study, and conditioning on a collider can introduce bias that was never there. For readers, the implication is bracing: a study can sound sophisticated—packed with controls and regressions—and still answer the wrong question. When a headline celebrates that researchers “accounted for dozens of factors,” treat it as a prompt to ask: Which factors, and why those?
More controls don’t guarantee more truth. They can guarantee a different kind of error.
— TheMurrow Editorial
Reverse causation: when the “effect” drives the “cause”
Health and lifestyle reporting is especially vulnerable. Suppose a study finds: “People who take supplement X have higher rates of condition Y.” A naive reading implies the supplement is harmful. Another plausible story is that people start supplement X after early symptoms appear—meaning Y is driving X.
The key concept here is temporality: did the presumed cause occur before the effect? The BMJ discussion of causal reasoning notes that temporality sits at the center of Bradford Hill–style thinking about causation. Without time order, cause claims float unanchored.
What to look for in the fine print
- Cross-sectional snapshots (measured at one time) are especially vulnerable.
- Longitudinal designs, where X is measured before Y, help establish sequence.
- Language about participants changing behavior “because of” symptoms should trigger caution.
None of this guarantees causality. But it separates “X and Y are connected” from “X precedes Y,” which is the first gate any causal claim must pass.
Reverse causation red flags to scan for
- ✓ Cross-sectional design (one-time snapshot)
- ✓ Timeline ambiguity (unclear what came first)
- ✓ Behavior changes “because of” symptoms
- ✓ Lack of baseline measurement before the outcome
Selection bias and collider bias: how sampling can invent relationships
A collider is a variable influenced by both X and Y—a common effect. When you condition on that collider (often by restricting your sample), you can induce a correlation between X and Y even if none exists in the broader population.
A clean intuition: imagine an elite program that admits students who have either exceptional grades or exceptional extracurriculars. In the full applicant pool, grades and extracurriculars might be unrelated. Among admitted students, they can look negatively correlated—because if you got in with weaker grades, you probably compensated with stronger extracurriculars, and vice versa. The selection process creates the pattern.
Why this matters beyond classrooms
An epidemiologic note in the American Journal of Epidemiology adds an important nuance: selection bias is broader than colliders alone. Bias can arise even when the conditioned variable is not a classic collider, depending on how effects differ across groups and whether causal effects are non-null. Translation: sampling problems can be subtle, and “we adjusted for selection” is not always a simple fix.
Common “gates” that can introduce selection or collider bias
- ✓ Hospital patients only
- ✓ App users or subscribers only
- ✓ Survey responders only
- ✓ People who stayed employed
- ✓ Participants who completed follow-up
- ✓ Anyone selected by “being observed”
Correlation as evidence: what serious causal inference actually does
So how do careful researchers move from association to causal judgment?
A widely discussed approach in epidemiology emphasizes triangulation: multiple methods, datasets, and assumptions that converge on the same conclusion. One study might be confounded; several different designs that point the same way deserve attention. The BMJ piece on causal inference underscores that causality can be inferred from patterns of evidence even without a single decisive experiment.
What “better evidence” tends to look like
- Clear temporality (cause measured before effect)
- Plausible mechanism (a coherent story for how X could affect Y)
- Robustness across analyses (results persist under reasonable alternative models)
- Consistency across settings (different populations and designs show similar patterns)
None of these is a magic stamp. They are scaffolding—ways to keep causal claims from collapsing under scrutiny.
What stronger causal evidence often includes
- ✓ Clear temporality (cause measured before effect)
- ✓ Plausible mechanism (how X could affect Y)
- ✓ Robustness (holds under reasonable alternative analyses)
- ✓ Consistency (similar patterns across settings and designs)
A reader’s field guide to headlines that overreach
Good journalism should answer many of these questions for you. When it doesn’t, readers can still ask them.
Treat correlations as leads, not verdicts. That mindset keeps you open to evidence while resistant to narratives that outrun the data.
Seven questions to ask before you believe the causal story
- Which of the four explanations fits best: X→Y, Y→X, confounding, or an artifact?
- What could be the confounder? Seasonality is a classic example; so are age, income, baseline health—depending on the topic.
- Did X happen before Y? If the timeline is fuzzy, reverse causation is on the table.
- Who is missing from the sample? If you only observe “survivors,” “users,” “patients,” or “responders,” selection bias may shape the result.
- What did they control for—and why? More adjustment isn’t automatically better.
- Is the claim proportional to the evidence? “Linked to” is not “causes,” and “associated with” is not “proven to.”
- Do other studies using other methods agree? One correlation is a spark; converging evidence is a fire.
Headline pressure-test checklist
- ✓ Which of the four explanations fits best (X→Y, Y→X, confounding, artifact)?
- ✓ What plausible confounder could drive both variables?
- ✓ Is temporality clear—did X occur before Y?
- ✓ Who is excluded or missing from the sample?
- ✓ What variables did they control for, and why those?
- ✓ Is the language proportional (“linked” vs “causes”)?
- ✓ Do other methods and studies converge on the same conclusion?
Conclusion: the point isn’t to doubt everything—it’s to doubt well
The slogan survives because the fallacy survives. We are story-making animals staring at spreadsheets. But the best stories—the ones that help us make decisions in medicine, policy, and daily life—earn their endings. They establish time order. They test alternative explanations. They wrestle with confounding and selection. They rely on evidence that converges rather than dazzles.
The next time a chart seems to announce a cause, pause long enough to ask what else could be true. Skepticism, properly used, is not cynicism. It’s intellectual hygiene.
Frequently Asked Questions
What’s the simplest definition of correlation?
Correlation means two variables move together in the observed data—either in the same direction (positive) or opposite directions (negative). It describes a pattern, not a reason. The association can be summarized with different statistics depending on the data, such as correlation coefficients for continuous measures or risk/odds ratios for binary outcomes.
If correlation isn’t causation, is correlation useless?
No. Correlation can be valuable evidence—often the first clue that something important is happening. Many fields infer causality through converging evidence when randomized experiments aren’t feasible. The key is to treat correlation as a starting point: test for confounding, reverse causation, and selection effects before making causal claims.
What is confounding in plain English?
Confounding occurs when a third factor influences both the supposed cause and the supposed effect, creating an association that can look causal. The classic example: ice cream sales and shark attacks rise together because summer increases both swimming and ice cream consumption. Without accounting for the third factor (season), the conclusion is misleading.
How can reverse causation fool a study?
Reverse causation happens when the “effect” actually drives the “cause.” For example, if supplement use is higher among people with a condition, it may be because early symptoms led people to start the supplement—not because the supplement caused the condition. Checking temporality (what came first) is one of the strongest basic defenses.
What is collider bias (Berkson’s paradox)?
Collider bias appears when you restrict analysis to a group selected by a factor influenced by both variables. That conditioning can create an association that doesn’t exist in the full population. The elite-program example shows how, among admitted students, grades and extracurriculars can look negatively correlated even if they aren’t correlated overall.
Doesn’t “controlling for more variables” solve the problem?
No. Adjustment helps only when the added variables are genuine confounders. Controlling for a mediator (a step on the causal path from X to Y) can erase the effect a study is trying to measure, and conditioning on a collider can introduce bias that was never there. The useful question is not how many variables were controlled for, but which ones, and why those.
What kind of evidence supports a causal claim when experiments aren’t possible?
Researchers often look for converging evidence: different methods and datasets pointing toward the same conclusion. Helpful features include clear temporality, results that hold up under alternative analyses, and consistency across settings. No single observational study usually “proves” causation, but a well-supported causal case can still be built over time.