Why “Correlation Isn’t Causation” (and How to Tell the Difference in Everyday Claims)
Two lines moving together can be a clue—or a trap. Use this toolkit to test causal stories for confounding, reverse causation, and bias.

Key Points
- 1. Define the claim precisely: correlation means X and Y moved together, while causation means changing X would change Y, all else equal.
- 2. Test the four explanations—X→Y, Y→X, confounding, or selection/measurement artifacts—before accepting any causal headline at face value.
- 3. Demand stronger support: clear temporality, plausible mechanism, robustness, and converging evidence across methods beat a single impressive-looking correlation.
A headline, a chart, and the story our brains want to tell
The joke is older than most of our news cycles, but it endures because it captures a stubborn habit: we see two things moving together and instinctively reach for a story about why. The human mind prefers causes to coincidences, narratives to noise. Statistics, meanwhile, does not care what we prefer.
“Correlation isn’t causation” has become a slogan precisely because it is so often needed. But the phrase is also misused—wielded as a conversation-stopper whenever evidence is inconvenient. Used well, it is neither a sneer nor a shrug. It’s a prompt to ask a sharper question: What would happen to Y if we changed X—while holding everything else constant?
Correlation isn’t a verdict. It’s a clue that demands interrogation.
— TheMurrow Editorial
Correlation vs. causation: what the slogan actually means
Correlation means that X and Y move together in the observed data: a pattern, not a reason. Causation is a different claim: changing X would change Y (at least on average), “all else equal.” Modern causal inference often frames that as an intervention or counterfactual: What would Y have been if X were different? The Rubin causal model formalizes this intuition in the language of potential outcomes—one outcome if a person receives the exposure (or treatment), another if they don’t. Only one of those outcomes is ever observed, which is why causal claims are hard work rather than casual commentary.
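The potential-outcomes idea fits in a short sketch. All numbers below are invented for illustration: each simulated unit carries both outcomes on paper, random assignment reveals only one, and the treated-versus-control gap recovers the built-in effect.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical population in potential-outcomes terms: every unit carries
# TWO outcomes, y0 (if untreated) and y1 (if treated), but only one is
# ever observed in real data.
units = []
for _ in range(10_000):
    y0 = random.gauss(10, 2)
    y1 = y0 + 3  # the true average causal effect is +3 by construction
    units.append((y0, y1))

# Random assignment reveals exactly one potential outcome per unit...
observed = [(random.random() < 0.5, y0, y1) for y0, y1 in units]

# ...and the treated-vs-control difference then recovers the true effect.
treated_mean = mean(y1 for t, y0, y1 in observed if t)
control_mean = mean(y0 for t, y0, y1 in observed if not t)
effect_estimate = treated_mean - control_mean
print(round(effect_estimate, 2))  # close to the built-in effect of 3
```

The point of the sketch is the bookkeeping: the "missing" outcome exists in the simulation but never in reality, which is exactly why randomization (or a careful substitute for it) is needed.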
Wikipedia’s entry on “correlation does not imply causation” traces the slogan’s persistence to a familiar logical error: the questionable-cause fallacy, where co-occurrence is treated as proof of a causal chain. Newsrooms see it every week, especially in health, lifestyle, education, and business reporting—domains where randomized experiments are difficult, expensive, or ethically impossible.
Yet the most important nuance is the one the slogan leaves out. “Not proof of causation” does not mean “evidence is meaningless.” A widely cited epidemiology discussion in BMJ argues that causal conclusions often rest on converging evidence—multiple imperfect studies pointing in the same direction—when randomized trials can’t be done. Correlations can be valuable signals. The discipline is learning what they can, and cannot, support.
The four reasons a correlation appears (and the one people forget)
1) X causes Y
2) Y causes X (reverse causation)
3) A third factor causes both X and Y (confounding)
4) Artifacts of selection, measurement, or aggregation create or distort the relationship (biases and paradoxes)
People tend to jump straight to the first explanation because it’s narratively satisfying. A cause gives you agency: avoid the ice cream, prevent the shark attack. Reality is less obliging.
The fourth category—selection and measurement artifacts—often surprises readers because it feels like cheating. How can the data show a relationship that isn’t “real”? The uncomfortable answer: the relationship can be manufactured by who gets counted, how variables are measured, and which slices of the population you’re looking at. A correlation can be statistically “real” and still misleading as a guide to cause.
The diagnostic payoff is straightforward and powerful. This four-part checklist does something crucial: it slows you down. It forces a second question after “Is it correlated?”—namely, “Which of the four explanations is most plausible, and what evidence would separate them?” That shift—from verdict to diagnosis—is the heart of statistical literacy.
Most bad causal claims aren’t lies; they’re stories told too early.
— TheMurrow Editorial
Confounding: the quiet force behind most misleading headlines
The ice-cream-and-sharks example works because it is clean. Ice cream sales rise in summer. Shark attacks rise in summer. The shared driver is season—more people swim, more people buy cold treats. Without adjusting for season, the data make dessert look dangerous.
Causal inference researchers often describe confounding using directed acyclic graphs (DAGs), which map variables and causal pathways. In that language, confounders create “backdoor paths” between X and Y. Proper adjustment—choosing the right variables to control for—aims to block those backdoors.
But “control for everything” is not a scientific principle. It is a common mistake.
Why “just adjust for more variables” can backfire
Adjusting for a mediator (a variable on the causal path from X to Y) can erase the very effect under study, and conditioning on a collider can introduce bias that was never there. For readers, the implication is bracing: a study can sound sophisticated—packed with controls and regressions—and still answer the wrong question. When a headline celebrates that researchers “accounted for dozens of factors,” treat it as a prompt to ask: Which factors, and why those?
More controls don’t guarantee more truth. They can guarantee a different kind of error.
— TheMurrow Editorial
Reverse causation: when the “effect” drives the “cause”
Health and lifestyle reporting is especially vulnerable. Suppose a study finds: “People who take supplement X have higher rates of condition Y.” A naive reading implies the supplement is harmful. Another plausible story is that people start supplement X after early symptoms appear—meaning Y is driving X.
The key concept here is temporality: did the presumed cause occur before the effect? The BMJ discussion of causal reasoning notes that temporality sits at the center of Bradford Hill–style thinking about causation. Without time order, cause claims float unanchored.
What to look for in the fine print
- Cross-sectional snapshots (measured at one time) are especially vulnerable.
- Longitudinal designs, where X is measured before Y, help establish sequence.
- Language about participants changing behavior “because of” symptoms should trigger caution.
None of this guarantees causality. But it separates “X and Y are connected” from “X precedes Y,” which is the first gate any causal claim must pass.
Reverse causation red flags to scan for
- ✓ Cross-sectional design (one-time snapshot)
- ✓ Timeline ambiguity (unclear what came first)
- ✓ Behavior changes “because of” symptoms
- ✓ Lack of baseline measurement before the outcome
Selection bias and collider bias: how sampling can invent relationships
A collider is a variable influenced by both X and Y—a common effect. When you condition on that collider (often by restricting your sample), you can induce a correlation between X and Y even if none exists in the broader population.
A clean intuition: imagine an elite program that admits students who have either exceptional grades or exceptional extracurriculars. In the full applicant pool, grades and extracurriculars might be unrelated. Among admitted students, they can look negatively correlated—because if you got in with weaker grades, you probably compensated with stronger extracurriculars, and vice versa. The selection process creates the pattern.
Why this matters beyond classrooms
An epidemiologic note in the American Journal of Epidemiology adds an important nuance: selection bias is broader than colliders alone. Bias can arise even when the conditioned variable is not a classic collider, depending on how effects differ across groups and whether causal effects are non-null. Translation: sampling problems can be subtle, and “we adjusted for selection” is not always a simple fix.
Common “gates” that can introduce selection or collider bias
- ✓ Hospital patients only
- ✓ App users or subscribers only
- ✓ Survey responders only
- ✓ People who stayed employed
- ✓ Participants who completed follow-up
- ✓ Anyone selected by “being observed”
Correlation as evidence: what serious causal inference actually does
So how do careful researchers move from association to causal judgment?
A widely discussed approach in epidemiology emphasizes triangulation: multiple methods, datasets, and assumptions that converge on the same conclusion. One study might be confounded; several different designs that point the same way deserve attention. The BMJ piece on causal inference underscores that causality can be inferred from patterns of evidence even without a single decisive experiment.
What “better evidence” tends to look like
- Clear temporality (cause measured before effect)
- Plausible mechanism (a coherent story for how X could affect Y)
- Robustness across analyses (results persist under reasonable alternative models)
- Consistency across settings (different populations and designs show similar patterns)
None of these is a magic stamp. They are scaffolding—ways to keep causal claims from collapsing under scrutiny.
What stronger causal evidence often includes
- ✓ Clear temporality (cause measured before effect)
- ✓ Plausible mechanism (how X could affect Y)
- ✓ Robustness (holds under reasonable alternative analyses)
- ✓ Consistency (similar patterns across settings and designs)
A reader’s field guide to headlines that overreach
Good journalism should answer many of these questions for you. When it doesn’t, readers can still ask them.
Treat correlations as leads, not verdicts. That mindset keeps you open to evidence while resistant to narratives that outrun the data.
Seven questions to ask before you believe the causal story
- Which of the four explanations fits best: X→Y, Y→X, confounding, or an artifact?
- What could be the confounder? Seasonality is a classic example; so are age, income, baseline health—depending on the topic.
- Did X happen before Y? If the timeline is fuzzy, reverse causation is on the table.
- Who is missing from the sample? If you only observe “survivors,” “users,” “patients,” or “responders,” selection bias may shape the result.
- What did they control for—and why? More adjustment isn’t automatically better.
- Is the claim proportional to the evidence? “Linked to” is not “causes,” and “associated with” is not “proven to.”
- Do other studies using other methods agree? One correlation is a spark; converging evidence is a fire.
Headline pressure-test checklist
- ✓ Which of the four explanations fits best (X→Y, Y→X, confounding, artifact)?
- ✓ What plausible confounder could drive both variables?
- ✓ Is temporality clear—did X occur before Y?
- ✓ Who is excluded or missing from the sample?
- ✓ What variables did they control for, and why those?
- ✓ Is the language proportional (“linked” vs “causes”)?
- ✓ Do other methods and studies converge on the same conclusion?
Conclusion: the point isn’t to doubt everything—it’s to doubt well
The slogan survives because the fallacy survives. We are story-making animals staring at spreadsheets. But the best stories—the ones that help us make decisions in medicine, policy, and daily life—earn their endings. They establish time order. They test alternative explanations. They wrestle with confounding and selection. They rely on evidence that converges rather than dazzles.
The next time a chart seems to announce a cause, pause long enough to ask what else could be true. Skepticism, properly used, is not cynicism. It’s intellectual hygiene.
Frequently Asked Questions
What’s the simplest definition of correlation?
Correlation means two variables move together in the observed data—either in the same direction (positive) or opposite directions (negative). It describes a pattern, not a reason. The association can be summarized with different statistics depending on the data, such as correlation coefficients for continuous measures or risk/odds ratios for binary outcomes.
If correlation isn’t causation, is correlation useless?
No. Correlation can be valuable evidence—often the first clue that something important is happening. Many fields infer causality through converging evidence when randomized experiments aren’t feasible. The key is to treat correlation as a starting point: test for confounding, reverse causation, and selection effects before making causal claims.
What is confounding in plain English?
Confounding occurs when a third factor influences both the supposed cause and the supposed effect, creating an association that can look causal. The classic example: ice cream sales and shark attacks rise together because summer increases both swimming and ice cream consumption. Without accounting for the third factor (season), the conclusion is misleading.
How can reverse causation fool a study?
Reverse causation happens when the “effect” actually drives the “cause.” For example, if supplement use is higher among people with a condition, it may be because early symptoms led people to start the supplement—not because the supplement caused the condition. Checking temporality (what came first) is one of the strongest basic defenses.
What is collider bias (Berkson’s paradox)?
Collider bias appears when you restrict analysis to a group selected by a factor influenced by both variables. That conditioning can create an association that doesn’t exist in the full population. The elite-program example shows how, among admitted students, grades and extracurriculars can look negatively correlated even if they aren’t correlated overall.
Doesn’t “controlling for more variables” solve the problem?
No. Adjustment helps only when the added variables are genuine confounders. Controlling for a mediator (a step on the causal path from X to Y) can erase the effect a study is trying to measure, and conditioning on a collider can introduce bias that was never there. The useful question is not how many variables were controlled for, but which ones, and why those.
What kind of evidence supports a causal claim when experiments aren’t possible?
Researchers often look for converging evidence: different methods and datasets pointing toward the same conclusion. Helpful features include clear temporality, results that hold up under alternative analyses, and consistency across settings. No single observational study usually “proves” causation, but a well-supported causal case can still be built over time.