If AI ‘Impact Summits’ Don’t Come With Hard Accountability, They’re Just Expensive PR
Between 2023 and 2025, AI summit diplomacy scaled up fast. But without enforceable obligations, independent verification, transparency, and consequences, “impact” becomes branding.

Key Points
- Demand binding enforcement: Declarations and “commitments” mean little without law, independent verification, transparent reporting, and real penalties for failure.
- Track the summit cycle’s gap: Bletchley mainstreamed frontier risk, Seoul added thresholds and evaluations, Paris showed consensus fragility when powers won’t sign.
- Use an adult checklist: Ask what’s binding, who verifies, what’s published, and what happens when safety claims or evaluations fail.
The most revealing moments at global AI summits rarely happen onstage. They happen in the margins—when a government official is asked who will verify a company’s safety claims, and the answer becomes a polite reference to “future work.” Or when a firm praises “credible evaluations,” but won’t say what counts as credible, who gets access, or what happens if a model fails.
Between 2023 and 2025, “impact summits” on AI safety and governance multiplied in scale and ambition. Leaders flew in. Declarations were signed. “Concrete actions” were announced. The vocabulary matured: frontier models, external evaluation, thresholds, safety institutes.
And yet the central question—who is accountable to whom, by what rules, and with what consequences—kept slipping through the cracks.
Summits are not useless. They can set agendas, create shared language, and accelerate real institutions. But without hard accountability—law, independent verification, transparent reporting, and penalties—summit outputs risk becoming a new genre of PR: commitment washing dressed up as diplomacy.
Without enforcement, ‘AI safety’ becomes a photo-op—impressive on stage, weightless in practice.
— TheMurrow Editorial
The accountability problem: when “impact” is a press release
Summit declarations are written with good intent, and that intent matters. A common vocabulary can reduce confusion and raise baseline expectations. Summit statements also signal priorities to regulators, investors, and the public. But intent is not accountability.
A useful working definition separates the two:
- Impact summits: high-profile convenings where actors announce principles, “commitments,” and actions on AI safety, ethics, and societal benefit.
- Hard accountability: enforceable obligations (law or regulation), credible independent verification (audits, evaluations), transparent reporting, and consequences for noncompliance.
Across the major AI summits from 2023–2025, a pattern emerges in the official record: outputs emphasize shared goals more than enforcement mechanisms. The Bletchley Declaration (UK, 2023) is a political agreement—valuable for agenda-setting, not inherently enforceable. The Seoul Ministerial Statement (2024) and industry Frontier AI Safety Commitments add detail, but much remains voluntary. The Paris AI Action Summit (2025) underscored how symbolic these outcomes can become when major powers decline to sign.
Readers should treat summit communications like any other public claim: ask what is binding, what is verifiable, and what happens if the promise is broken.
What accountability looks like in practice
- Clear standards: what must be tested, disclosed, and mitigated.
- Independent evaluation: who can test and under what access conditions.
- Transparent reporting: what results are published, and in what format.
- Consequences: penalties, injunctions, procurement bans, or delayed deployment when thresholds are crossed.
Without those, “commitment” becomes a self-awarded badge.
Bletchley Park (Nov 2023): the moment “frontier risk” went mainstream
The summit’s signature deliverable was the Bletchley Declaration, published by the UK government. The document frames frontier AI risks as serious and calls for international cooperation. It also affirms familiar principles: AI should be safe, human-centric, trustworthy, and responsible. The wording helped set a baseline for the next year of policy debates.
The Chair’s Summary went a step further, noting discussion that voluntary commitments may need legal or regulatory footing. It also referenced the concept of independent evaluation and testing for future model iterations—an acknowledgment that self-attestation is not enough.
Those are real gains: naming the problem, normalizing cross-border coordination, and putting evaluation on the official agenda.
Bletchley made ‘frontier risk’ a shared diplomatic term—but it did not make anyone legally answerable.
— TheMurrow Editorial
What Bletchley did not do
The summit did not create:
- an enforcement body with jurisdiction,
- penalties for noncompliance,
- standardized disclosure requirements,
- a defined process for what happens when a model fails evaluation.
Even the promising language around “independent evaluation” left key governance questions open: who qualifies as independent, what access evaluators receive, what results must be disclosed, and what corrective actions are required.
The summit helped set the agenda. It did not create a referee.
The distinction that tells the story
- Document type matters: a declaration is not a regulation, even when it is signed by dozens of countries.
That distinction—symbolic agreement versus enforceable obligation—runs through every summit that followed.
Seoul (May 2024): more detail, still mostly voluntary
On paper, Seoul reads like an attempt to translate general principles into something closer to a governance program. The Seoul Ministerial Statement promoted accountability, transparency, and credible external evaluations.
Even more notable were the industry-side Frontier AI Safety Commitments. Multiple firms signed onto commitments that include:
- conducting risk assessments,
- setting thresholds for “intolerable” severe risks,
- monitoring proximity to those thresholds,
- using internal and external evaluations.
That is the most concrete set of summit-era promises in the public record.
The central ambiguity: thresholds, according to whom?
But the Seoul commitments (as presented in official summit materials) still leave crucial questions unanswered:
- Are the thresholds public, or internal?
- Are the evaluations truly independent, or selected and scoped by the company?
- How are “severe risks” defined and tested in practice?
- What happens if monitoring shows a model approaching a threshold?
- Who can verify that a firm is interpreting its own commitments in good faith?
The accountability gap here is not malicious; it is structural. Voluntary commitments can elevate standards, but they also allow firms to remain the final judge of their own compliance.
A voluntary ‘threshold’ is only as strong as the consequences that follow crossing it.
— TheMurrow Editorial
A fair counterpoint: voluntary commitments can still move markets
Voluntary pledges can raise baseline practices and signal expectations to investors, customers, and regulators before legislation arrives. The point is not that Seoul was empty. The point is that Seoul still largely asked the public to trust the actors being asked to slow down.
Paris (Feb 2025): the biggest stage, the clearest fragility
Paris was the largest convening of the summit cycle, and that scale matters because summit legitimacy often rests on breadth: the more countries, companies, and NGOs in the room, the more the outcome seems to represent the world.
France also published a Statement on Inclusive and Sustainable AI for People and the Planet (Feb 11, 2025), framing AI governance in social and environmental terms.
The geopolitical story: the declaration not everyone signed
A summit can survive disagreement. It cannot escape the implications of non-signature by major AI powers: in Paris’s case, the US and UK were widely reported to have declined to sign the declaration. Joint statements become easier to dismiss as symbolic, especially when the absent signatories are among the most important actors in the technology’s development and deployment.
Why Paris matters to accountability
1. Scale is not the same as enforceability. “100 concrete actions” sounds decisive, but readers should ask which actions are binding, which are funded, and which include verification.
2. Consensus is fragile. If the biggest players do not sign, declarations risk becoming moral positioning rather than governance.
Paris was not “a failure.” It was a reminder that summit governance competes with national security logic, domestic politics, and economic strategy.
Summits are not pointless—but they reward performative compliance
Summits have set agendas, created shared vocabulary, and helped catalyze institutions such as AI Safety Institutes. Those are not trivial achievements.
But summits also create perverse incentives. The format rewards:
- announcements over implementation,
- principles over penalties,
- photo opportunities over publication of test results.
Political agreements can be useful for agenda-setting, but they can also become commitment theatre—especially when companies can point to a signed pledge as proof of responsibility while disclosing little about real practices.
Case study: “independent evaluation” as a rhetorical shield
Both Bletchley and Seoul invoked independent or external evaluation as a safeguard. Yet summit outputs, as captured in the official documents cited, do not settle the hardest questions:
- independence from whom—financially, legally, operationally?
- access to what—model weights, APIs, system prompts, training data summaries?
- publication of what—methods, scores, failure cases, mitigations?
Without clarity, “independent evaluation” can become a talking point rather than a constraint.
What hard accountability would require—and why it’s politically difficult
Hard accountability would require at least three shifts.
1) From pledges to enforceable obligations
The Bletchley Chair’s Summary hinted at this by noting that voluntary commitments may need legal footing. That is the right direction—and also where the political cost begins.
2) From “credible evaluations” to standardized, auditable regimes
A credible evaluation regime would specify:
- shared protocols,
- minimum disclosure rules,
- evaluator independence protections,
- and audit trails.
Summits have encouraged evaluation; they have not, in these documents, locked in the governance mechanics.
3) From reputational incentives to real consequences
Today, the main cost of breaking a summit pledge is reputational. But consequences must be real enough that crossing a threshold is not just a line in a slide deck.
Practical takeaways: how to read the next summit like an adult
A checklist for spotting “commitment washing”
1. What is binding?
Is the statement political, voluntary, or enforceable under law?
2. Who verifies?
Are there named independent evaluators or institutions with authority and access?
3. What is published?
Are testing methods and results disclosed, or summarized selectively?
4. What happens if it fails?
Are there consequences—legal, financial, or operational—for noncompliance?
When a summit cannot answer these, the outputs may still be useful—but readers should discount them accordingly.
Implications for people who don’t attend summits
- Consumers get safety claims that may or may not be verified.
- Workers face adoption of systems whose risk tradeoffs were negotiated behind closed doors.
- Smaller firms inherit standards set by giants—sometimes without a seat at the table.
- Democracies risk substituting elite convenings for accountable public process.
The test of summit success is not how many leaders attend. The test is whether someone can be held responsible when things go wrong.
The next phase: fewer declarations, more receipts
The summit format also produces a familiar failure mode: confusing agreement with enforcement.
Summits should not stop. But the format must evolve. Governments and companies should treat the next gathering less like a stage and more like a compliance negotiation—one that specifies who audits, what gets disclosed, and what consequences apply.
Otherwise, the world will keep getting “actions” without accountability—and the public will keep being asked to trust systems that even their builders struggle to fully explain.
Summits should not stop. But the format must evolve—less stagecraft, more compliance negotiation with audits, disclosure, and consequences.
— TheMurrow Editorial
Frequently Asked Questions
What is the Bletchley Declaration, and why does it matter?
The Bletchley Declaration (Nov 2023) is a political statement from countries attending the UK’s AI Safety Summit. It matters because it helped establish shared recognition that frontier AI risks are serious and require cooperation. It does not, however, create enforceable obligations or penalties, so its effect is mainly agenda-setting rather than regulatory.
Did the Bletchley summit create enforceable AI safety rules?
No. The summit produced the Bletchley Declaration and a Chair’s Summary, both of which are not laws. The Chair’s Summary noted that voluntary commitments may eventually need legal or regulatory footing, but the summit outputs did not specify enforcement bodies, standardized disclosures, or consequences for failing safety expectations.
What did the Seoul summit add that Bletchley didn’t?
The AI Seoul Summit (May 2024) added more operational language. The Seoul Ministerial Statement promoted accountability, transparency, and credible external evaluations. Industry participants also signed Frontier AI Safety Commitments including risk assessments and “thresholds” for intolerable severe risks. Many elements remained voluntary and dependent on firms’ interpretations.
What are “Frontier AI Safety Commitments” and are they binding?
They are voluntary commitments signed by multiple firms at the Seoul summit. They include risk assessment, monitoring against risk thresholds, and the use of internal and external evaluations. They are not binding in the way a law is binding, and key details—such as whether thresholds are public and how independence is ensured—are not fully specified in summit materials.
What happened at the Paris AI Action Summit with the US and UK?
Reporting widely noted that the US and UK declined to sign the Paris summit declaration/communiqué. Publicly cited reasons included concerns about practical clarity on governance and national security (UK) and US skepticism toward certain governance framings. The non-signature matters because it weakens the perception of global consensus among major AI powers.
Are AI summits still useful if they don’t create enforcement?
Yes, within limits. Summits can establish shared vocabulary, normalize practices like evaluation, and catalyze institutions such as AI Safety Institutes. The risk is that, without independent verification and consequences for noncompliance, summits can become vehicles for reputational gains rather than measurable safety improvements.
