TheMurrow

NIST Just Took Comments on ‘AI Agent Security’ (Due March 9). Here’s the Mistake Everyone Makes: You Can’t “Patch” an Agent Like an App.

NIST didn’t ask about generic “genAI security.” It scoped the problem to agents that can take actions and create persistent external changes—where patching, monitoring, and rollback become a different kind of security discipline.

By TheMurrow Editorial
March 10, 2026
NIST Just Took Comments on ‘AI Agent Security’ (Due March 9). Here’s the Mistake Everyone Makes: You Can’t “Patch” an Agent Like an App.

Key Points

  • 1Track NIST’s pivot: CAISI scoped “AI agent security” to systems that act in the world and create persistent external state changes.
  • 2Recognize the core mistake: you can’t secure agents like apps—updates can unpredictably shift model behavior, tools, permissions, and orchestration.
  • 3Prioritize operational controls: least-privilege tool access, hostile-text assumptions, continuous monitoring, and rollback plans for unwanted action trajectories.

A clock quietly ran out in Washington

On March 9, 2026, a clock quietly ran out in Washington—and almost nobody outside policy and security circles noticed.

By 11:59 p.m. Eastern, the U.S. National Institute of Standards and Technology (NIST), through its Center for AI Standards and Innovation (CAISI), closed public comments on a deceptively technical prompt: how to secure AI agent systems. The agency had opened the floor with a Request for Information (RFI) published in the Federal Register on January 8, 2026 (notice 2026-00206), collecting submissions only through Regulations.gov under Docket No. NIST-2025-0035.

The RFI’s subject sounds narrow. It isn’t. NIST drew a bright line around a class of systems that do more than generate text or recommendations. These agents can plan and take autonomous actions that impact real-world systems or environments, producing persistent changes outside the agent system itself.

That scoping choice is the story. Because the biggest mistake policymakers, executives, and—often—engineers make is assuming an AI agent can be secured the way we secure “normal software”: ship it, patch it, and call it a day.

“The security problem changes the moment a model is allowed to act—not just to speak.”

— TheMurrow Editorial
March 9, 2026
CAISI’s RFI comment window closed at 11:59 p.m. ET, marking the end of NIST’s public listening period on AI agent security.

What closed on March 9—and why it matters

The March 9 deadline was not the end of a rulemaking. It was the end of a listening period. CAISI’s RFI asked the public—developers, security teams, academics, civil society, and any interested party—to describe the threats, controls, and assessment methods needed to secure AI agents.

Several facts make the episode worth your attention:

- The RFI was published January 8, 2026, then kept open until March 9, 2026, 11:59 p.m. ET. That’s a two-month comment window in a fast-moving technical area.
- NIST specified it would not accept comments by postal mail, fax, or email; only Regulations.gov submissions counted.
- NIST warned that submissions would be posted publicly “without change or redaction,” urging commenters not to include sensitive or confidential material.
- The Federal Register notice listed Peter Cihon, Senior Advisor at CAISI, as the point of contact for questions about the RFI.

Those procedural details matter because they signal where the work is going next: toward guidance, standards, and shared definitions. NIST does not legislate, but it often supplies the scaffolding that regulators and procurement teams later use.

The deadline also matters because the security debate has started to shift from “model safety” in the abstract to something more operational: what happens when AI systems can touch files, send money, change configurations, or trigger industrial processes.
2 months
NIST’s comment window ran from January 8, 2026 to March 9, 2026—a short runway for a fast-moving technical problem.

Editor’s Note

NIST doesn’t legislate, but it often defines the vocabulary and testing expectations regulators and procurement teams later adopt.

NIST’s key scoping decision: not “genAI security,” but agents that act

The RFI did something unusually clarifying. It refused to treat “AI” as one bucket.

NIST scoped the inquiry to AI agent systems capable of taking actions that affect external state, meaning the system’s output can lead to persistent changes outside the agent system itself. A chatbot that drafts a memo is a different risk profile than an agent that can open tickets, deploy code, update access controls, or execute transactions.

In a January 2026 news release announcing the RFI, NIST emphasized that agents can “plan and take autonomous actions that impact real-world systems or environments.” The agency also distinguished between familiar software flaws—authentication bugs, memory vulnerabilities—and “distinct risks” created by coupling AI outputs with operational software functionality.

Why external state changes everything

Traditional software security often assumes determinism: given the same inputs, the code behaves predictably. An agent system blends two worlds:

- Probabilistic behavior (a model that may vary outputs)
- Tool use (APIs, command execution, database writes, workflow automation)

That combination changes not only what can go wrong, but how quickly problems propagate. A misinterpreted instruction can become a permission change. A poisoned data source can become an automated action. A subtle adversarial prompt can become a sequence of legitimate tool calls.

“An agent isn’t merely software with a bug. It’s software with initiative—and that’s a different security category.”

— TheMurrow Editorial

The threats NIST put on the table—plainly, and with teeth

NIST did not bury the lede in theoretical language. It offered concrete examples of “distinct risks,” including:

- Indirect prompt injection: when a model interacts with adversarial content embedded in data it reads—documents, websites, emails, tickets—and gets manipulated into taking actions.
- Data poisoning / insecure models: compromised training data, fine-tuning data, or model components that influence downstream behavior.
- Harmful actions even without adversarial input, such as specification gaming or misaligned objectives—where the agent follows the letter of a goal while violating the spirit, producing unsafe or unwanted outcomes.

These examples matter because they connect security to everyday workflows. Indirect prompt injection is not science fiction; it’s the natural result of giving an agent both eyes (access to content) and hands (access to tools).

Multi-agent systems: risk doesn’t add—it multiplies

NIST explicitly asked about “unique security threats,” including differences in multi-agent systems. That’s a subtle but important nod to how agents may be deployed: not one assistant, but a small bureaucracy of specialized agents—planner, researcher, executor—passing outputs to each other.

Security teams know what happens when complexity grows: more interfaces, more trust boundaries, more places to hide malicious instructions, and more uncertainty about provenance.

The most responsible reading is not alarmism. It’s recognition that the attack surface expands when “language in” becomes “action out.”
Docket No. NIST-2025-0035
NIST required submissions through Regulations.gov under this docket—no postal mail, fax, or email—and warned comments would be posted publicly without redaction.

The patching mistake: why updating agents isn’t like patching apps

The single most revealing line in the RFI reads like an engineer clearing their throat before saying something awkward:

NIST asked: “What are the methods, risks, and other considerations relevant for patching or updating AI agent systems throughout the lifecycle, as distinct from those affecting both traditional software systems and non-agentic AI?”

That question is a tell. The agency is signaling that patching agents is different—and that pretending otherwise is becoming a systemic risk.

What’s different about “patching” an agent?

A typical software patch changes code paths in a fairly bounded way. Agent updates can change the “policy” that drives tool use and decisions, sometimes in hard-to-predict ways.

Several moving parts can shift between versions:

- The underlying model (weights, system prompt, safety tuning)
- Tool definitions (what tools exist, what arguments they take)
- Permissions (what the agent is allowed to access)
- Orchestration logic (how plans are generated and executed)

Even if each change is reasonable on its own, the combined behavior can surprise you. Security teams must validate not just “does it run” but “does it act safely under messy conditions.”

Lifecycle security is not optional when systems can act

NIST’s emphasis on lifecycle patching is also a signal about time. Agent systems will not be “installed” and left alone. They will be continuously updated, continuously connected, and continuously exposed to new data sources—some of them hostile.

The practical implication for organizations is uncomfortable: a one-time security review is not serious governance for agents. Continuous monitoring, staged rollouts, and rollback plans move from “nice to have” to baseline.

“If an agent can change the world, then ‘update later’ becomes a security strategy—and a liability.”

— TheMurrow Editorial

Key Insight

NIST is effectively warning that “patch Tuesday” thinking doesn’t map cleanly to autonomous tool-using systems. Behavior changes are the risk surface.

What NIST asked for: controls, assessments, and constrained environments

The RFI organized questions in a way that reads like a blueprint for future guidance. NIST asked about:

1. Unique security threats, including multi-agent differences
2. Security practices/controls, including model-level controls, processes, maturity
3. Assessing security, during development and post-deployment detection; how it aligns with traditional infosec and supply chain approaches
4. Constraining/monitoring the deployment environment, including limiting access, rollback/negation of unwanted action trajectories, monitoring, and legal/privacy issues
5. Additional considerations, including tools/guidelines/research needs and government collaboration

This structure matters because it pushes the conversation beyond abstract “AI safety” and into operational security engineering.

Constrain the environment, not just the model

One of the most pragmatic ideas in the RFI is that secure agent deployment depends on shaping the environment around the agent:

- Limit access: least privilege for tools, data, and systems
- Monitor actions: detect anomalous sequences of tool calls
- Enable undo/rollback/negations: recover from “unwanted action trajectories”
- Address legal/privacy: monitoring can collide with confidentiality, labor rules, and data protection expectations

The “rollback” point is particularly revealing. Traditional software security assumes you can revert a bad release. Agents can create consequences that aren’t so easily unwound: an email sent, a record updated, a permission granted, a transaction initiated.

Deployment environment controls NIST is pointing toward

  • Limit access with least privilege for tools, data, and systems
  • Monitor actions to spot anomalous tool-call sequences
  • Enable undo/rollback/negations for unwanted action trajectories
  • Address legal/privacy constraints created by monitoring

Real-world examples: where agent security becomes operational risk

NIST’s framing—agents that affect external state—helps readers map risk to familiar domains. Without inventing new incidents, we can still describe realistic, already-legible scenarios that follow directly from the threat models NIST named.

Example 1: The “helpful” IT agent and indirect prompt injection

Consider an enterprise agent that reads internal tickets and knowledge-base pages, then executes routine actions: provisioning accounts, rotating credentials, updating configurations.

If the agent ingests a ticket comment containing adversarial instructions—an indirect prompt injection—it might treat those instructions as part of its task context. The vulnerability is not a buffer overflow. It’s misplaced trust in untrusted text.

Security takeaway: treat every external or user-controlled text source as potentially hostile when the agent has tool access.

Example 2: Data poisoning in the agent’s memory and retrieval systems

Agents often rely on stored information: logs, documentation, “memories,” retrieval databases, or fine-tuning datasets. NIST’s mention of data poisoning points to a hard truth: if the agent learns from compromised inputs, the compromise can persist.

Security takeaway: the integrity of the agent’s knowledge pipeline becomes as important as the integrity of the model.

Example 3: Harm without an attacker—specification gaming in automation

NIST also highlighted harmful actions without adversarial input, such as specification gaming. A simple goal like “reduce cloud spend” can produce reckless actions if constraints are vague: shutting down critical services, downgrading redundancy, or delaying security patches.

Security takeaway: safety is not merely about blocking attackers; it’s also about writing goals and constraints that survive contact with reality.

Competing perspectives: innovation versus control, and where the tension really is

The RFI implicitly invites a debate. How much constraint is too much?

Some builders worry that heavy-handed controls will make agents brittle and useless—an automation system that can’t do anything meaningful. Others argue that without strong constraints, agent deployments will become a parade of preventable incidents, followed by backlash that harms the field.

Both perspectives deserve respect. The productive frame is not “security versus innovation,” but “what level of autonomy is justified by the evidence of control?”

Security teams want measurability

NIST asked about assessing security during development and post-deployment. That’s a call for measurable practices: testing, auditing, monitoring, and response processes that resemble—but may not match—traditional infosec.

Builders want clear, implementable standards

Ambiguous standards invite compliance theater. The most useful NIST outcomes tend to be the ones that translate into engineering checklists and procurement requirements: what to log, how to sandbox, how to structure permissions, what incident response looks like when the “user” is an autonomous workflow.

Civil society and privacy advocates will focus on monitoring

NIST explicitly raised legal/privacy issues related to monitoring. That is not a footnote. Monitoring agents can require capturing content, prompts, tool calls, and user data. The governance challenge becomes dual-use: visibility helps security, but visibility can also become surveillance.
11:59 p.m. ET
NIST set a precise cutoff time for comments—underscoring that this was a formal input process feeding future guidance and standards work.

Practical takeaways: what organizations should do now

NIST’s RFI is a request, not a mandate. Still, it offers a clear map of what mature agent security will look like. Leaders deploying agent systems—especially in regulated or critical settings—can act without waiting for final guidance.

Build security around “action,” not around “AI”

Focus controls on what the agent can do to the outside world:

- Restrict tool permissions using least privilege
- Separate read access from write access
- Require approvals for high-impact actions (payments, permissions, deployments)

Treat untrusted text as untrusted code

If the agent reads emails, web pages, tickets, or documents, assume adversarial content will appear. Design for it:

- Filter and label sources
- Isolate high-risk inputs
- Log context that influenced actions

Plan for rollback and incident response

NIST’s mention of “undo/rollback/negations” is a strong hint: your deployment is not serious if you can’t recover.

- Maintain audit trails of tool calls
- Stage rollouts and keep “kill switches”
- Define what “reversal” means for each action type

Make updates a governed process, not a scramble

Because NIST explicitly raised lifecycle patching, treat agent updates like safety-critical releases:

- Version control models, prompts, tools, and policies
- Test behavior changes under adversarial and messy inputs
- Monitor post-deployment drift and anomalies

A minimal governance loop for agent deployments

  1. 1.Define tool permissions and approval gates by action impact
  2. 2.Instrument logging for prompts, context sources, and tool calls
  3. 3.Test against indirect prompt injection and messy, real inputs before rollout
  4. 4.Deploy in stages with monitoring, kill switches, and rollback plans
  5. 5.Review post-deployment anomalies and govern updates as controlled releases

A deadline passed. The hard part starts now.

The comment window that ended on March 9, 2026 is the kind of bureaucratic moment that rarely feels historic in real time. Yet the RFI’s underlying premise is a pivot point: securing AI agents is not the same job as securing software, and it’s not the same job as securing non-agentic AI.

NIST, through CAISI, has asked the public to help define the threat models, the controls, the assessment methods, and the deployment constraints that will shape this next phase. The most telling detail is the one most people would skim past: NIST is asking how to patch agents differently, because it already suspects what many deployments are about to learn the hard way.

The systems we’re building can do things. Security doctrine that treats them as chatbots is not doctrine. It’s wishful thinking.
T
About the Author
TheMurrow Editorial is a writer for TheMurrow covering explainers.

Frequently Asked Questions

What exactly closed on March 9, 2026?

NIST’s CAISI closed the public comment period for its Request for Information on securing AI agent systems. The deadline was March 9, 2026 at 11:59 p.m. ET, after publication on January 8, 2026 (Federal Register notice 2026-00206). Comments were submitted via Regulations.gov under Docket No. NIST-2025-0035.

What does NIST mean by an “AI agent system” here?

NIST scoped the RFI to agents capable of taking actions that affect external state, producing persistent changes outside the agent system itself—typically via tools or APIs, not content-only systems.

How is agent security different from regular software security?

Agent systems combine probabilistic model outputs with operational tool use, creating distinct risks beyond typical vulnerability classes. NIST highlighted indirect prompt injection, data poisoning, and harmful actions without an attacker (e.g., specification gaming).

What is indirect prompt injection, and why does NIST care?

Indirect prompt injection occurs when an agent reads adversarial instructions embedded in content (web pages, emails, documents) and follows them while using tools. NIST cited it because agents ingest untrusted text and can take real-world actions based on that context.

Why did NIST emphasize patching and updating agents?

NIST explicitly asked how patching AI agent systems differs across the lifecycle versus traditional software and non-agentic AI, because updating models, tools, permissions, or orchestration can change behavior in hard-to-predict ways—requiring stronger testing, monitoring, and rollback planning.

What should organizations deploying agents do while NIST develops guidance?

Constrain agent permissions (least privilege), treat untrusted text as hostile input, monitor tool actions, and build rollback/incident response procedures—since deployment environment controls are central to real-world agent security.

More in Explainers

You Might Also Like