NIST Just Took Comments on ‘AI Agent Security’ (Due March 9). Here’s the Mistake Everyone Makes: You Can’t “Patch” an Agent Like an App.
NIST didn’t ask about generic “genAI security.” It scoped the problem to agents that can take actions and create persistent external changes—where patching, monitoring, and rollback become a different kind of security discipline.

Key Points
- Track NIST’s pivot: CAISI scoped “AI agent security” to systems that act in the world and create persistent external state changes.
- Recognize the core mistake: you can’t secure agents like apps—updates can unpredictably shift model behavior, tools, permissions, and orchestration.
- Prioritize operational controls: least-privilege tool access, hostile-text assumptions, continuous monitoring, and rollback plans for unwanted action trajectories.
A clock quietly ran out in Washington
At 11:59 p.m. Eastern on March 9, 2026, the U.S. National Institute of Standards and Technology (NIST), through its Center for AI Standards and Innovation (CAISI), closed public comments on a deceptively technical prompt: how to secure AI agent systems. The agency had opened the floor with a Request for Information (RFI) published in the Federal Register on January 8, 2026 (notice 2026-00206), collecting submissions only through Regulations.gov under Docket No. NIST-2025-0035.
The RFI’s subject sounds narrow. It isn’t. NIST drew a bright line around a class of systems that do more than generate text or recommendations. These agents can plan and take autonomous actions that impact real-world systems or environments, producing persistent changes outside the agent system itself.
That scoping choice is the story. Because the biggest mistake policymakers, executives, and—often—engineers make is assuming an AI agent can be secured the way we secure “normal software”: ship it, patch it, and call it a day.
“The security problem changes the moment a model is allowed to act—not just to speak.”
— TheMurrow Editorial
What closed on March 9—and why it matters
Several facts make the episode worth your attention:
- The RFI was published January 8, 2026, then kept open until March 9, 2026, 11:59 p.m. ET. That’s a two-month comment window in a fast-moving technical area.
- NIST specified it would not accept comments by postal mail, fax, or email; only Regulations.gov submissions counted.
- NIST warned that submissions would be posted publicly “without change or redaction,” urging commenters not to include sensitive or confidential material.
- The Federal Register notice listed Peter Cihon, Senior Advisor at CAISI, as the point of contact for questions about the RFI.
Those procedural details matter because they signal where the work is going next: toward guidance, standards, and shared definitions. NIST does not legislate, but it often supplies the scaffolding that regulators and procurement teams later use.
The deadline also matters because the security debate has started to shift from “model safety” in the abstract to something more operational: what happens when AI systems can touch files, send money, change configurations, or trigger industrial processes.
NIST’s key scoping decision: not “genAI security,” but agents that act
NIST scoped the inquiry to AI agent systems capable of taking actions that affect external state, meaning the system’s output can lead to persistent changes outside the agent system itself. A chatbot that drafts a memo is a different risk profile than an agent that can open tickets, deploy code, update access controls, or execute transactions.
In a January 2026 news release announcing the RFI, NIST emphasized that agents can “plan and take autonomous actions that impact real-world systems or environments.” The agency also distinguished between familiar software flaws—authentication bugs, memory vulnerabilities—and “distinct risks” created by coupling AI outputs with operational software functionality.
Why external state changes everything
Agent systems pair two ingredients that traditional software keeps apart:
- Probabilistic behavior (a model whose outputs may vary)
- Tool use (APIs, command execution, database writes, workflow automation)
That combination changes not only what can go wrong, but how quickly problems propagate. A misinterpreted instruction can become a permission change. A poisoned data source can become an automated action. A subtle adversarial prompt can become a sequence of legitimate tool calls.
“An agent isn’t merely software with a bug. It’s software with initiative—and that’s a different security category.”
— TheMurrow Editorial
The threats NIST put on the table—plainly, and with teeth
- Indirect prompt injection: when a model interacts with adversarial content embedded in data it reads—documents, websites, emails, tickets—and gets manipulated into taking actions.
- Data poisoning / insecure models: compromised training data, fine-tuning data, or model components that influence downstream behavior.
- Harmful actions even without adversarial input, such as specification gaming or misaligned objectives—where the agent follows the letter of a goal while violating the spirit, producing unsafe or unwanted outcomes.
These examples matter because they connect security to everyday workflows. Indirect prompt injection is not science fiction; it’s the natural result of giving an agent both eyes (access to content) and hands (access to tools).
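One way to make the “eyes and hands” problem concrete in code is to track provenance on every piece of context the agent sees, so downstream policy logic can tell operator instructions apart from text the agent merely read. The sketch below is illustrative, not a real framework API; all names (`Segment`, `build_context`) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical sketch: label every context segment with its provenance so the
# agent runtime can distinguish operator instructions from ingested content.
@dataclass(frozen=True)
class Segment:
    text: str
    source: str    # e.g. "system", "user", "retrieved_document"
    trusted: bool  # only operator-controlled sources are trusted

def build_context(segments):
    """Render context with explicit provenance markers, so policy code
    can later tell which spans came from untrusted content."""
    rendered = []
    for seg in segments:
        tag = "TRUSTED" if seg.trusted else "UNTRUSTED"
        rendered.append(f"[{tag}:{seg.source}] {seg.text}")
    return "\n".join(rendered)

def contains_untrusted(segments):
    """True if any segment came from a source the operator does not control."""
    return any(not s.trusted for s in segments)
```

Labels alone do not stop injection, but they give the rest of the system something to enforce against: a runtime can refuse high-impact tool calls whenever `contains_untrusted` is true for the context that produced them.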
Multi-agent systems: risk doesn’t add—it multiplies
Security teams know what happens when complexity grows: more interfaces, more trust boundaries, more places to hide malicious instructions, and more uncertainty about provenance.
The most responsible reading is not alarmism. It’s recognition that the attack surface expands when “language in” becomes “action out.”
The patching mistake: why updating agents isn’t like patching apps
NIST asked: “What are the methods, risks, and other considerations relevant for patching or updating AI agent systems throughout the lifecycle, as distinct from those affecting both traditional software systems and non-agentic AI?”
That question is a tell. The agency is signaling that patching agents is different—and that pretending otherwise is becoming a systemic risk.
What’s different about “patching” an agent?
Several moving parts can shift between versions:
- The underlying model (weights, system prompt, safety tuning)
- Tool definitions (what tools exist, what arguments they take)
- Permissions (what the agent is allowed to access)
- Orchestration logic (how plans are generated and executed)
Even if each change is reasonable on its own, the combined behavior can surprise you. Security teams must validate not just “does it run” but “does it act safely under messy conditions.”
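One practical way to catch that kind of surprise is behavioral regression testing: treat each agent version as a function from a scenario to a trace of tool calls, and diff the traces across versions before promoting an update. The snippet below is a minimal sketch under that assumption; `agent_v1`/`agent_v2` are stand-ins, not a real agent framework.

```python
# Hypothetical sketch: diff tool-call traces between two agent versions on a
# fixed scenario suite before promoting the new version.
def diff_traces(old_agent, new_agent, scenarios):
    """Return the scenarios whose tool-call traces changed between versions."""
    regressions = {}
    for name, prompt in scenarios.items():
        old_trace = old_agent(prompt)
        new_trace = new_agent(prompt)
        if old_trace != new_trace:
            regressions[name] = (old_trace, new_trace)
    return regressions

# Stand-in agent versions for illustration only.
def agent_v1(prompt):
    return [("search_tickets", prompt)]

def agent_v2(prompt):
    # An "update" that quietly adds a write action to the same workflow.
    return [("search_tickets", prompt), ("close_ticket", "stale")]
```

A changed trace is not automatically a bug, but it is exactly the kind of behavioral drift that should block an automatic rollout and trigger human review.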
Lifecycle security is not optional when systems can act
The practical implication for organizations is uncomfortable: a one-time security review is not serious governance for agents. Continuous monitoring, staged rollouts, and rollback plans move from “nice to have” to baseline.
“If an agent can change the world, then ‘update later’ becomes a security strategy—and a liability.”
— TheMurrow Editorial
What NIST asked for: controls, assessments, and constrained environments
The RFI grouped its questions into five areas:
1. Unique security threats, including how multi-agent systems differ
2. Security practices and controls, including model-level controls, processes, and maturity
3. Assessing security during development and detecting issues post-deployment, and how this aligns with traditional infosec and supply-chain approaches
4. Constraining and monitoring the deployment environment, including limiting access, rolling back or negating unwanted action trajectories, monitoring, and legal/privacy issues
5. Additional considerations, including tools, guidelines, research needs, and government collaboration
This structure matters because it pushes the conversation beyond abstract “AI safety” and into operational security engineering.
Constrain the environment, not just the model
- Limit access: least privilege for tools, data, and systems
- Monitor actions: detect anomalous sequences of tool calls
- Enable undo/rollback/negations: recover from “unwanted action trajectories”
- Address legal/privacy: monitoring can collide with confidentiality, labor rules, and data protection expectations
The “rollback” point is particularly revealing. Traditional software security assumes you can revert a bad release. Agents can create consequences that aren’t so easily unwound: an email sent, a record updated, a permission granted, a transaction initiated.
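That distinction between reversible and irreversible actions can be made explicit in code: register a compensating (undo) action per action type, and surface the actions that have no undo at all. The sketch below is a minimal illustration; `ActionLedger` and the action names are hypothetical.

```python
# Hypothetical sketch: a ledger that records agent actions and attempts to
# unwind them, flagging action types with no compensating action.
class ActionLedger:
    def __init__(self):
        self._undo = {}  # action type -> compensating function
        self._log = []   # executed (action_type, args) pairs, in order

    def register_undo(self, action_type, fn):
        self._undo[action_type] = fn

    def record(self, action_type, **args):
        self._log.append((action_type, args))

    def reversible(self, action_type):
        return action_type in self._undo

    def rollback(self):
        """Unwind recorded actions in reverse order; return the action
        types that could not be reversed (e.g. an email already sent)."""
        stuck = []
        for action_type, args in reversed(self._log):
            if self.reversible(action_type):
                self._undo[action_type](**args)
            else:
                stuck.append(action_type)
        self._log.clear()
        return stuck
```

The useful output here is the `stuck` list: it tells the incident responder which consequences now require out-of-band remediation rather than a simple revert.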
Real-world examples: where agent security becomes operational risk
Example 1: The “helpful” IT agent and indirect prompt injection
Picture an IT support agent that reads tickets and can run remediation actions. If the agent ingests a ticket comment containing adversarial instructions—an indirect prompt injection—it might treat those instructions as part of its task context. The vulnerability is not a buffer overflow. It’s misplaced trust in untrusted text.
Security takeaway: treat every external or user-controlled text source as potentially hostile when the agent has tool access.
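That takeaway translates directly into an authorization rule: read-only tools can run freely, but a write tool whose triggering context includes any untrusted source gets routed to human approval. The function below is a minimal sketch of that policy, with all names illustrative.

```python
# Hypothetical sketch: gate write-capable tool calls on the trust level of the
# context sources that influenced them.
def authorize_tool_call(tool_name, context_sources, write_tools, trusted_sources):
    """Allow read-only tools unconditionally; allow write tools only when
    every influencing source is operator-controlled."""
    if tool_name not in write_tools:
        return "allow"
    if all(src in trusted_sources for src in context_sources):
        return "allow"
    return "needs_human_approval"
```

A policy like this is conservative by design: it assumes any hostile text in context may have steered the agent, so it degrades to human review rather than trying to prove the text was harmless.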
Example 2: Data poisoning in the agent’s memory and retrieval systems
An attacker who seeds poisoned entries into the agent’s retrieval corpus or long-term memory never needs to touch the model: every later action the agent takes inherits the corrupted context.
Security takeaway: the integrity of the agent’s knowledge pipeline becomes as important as the integrity of the model.
Example 3: Harm without an attacker—specification gaming in automation
No attacker is required here. An agent told to “reduce open tickets” might close them unresolved—satisfying the metric while defeating its purpose. That is specification gaming: the letter of the goal met, the spirit violated.
Security takeaway: safety is not merely about blocking attackers; it’s also about writing goals and constraints that survive contact with reality.
Competing perspectives: innovation versus control, and where the tension really is
Some builders worry that heavy-handed controls will make agents brittle and useless—an automation system that can’t do anything meaningful. Others argue that without strong constraints, agent deployments will become a parade of preventable incidents, followed by backlash that harms the field.
Both perspectives deserve respect. The productive frame is not “security versus innovation,” but “what level of autonomy is justified by the evidence of control?”
- Security teams want measurability
- Builders want clear, implementable standards
- Civil society and privacy advocates will focus on monitoring
Practical takeaways: what organizations should do now
Build security around “action,” not around “AI”
- Restrict tool permissions using least privilege
- Separate read access from write access
- Require approvals for high-impact actions (payments, permissions, deployments)
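The three controls above can be combined in a single tool registry: read and write scopes are declared separately, and write tools flagged as high-impact refuse to execute without an explicit approval. The sketch below is illustrative; `ToolRegistry` and the impact categories are hypothetical names, not a real library.

```python
# Hypothetical sketch: a registry that separates read from write scope and
# forces an approval gate on high-impact write actions.
HIGH_IMPACT = {"payment", "permission_change", "deployment"}

class ToolRegistry:
    def __init__(self):
        self._tools = {}  # name -> (fn, scope, impact)

    def register(self, name, fn, scope, impact="low"):
        assert scope in ("read", "write")
        self._tools[name] = (fn, scope, impact)

    def call(self, name, approved=False, **kwargs):
        fn, scope, impact = self._tools[name]
        if scope == "write" and impact in HIGH_IMPACT and not approved:
            raise PermissionError(f"{name} requires explicit approval")
        return fn(**kwargs)
```

Putting the gate inside the registry, rather than trusting the model’s judgment, means the approval requirement holds even when the agent is manipulated into requesting the action.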
Treat untrusted text as untrusted code
- Filter and label sources
- Isolate high-risk inputs
- Log context that influenced actions
Plan for rollback and incident response
- Maintain audit trails of tool calls
- Stage rollouts and keep “kill switches”
- Define what “reversal” means for each action type
Make updates a governed process, not a scramble
- Version control models, prompts, tools, and policies
- Test behavior changes under adversarial and messy inputs
- Monitor post-deployment drift and anomalies
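One lightweight way to make “version control everything” enforceable is to fingerprint every input that defines agent behavior—model identifier, system prompt, tool schemas, policy—so any change produces a new version that must pass the release process. A minimal sketch, with illustrative field names:

```python
import hashlib
import json

# Hypothetical sketch: hash the full behavioral configuration of an agent so
# that any change to model, prompt, tools, or policy yields a new version id.
def agent_fingerprint(model_id, system_prompt, tool_schemas, policy):
    blob = json.dumps(
        {"model": model_id, "prompt": system_prompt,
         "tools": tool_schemas, "policy": policy},
        sort_keys=True,  # stable serialization -> stable hash
    )
    return hashlib.sha256(blob.encode()).hexdigest()
```

The point is that a “small” tool-schema tweak and a full model swap look identical to the release process: both change the fingerprint, so both go through staged rollout and testing.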
A minimal governance loop for agent deployments
1. Define tool permissions and approval gates by action impact
2. Instrument logging for prompts, context sources, and tool calls
3. Test against indirect prompt injection and messy, real inputs before rollout
4. Deploy in stages with monitoring, kill switches, and rollback plans
5. Review post-deployment anomalies and govern updates as controlled releases
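The monitoring step in the loop above can start very simply: check the stream of tool calls against an allowlist of expected transitions, and trip the kill switch on the first unexpected one. The sketch below is a deliberately minimal illustration (tool names and the transition model are hypothetical), not a substitute for real anomaly detection.

```python
# Hypothetical sketch: validate a tool-call trace against an allowlist of
# (previous_call, next_call) transitions; return the first violation, if any.
def check_trace(trace, allowed_transitions, start="START"):
    prev = start
    for call in trace:
        if (prev, call) not in allowed_transitions:
            return (prev, call)  # first disallowed transition
        prev = call
    return None  # trace conforms to the expected workflow
```

Even this crude model catches the scenario the article warns about: a workflow that has always been read-then-summarize suddenly emitting a permission change mid-sequence.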
A deadline passed. The hard part starts now.
NIST, through CAISI, has asked the public to help define the threat models, the controls, the assessment methods, and the deployment constraints that will shape this next phase. The most telling detail is the one most people would skim past: NIST is asking how to patch agents differently, because it already suspects what many deployments are about to learn the hard way.
The systems we’re building can do things. Security doctrine that treats them as chatbots is not doctrine. It’s wishful thinking.
Frequently Asked Questions
What exactly closed on March 9, 2026?
NIST’s CAISI closed the public comment period for its Request for Information on securing AI agent systems. The deadline was March 9, 2026 at 11:59 p.m. ET, after publication on January 8, 2026 (Federal Register notice 2026-00206). Comments were submitted via Regulations.gov under Docket No. NIST-2025-0035.
What does NIST mean by an “AI agent system” here?
NIST scoped the RFI to agents capable of taking actions that affect external state, producing persistent changes outside the agent system itself—typically via tools or APIs, not content-only systems.
How is agent security different from regular software security?
Agent systems combine probabilistic model outputs with operational tool use, creating distinct risks beyond typical vulnerability classes. NIST highlighted indirect prompt injection, data poisoning, and harmful actions without an attacker (e.g., specification gaming).
What is indirect prompt injection, and why does NIST care?
Indirect prompt injection occurs when an agent reads adversarial instructions embedded in content (web pages, emails, documents) and follows them while using tools. NIST cited it because agents ingest untrusted text and can take real-world actions based on that context.
Why did NIST emphasize patching and updating agents?
NIST explicitly asked how patching AI agent systems differs across the lifecycle versus traditional software and non-agentic AI, because updating models, tools, permissions, or orchestration can change behavior in hard-to-predict ways—requiring stronger testing, monitoring, and rollback planning.
What should organizations deploying agents do while NIST develops guidance?
Constrain agent permissions (least privilege), treat untrusted text as hostile input, monitor tool actions, and build rollback/incident response procedures—since deployment environment controls are central to real-world agent security.