Maronext Knowledge Hub
4 min read | Part 3/5

AI Agent Risks and Automation Security

When AI agents have access to tools and can perform real actions, security risks multiply far beyond simple chatbot vulnerabilities.

What Is an AI Agent?

An AI agent is an LLM that doesn’t just answer questions — it has access to tools and can perform actions. It reads emails, writes to databases, calls APIs, sends messages, manages files.

The difference from a chatbot: a chatbot responds with text. An agent acts.

And that’s where security risks multiply.


Why Are Agents Riskier Than Chatbots?

Chatbot without tools:

  • Attacker can extract information from context
  • Can bypass response rules
  • Impact: information leak, reputational damage

Agent with tools:

  • Attacker can trigger real actions in systems
  • Can exfiltrate data via external calls
  • Can modify, delete, or create records
  • Impact: direct damage to systems, data, finances

Main Risk Categories

1. Excessive Permissions

The agent has access to more tools and data than it needs for its task.

Example: A customer support agent has full CRM access including delete and edit capabilities — when it only needs read access.

Impact: Any successful attack has a much larger blast radius.

Mitigation: Principle of least privilege. Each agent gets only the minimum set of tools and permissions for its specific task.
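
Least privilege can be enforced in code with a per-agent allow-list checked before any tool runs, so the model never decides its own permissions. A minimal sketch (the role names, tool names, and `call_tool` helper are illustrative, not from any specific framework):

```python
# Sketch: per-agent tool allow-lists enforced in code, not in the prompt.
# Role and tool names are illustrative.

TOOL_ALLOWLIST = {
    "support_agent": {"crm_read"},                   # read-only: least privilege
    "billing_agent": {"crm_read", "invoice_create"},
}

# Dummy tool implementations for the sketch
TOOLS = {
    "crm_read": lambda customer_id: {"id": customer_id, "status": "active"},
    "crm_delete": lambda customer_id: f"deleted {customer_id}",
    "invoice_create": lambda amount: {"invoice": amount},
}

def call_tool(agent_role: str, tool_name: str, *args):
    """Reject any tool call outside the agent's allow-list before it runs."""
    allowed = TOOL_ALLOWLIST.get(agent_role, set())
    if tool_name not in allowed:
        raise PermissionError(f"{agent_role} may not call {tool_name}")
    return TOOLS[tool_name](*args)
```

The key design choice: the check happens in the dispatch layer, so even a fully compromised prompt cannot reach a tool that was never granted.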


2. Uncontrolled Chaining

The agent performs a series of steps autonomously. An error or manipulation in one step propagates to the next.

Example: Agent receives “process this order.” It reads data from an email (containing an injection payload) → creates a CRM record with manipulated content → sends a confirmation to the customer with altered text → updates inventory.

Impact: Cascading failure across multiple systems. Difficult forensic analysis — where exactly did the problem start?

Mitigation: Checkpoint mechanisms. Critical actions require confirmation. Log every step.
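
A checkpoint wrapper can sit between the agent and its tools: every step is logged, and steps classified as critical block unless a human approves. A minimal sketch (the action names and `confirm` callback are illustrative assumptions):

```python
# Sketch: critical actions pause for human confirmation; every step is logged.
# Action names and the confirm callback are illustrative.

CRITICAL_ACTIONS = {"send_email", "delete_record", "update_inventory"}

audit_log = []

def run_step(action: str, payload: dict, confirm=lambda a, p: False):
    """Execute one agent step; critical actions require explicit approval.

    The default confirm callback denies everything, so critical actions
    are blocked unless a human approval path is wired in.
    """
    audit_log.append({"action": action, "payload": payload})
    if action in CRITICAL_ACTIONS and not confirm(action, payload):
        audit_log[-1]["status"] = "blocked"
        return None
    audit_log[-1]["status"] = "executed"
    return f"{action} done"
```

Because every step lands in `audit_log` before execution, the forensic question "where did the problem start?" has an answer.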


3. Data Exfiltration via Tools

The agent has access to internal data AND tools with external reach (email, API calls, web requests).

Example: Agent processes internal documents and has Slack access. An indirect prompt injection in a document instructs the agent to send key information to a specific channel or external webhook.

Impact: Leak of trade secrets, PII, internal strategies.

Mitigation: Separate agents into “readers” (internal data, no external reach) and “actors” (external reach, limited data access). Never both in one.
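
The reader/actor split can be made an enforced invariant rather than a convention: validate at deploy time that no agent profile combines internal data access with external reach. A minimal sketch (the `AgentProfile` structure and names are illustrative):

```python
# Sketch: reader agents see internal data but have no outbound tools;
# actor agents can reach out but never hold internal data access.
# All names are illustrative.

from dataclasses import dataclass

@dataclass(frozen=True)
class AgentProfile:
    name: str
    can_read_internal: bool
    can_reach_external: bool

def validate(profile: AgentProfile) -> AgentProfile:
    """The invariant: no single agent holds both capabilities."""
    if profile.can_read_internal and profile.can_reach_external:
        raise ValueError(f"{profile.name} combines data access with external reach")
    return profile

READER = validate(AgentProfile("doc_reader", can_read_internal=True,  can_reach_external=False))
ACTOR  = validate(AgentProfile("notifier",   can_read_internal=False, can_reach_external=True))
```

Running the check at deployment means a misconfigured agent fails to start, instead of failing open in production.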


4. Privilege Escalation

The agent starts with limited permissions but gains more through manipulation.

Example: Agent has access to a tool API that allows “manage users.” An attacker via injection instructs the agent to add admin permissions or create a new account.

Impact: Complete system compromise.

Mitigation: Agent permissions must be hardcoded at the infrastructure level, not at the prompt level. The agent must not be able to modify its own permissions.
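
"Hardcoded at the infrastructure level" can mean: permissions are frozen when the agent is constructed, and there is no code path by which model output widens them. A minimal sketch (class and method names are illustrative):

```python
# Sketch: permissions are frozen at construction time; nothing the model
# outputs can add to them. Names are illustrative.

class Agent:
    def __init__(self, permissions: frozenset):
        # frozenset: immutable by construction, set once at deploy time
        self._permissions = frozenset(permissions)

    def request_action(self, action: str) -> str:
        if action not in self._permissions:
            raise PermissionError(f"denied: {action}")
        return f"ok: {action}"

    def grant(self, action: str):
        # Deliberately refuses: an agent must never widen its own permissions
        raise PermissionError("agents cannot modify their own permissions")
```

Even if an injection convinces the model to "add admin permissions," there is simply no operation that does so.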


5. Confused Deputy Problem

The agent performs actions on behalf of a user but is manipulated by a third party. The system sees a legitimate user — it doesn’t see that an attacker is behind the action.

Example: A manager asks the AI agent to summarize a report. The report contains an indirect prompt injection. The agent sends an email on the manager’s behalf based on the injection. From the email system’s perspective, the manager sent the email.

Impact: Unauthorized actions with legitimate credentials. Difficult to prove the user didn’t initiate the action.

Mitigation: Actions with external impact require explicit user confirmation. Log not just “who” but also “why” — the entire decision chain.
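
Both parts of the mitigation can live in one gate: external actions fail without an explicit confirmation flag, and the record stores the provenance chain, not just the acting user. A minimal sketch (field names and the `execute_external` helper are illustrative assumptions):

```python
# Sketch: external actions carry their full decision chain and require an
# explicit user confirmation, so a later audit can distinguish "the manager
# clicked send" from "the agent sent on the manager's behalf".
# Field names are illustrative.

import json

def execute_external(user: str, action: str, provenance: list,
                     user_confirmed: bool) -> str:
    if not user_confirmed:
        raise PermissionError("external action requires explicit user confirmation")
    # Log "why", not just "who": the chain of inputs that led here
    return json.dumps({"user": user, "action": action, "provenance": provenance})
```

The provenance list is what makes the confused deputy visible: if it shows the action originated from an untrusted document rather than the user's own request, the audit trail says so.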


6. Agentic Loops

The agent gets stuck in a loop — repeatedly performing an action, escalating an error, or generating nonsensical outputs.

Example: Agent gets task “resolve this ticket.” Writes a response to the customer. Customer doesn’t reply. Agent sends a follow-up. And another. And another. Or: agent hits an error, tries to fix it, the fix causes another error, and so on.

Impact: Spam, service degradation, unexpected costs (API calls, tokens).

Mitigation: Rate limits, maximum steps per task, circuit breaker, kill-switch.
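
A step budget and a consecutive-failure circuit breaker can be combined in one loop guard. A minimal sketch (the `agent_step` callable and the thresholds are illustrative):

```python
# Sketch: a step budget plus a simple circuit breaker around the agent loop.
# The agent_step callable and the default thresholds are illustrative.

def run_agent(agent_step, max_steps=10, max_failures=3):
    """Stop after max_steps total steps or max_failures consecutive errors."""
    failures = 0
    for step in range(max_steps):
        try:
            result = agent_step(step)
            failures = 0                # any success resets the breaker
            if result == "done":
                return "completed"
        except Exception:
            failures += 1
            if failures >= max_failures:
                return "circuit_open"   # stop retrying, escalate to a human
    return "budget_exhausted"           # kill-switch: never loop forever
```

Whatever the agent does inside a step, the loop itself guarantees a bounded number of actions and a bounded number of retries.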


AI Agent Risk Matrix

Factor      | Low Risk                       | High Risk
Permissions | Read-only                      | Read + write + delete + send
Reach       | Internal system only           | Internal + external (email, API, web)
Autonomy    | Every step approved by a human | Fully autonomous chaining
Data        | Public data                    | PII, trade secrets, credentials
Inputs      | Trusted sources only           | Emails, web, third-party documents
Monitoring  | Complete logging + alerts      | None or minimal logging
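
The matrix can be turned into a rough triage score by counting how many high-risk columns an agent matches. A minimal sketch (the flag names and thresholds are illustrative, not a calibrated model):

```python
# Sketch: count high-risk factors from the matrix. Flag names and
# thresholds are illustrative, not a calibrated risk model.

HIGH_RISK_FLAGS = {"write_permissions", "external_reach", "full_autonomy",
                   "sensitive_data", "untrusted_inputs", "no_monitoring"}

def risk_score(flags: set) -> str:
    """Classify an agent by how many high-risk matrix columns it matches."""
    hits = len(flags & HIGH_RISK_FLAGS)
    if hits >= 4:
        return "high"
    if hits >= 2:
        return "medium"
    return "low"
```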

Key Takeaway

AI agent security is not solved at the prompt level. It’s solved at the architecture level:

  1. What tools the agent has (and which it doesn’t)
  2. What data it sees (and what it doesn’t)
  3. When it may act alone (and when it must wait for a human)
  4. What happens when it fails (and how we find out)

Need help securing AI in your organization?