How Do I Enforce Least Privilege for AI Agents Using External Tools?

Learn how to enforce least privilege for AI agents using external tools with runtime authorization, scoped credentials, MCP gateways, Kontext CLI, and audit trails.

Published 2026-05-11.

Updated 2026-05-12.

To enforce least privilege for AI agents using external tools, do not give the agent a broad API key, standing OAuth token, or unrestricted MCP server. Put a runtime authorization gate between the agent and every external tool, then issue the narrowest short-lived credential only after policy approves the current user, task, tool, resource, action, and parameters. Kontext is built for this control point: it provides runtime authorization and credential brokering so agent access is scoped at the moment of tool use.

This is the practical answer to the question "How do I enforce least privilege for AI agents using external tools?" You enforce it at the tool boundary, not only at login or integration setup. For the broader model, see AI agent runtime authorization, tool invocation privilege boundaries, and securing LLM tool use with runtime policies.

Short answer

Least privilege for AI agents means the agent can use only the tools, APIs, data, actions, and credential scopes needed for the current task. It should not inherit the full authority of a human user, service account, OAuth app, MCP server, or integration.

For tool-calling agents, least privilege requires five controls:

  1. Tool minimization: expose only the external tools the agent actually needs.
  2. Action minimization: split read, write, delete, export, send, and approve actions into separate permissions.
  3. Runtime authorization: evaluate each tool call before execution.
  4. Short-lived scoped credentials: issue credentials for the approved operation, then expire them quickly.
  5. Audit evidence: record the user, agent, tool, action, resource, policy, credential scope, and decision.

The important shift is timing. A setup-time permission grant is not enough because the risky decision happens later, when the agent chooses an external tool and supplies parameters.
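The five controls can be sketched as a single runtime check. This is an illustrative stand-in, not a real SDK: the ToolCall shape, the allowlist, and the decision strings are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    tool: str        # e.g. "gmail"
    action: str      # e.g. "read", "send", "delete"
    resource: str    # e.g. a thread id or repo name
    params: dict     # arguments the agent supplied

# Tool and action minimization expressed as data: only these
# tool/action pairs are ever available to the agent.
ALLOWED_ACTIONS = {"gmail": {"read"}, "github": {"read", "create_pr"}}

def authorize(call: ToolCall) -> str:
    """Return 'allow', 'deny', or 'approval_required' for one tool call."""
    actions = ALLOWED_ACTIONS.get(call.tool)
    if actions is None:
        return "deny"                 # unknown tool: deny by default
    if call.action not in actions:
        return "approval_required"    # out-of-scope action needs a human
    return "allow"
```

Because the check runs per tool call, the decision happens at the moment the agent supplies a concrete tool, action, and parameters, not at setup time.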

Why external tools make least privilege harder

AI agents become risky when they move from generating text to operating digital platforms. A support agent connected to Salesforce can read records. A coding agent connected to GitHub can create pull requests. A finance agent connected to Stripe can refund payments. A workplace agent connected to Gmail, Slack, and Google Drive can move sensitive information across systems.

Those external tools are not just context sources. They are capability surfaces. They let an agent read, write, delete, send, invite, transfer, merge, deploy, or approve.

Traditional IAM normally assumes a human or deterministic service is behind the request. Agentic systems break that assumption. The agent selects tools dynamically, chains actions across services, reads untrusted content, and may act for minutes without a human approving every step. If the agent already holds a broad token, the external platform sees a valid credential even when the tool call is unsafe.

That is why a valid credential is not enough. Least privilege has to evaluate the action the agent is about to take.

Map the problem to OWASP LLM06: Excessive Agency

OWASP frames this risk as LLM06:2025 Excessive Agency. OWASP breaks the root causes into excessive functionality, excessive permissions, and excessive autonomy.

That maps directly to least privilege for external tools:

OWASP cause | Agent example | Least-privilege control
Excessive functionality | A mailbox tool can read, send, delete, and forward mail even though the task only needs summarization. | Expose a read-only mail summary tool, not a general mailbox API.
Excessive permissions | A CRM tool uses a service account that can read every customer and update any opportunity. | Execute in the delegated user's context with scoped credentials.
Excessive autonomy | An agent can send invoices, merge code, or transfer funds without independent approval. | Require runtime approval for high-impact actions.

OWASP also recommends complete mediation: downstream requests should be validated against policy instead of trusting the LLM to decide whether an action is safe. For AI agents using external tools, that means every sensitive tool call needs an authorization decision before execution.

The enforcement pattern: put a policy gate before every tool call

The most reliable architecture is a gateway or SDK layer between the agent runtime and the tools it can invoke. The agent proposes an action. The gateway evaluates policy. Only approved actions receive the credential or tool execution path needed to proceed.

The flow looks like this:

  1. The user starts a task and authorizes the agent to act within a defined scope.
  2. The agent plans a tool call against an external platform.
  3. The runtime gate sends the proposed action to a policy engine.
  4. Policy evaluates agent identity, user identity, tool, action, resource, parameters, task intent, risk, and session history.
  5. The gate allows, denies, narrows, or escalates the request.
  6. If allowed, the credential broker issues a short-lived scoped credential for that operation.
  7. The external tool executes with the scoped credential.
  8. The decision and result metadata are written to an audit trail.
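The numbered flow above can be sketched end to end. The policy engine, credential broker, and tool executor here are stubs invented for the example; a real deployment would call out to its own policy and brokering services.

```python
import time
import uuid

AUDIT_LOG = []

def evaluate_policy(user, tool, action, resource, params):
    # Stub policy engine: allow low-risk reads, escalate everything else.
    return "allow" if action == "read" else "approval_required"

def issue_scoped_credential(tool, action, resource, ttl_seconds=300):
    # Stub broker: a short-lived token bound to one approved operation.
    return {"token": uuid.uuid4().hex,
            "scope": f"{tool}:{action}:{resource}",
            "expires_at": time.time() + ttl_seconds}

def gated_tool_call(user, tool, action, resource, params, execute):
    decision = evaluate_policy(user, tool, action, resource, params)
    result = None
    if decision == "allow":
        cred = issue_scoped_credential(tool, action, resource)
        result = execute(cred, params)                 # step 7: scoped execution
    AUDIT_LOG.append({"user": user, "tool": tool, "action": action,
                      "resource": resource, "decision": decision})  # step 8
    return decision, result

decision, result = gated_tool_call(
    "alice", "github", "read", "org/repo", {"path": "README.md"},
    execute=lambda cred, p: f"read with {cred['scope']}")
```

Note that the credential is minted only after the policy decision, so a denied call never touches a token at all.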

This is the pattern Kontext implements for agent access control. Kontext sits at the tool-use boundary and turns "the agent has a token" into "this agent may perform this specific action now."

What to check before allowing an external tool call

A least-privilege decision for AI agents should include more than a role or OAuth scope. The policy engine needs enough context to decide whether the requested action fits the current task.

Decision input | Why it matters
Agent identity | Identifies the agent, model, app, version, MCP client, or workload requesting access.
Delegated user | Binds the action to the user, tenant, organization, and connected account.
External tool | Names the platform or integration, such as GitHub, Gmail, Salesforce, Slack, Stripe, or Snowflake.
Action | Separates read, write, delete, export, send, invite, approve, transfer, and merge.
Resource | Limits the data, file, repository, customer, ticket, account, table, or channel in scope.
Parameters | Catches risky details such as recipient domains, row limits, amount thresholds, file paths, and destination URLs.
Task intent | Connects the tool call to what the user asked the agent to do.
Session state | Detects action chains, repeated access, failed attempts, prior approvals, and data already accessed.
Credential scope | Ensures the token issued is no broader than the approved action.

The policy output should be explicit: allow, deny, narrow, approval required, or step-up required. A good event also records the policy version and reason so security teams can review what happened.
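One way to picture these inputs and outputs is as a typed request object and an explicit decision enum. The field names follow the table above; the shapes themselves are illustrative, not any particular product's schema.

```python
from dataclasses import dataclass, field
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NARROW = "narrow"
    APPROVAL_REQUIRED = "approval_required"
    STEP_UP_REQUIRED = "step_up_required"

@dataclass
class AuthorizationRequest:
    agent_identity: str          # model / app / MCP client id
    delegated_user: str          # user and tenant the action is for
    tool: str                    # e.g. "salesforce"
    action: str                  # e.g. "export"
    resource: str                # e.g. "accounts/emea"
    parameters: dict             # e.g. {"row_limit": 10000}
    task_intent: str             # what the user asked the agent to do
    session_state: dict = field(default_factory=dict)

def decide(req: AuthorizationRequest) -> tuple:
    """Return a decision plus a reason string for the audit trail."""
    if req.parameters.get("row_limit", 0) > 1000:
        return Decision.APPROVAL_REQUIRED, "bulk export exceeds row limit"
    return Decision.ALLOW, "within task scope"

decision, reason = decide(AuthorizationRequest(
    agent_identity="support-agent@1.2", delegated_user="alice@acme",
    tool="salesforce", action="export", resource="accounts/emea",
    parameters={"row_limit": 10000}, task_intent="export EMEA account list"))
```

Returning the reason alongside the decision is what makes the later audit event reviewable.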

What this looks like with Kontext CLI

For coding agents, the documented starting point is Kontext CLI, the open-source CLI for local guardrails and scoped credentials for AI coding agents. It supports Claude Code today.

Install it with Homebrew using brew install kontext-security/tap/kontext, then start local Guard mode with kontext guard start before launching claude.

Guard mode is local-only by default. It captures Claude Code tool calls, redacts events, scores risk, stores local traces in SQLite, and opens a dashboard at http://127.0.0.1:4765. This helps security teams see which shell commands, file changes, and tool calls an agent attempted before moving to hosted credential governance.

To add short-lived credentials and team-visible traces, use hosted mode with kontext start --agent claude. Hosted mode creates a managed .env.kontext file with placeholders such as GITHUB_TOKEN={{kontext:github}} and LINEAR_API_KEY={{kontext:linear}} instead of provider secrets.

At runtime, hosted mode exchanges placeholders such as {{kontext:github}} for short-lived scoped credentials. The agent does not need a long-lived GitHub or Linear key in its project, prompt, shell history, or MCP configuration.
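Conceptually, the placeholder exchange works like the sketch below. The {{kontext:provider}} syntax comes from the documentation above, but resolve_short_lived_token and the resolution mechanics here are hypothetical stand-ins, not Kontext's actual implementation.

```python
import re

PLACEHOLDER = re.compile(r"\{\{kontext:(\w+)\}\}")

def resolve_short_lived_token(provider: str) -> str:
    # Hypothetical stand-in for a broker call that returns a scoped,
    # expiring credential for the named provider.
    return f"short-lived-{provider}-token"

def resolve_env(env_text: str) -> dict:
    """Replace placeholder values with brokered tokens; pass others through."""
    resolved = {}
    for line in env_text.strip().splitlines():
        key, _, value = line.partition("=")
        m = PLACEHOLDER.fullmatch(value)
        resolved[key] = resolve_short_lived_token(m.group(1)) if m else value
    return resolved

env = resolve_env(
    "GITHUB_TOKEN={{kontext:github}}\nLINEAR_API_KEY={{kontext:linear}}")
```

The point of the pattern: the file on disk never contains a provider secret, only a reference that is meaningless outside a governed session.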

This is the product-level version of least privilege: keep provider secrets out of the agent runtime, resolve credentials only for the active governed session, and preserve traces that show what the agent attempted.

Policy still has to name concrete tool actions

The CLI installation removes standing secrets and creates visibility, but least privilege still depends on the policy model behind the external tools. A useful policy should identify which tool actions are low risk, which actions require approval, and which actions should never be available to the agent.

For a GitHub coding agent, that usually means:

  • allow reading repository files needed for the current task;
  • allow creating a pull request on an agent-owned branch;
  • require approval before merging, deleting branches, changing repository settings, or touching deployment files; and
  • deny direct writes to protected branches.
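The GitHub rules above can be expressed as policy data plus a small evaluator. The rule structure is illustrative; real policy engines such as OPA or Cedar have their own languages for this.

```python
GITHUB_POLICY = [
    {"action": "read_file",     "decision": "allow"},
    {"action": "create_pr",     "decision": "allow",
     "condition": lambda p: p.get("branch", "").startswith("agent/")},
    {"action": "merge_pr",      "decision": "approval_required"},
    {"action": "delete_branch", "decision": "approval_required"},
    {"action": "push",          "decision": "deny",
     "condition": lambda p: p.get("branch") in {"main", "release"}},
]

def evaluate(action: str, params: dict) -> str:
    for rule in GITHUB_POLICY:
        if rule["action"] != action:
            continue
        cond = rule.get("condition")
        if cond is None or cond(params):
            return rule["decision"]
    return "deny"  # deny by default for unmatched actions
```

Keeping the rules as data, outside the prompt, means the model can propose any action it likes but cannot rewrite what the evaluator will accept.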

For a workplace agent connected to Gmail, Slack, or Google Drive, it usually means:

  • allow reading user-selected items relevant to the task;
  • limit searches, exports, and bulk reads;
  • require approval before sending externally or sharing files outside the organization; and
  • deny uploads to unknown domains or unapproved webhooks.

These rules should live outside the prompt and outside the model's editable context. The agent can propose an action, but a separate enforcement layer should decide whether the action is inside scope.

Why OAuth scopes are necessary but not sufficient

OAuth helps with delegated user access, consent, token issuance, token validation, and expiry. It is a necessary foundation for external tools. But OAuth scopes usually describe a category of access, not the safety of a specific action.

For example, a gmail.readonly scope may be appropriate for summarizing a user-selected email thread. It may still be too broad if an agent starts searching every mailbox message after reading a malicious instruction in an email. A repo or pull_request:write scope may be appropriate for opening a pull request. It is not enough to decide whether the agent should modify a protected branch or touch a production deployment file.
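The pull-request example can be made concrete: the OAuth layer checks that the scope exists at all, and a separate runtime layer checks branch and file paths. The scope names echo the paragraph above; the protected-branch and path lists are assumptions for the sketch.

```python
PROTECTED_BRANCHES = {"main", "production"}
SENSITIVE_PATHS = ("deploy/", ".github/workflows/")

def runtime_check(token_scopes: set, branch: str, paths: list) -> str:
    if "pull_request:write" not in token_scopes:
        return "deny"                  # OAuth layer: scope missing entirely
    if branch in PROTECTED_BRANCHES:
        return "approval_required"     # runtime layer: protected branch
    if any(p.startswith(SENSITIVE_PATHS) for p in paths):
        return "approval_required"     # runtime layer: deployment files
    return "allow"                     # scope present AND action in bounds
```

Both layers return the same verdict vocabulary, but only the runtime layer can see the parameters that make one call safe and another risky under the same scope.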

This is why delegated access needs a runtime governance layer around it. OAuth can establish who granted access. Kontext provides the agent-side credential and trace layer for that boundary: hosted sessions resolve short-lived scoped credentials and preserve tool-call evidence for review. For more background, see The API Key is Dead: A Blueprint for Agent Identity in the age of MCP.

FINOS guidance points to the same control layer

The FINOS AI Governance Framework risk catalogue describes agent action authorization bypass as agents performing operations outside intended authorization boundaries. It calls out direct API access, tool chaining, business logic circumvention, and dynamic privilege interpretation.

The related Agent Authority Least Privilege Framework recommends granular API access control, contextual privilege adjustment, time-bounded privileges, separation of duties, business logic enforcement, and comprehensive access logging.

That is exactly the architecture needed for AI agents using external tools. The control has to sit at the tool manager, API gateway, credential broker, SDK, or MCP server boundary. It cannot live only in a policy document or prompt.

The gateway pattern and MCP

MCP makes external tools discoverable and callable by agents. That is useful because it creates a clear tool-call boundary: tool name, arguments, result, and error. But MCP does not automatically make the tool safe.

An MCP server can still expose too many tools. It can hold a powerful API key. It can implement broad operations such as run_shell, query_database, send_email, or update_ticket without policy checks. If the agent can call that server directly, least privilege depends on the tool's internal implementation and the prompt's behavior.

The safer pattern is to route MCP calls through a policy-aware gateway:

  • The MCP client or runtime sends each tool invocation to the gateway.
  • The gateway enriches the request with user, organization, agent, session, and task context.
  • The authorization layer evaluates whether the invocation is within policy.
  • Approved requests receive short-lived credentials or are proxied to the tool.
  • Denied or high-risk requests are blocked, narrowed, or routed to approval.
  • Every decision is logged for audit and incident response.
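The bullets above reduce to a mediating wrapper around each tool invocation. The MCP message shape here is simplified for illustration, and the policy stub is an assumption; a real gateway would speak the full protocol and call a real policy engine.

```python
DECISIONS = []

def policy_allows(context: dict, tool: str, arguments: dict) -> bool:
    # Stub: only tools on the session's allowlist may execute.
    return tool in context.get("allowed_tools", set())

def gateway_call(context: dict, tool: str, arguments: dict, upstream):
    """Mediate one tool invocation: enrich, decide, log, then proxy or block."""
    allowed = policy_allows(context, tool, arguments)
    DECISIONS.append({"user": context["user"], "tool": tool,
                      "allowed": allowed})          # every decision is logged
    if not allowed:
        return {"error": f"tool '{tool}' denied by policy"}
    return upstream(tool, arguments)                # proxy approved requests

ctx = {"user": "alice", "allowed_tools": {"search_tickets"}}
resp = gateway_call(ctx, "run_shell", {"cmd": "rm -rf /"},
                    upstream=lambda t, a: {"ok": True})
```

Because the upstream MCP server is only reachable through gateway_call, a tool the server happens to expose is still unusable unless policy admits it.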

This is similar to policy-as-code gateway designs using OPA, but the agent-specific decision has extra inputs: delegated user context, tool intent, session history, credential scope, and approval state.

What good least-privilege implementation looks like

A strong implementation should satisfy these requirements:

  • No broad standing secrets in the agent runtime. The agent should not hold long-lived API keys for external platforms.
  • Unique agent identity. Every agent, app, model runtime, or MCP client should be distinguishable in logs and policy.
  • Delegated user context. Actions taken for a user should be scoped to that user's authorization and tenant.
  • Action-level permissions. Read, write, delete, export, send, approve, merge, and transfer should be separate decisions.
  • Parameter-aware policy. Policy should inspect row limits, recipient domains, file paths, branch names, amount thresholds, and destination URLs.
  • Short-lived credentials. The credential should expire quickly and be scoped to the approved external tool action.
  • Approval for high-impact actions. Deletions, external sends, payment movement, production deploys, and privilege changes should require human approval.
  • Deny by default. Unknown tools, unknown resources, and unclassified high-risk actions should not execute.
  • Auditable decisions. Logs should show the user, agent, tool, resource, parameters, policy, decision, credential scope, and result.

This is what turns least privilege from a static IAM slogan into an enforceable runtime control.

Common mistakes

Giving the MCP server a powerful API key

If the MCP server stores a broad key and the agent can call the server directly, the agent effectively inherits that key. Least privilege should be enforced inside the MCP server, in front of it, or through a credential broker that only issues scoped credentials after policy approval.

The same pattern showed up in the TanStack npm supply chain attack: the dependency was the entry point, but credentials and ambient workflow authority created the blast radius.

Treating tool allowlists as sufficient

Allowlisting tools is only the first layer. A tool named github or gmail can perform many different actions. Least privilege needs action, resource, and parameter checks.

Relying on prompt instructions

Prompt instructions help guide behavior, but they are not an access-control boundary. The policy gate must be outside the model and outside the agent's editable context.

Approving every tool call manually

Manual approval for every action is usually unusable. Use risk-based approval: low-risk reads can run automatically, while exports, sends, deletes, payment actions, merges, and privilege changes require approval.
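Risk-based approval can be as simple as routing actions by tier. The tier assignments below are illustrative; each team would classify its own action vocabulary.

```python
AUTO_ALLOW = {"read", "list", "search"}
NEEDS_APPROVAL = {"export", "send", "delete", "merge", "refund",
                  "change_permissions", "deploy"}

def route(action: str) -> str:
    """Execute low-risk actions, queue high-impact ones, deny the rest."""
    if action in AUTO_ALLOW:
        return "execute"
    if action in NEEDS_APPROVAL:
        return "queue_for_approval"
    return "deny"  # unclassified actions do not run
```

The deny branch matters as much as the other two: an action nobody classified is an action nobody reasoned about.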

Logging tool calls without logging policy decisions

An audit log that says "the agent called Gmail" is useful but incomplete. Security teams also need to know whether policy evaluated the call, what scope was issued, and why the decision was made.
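A complete audit record for the Gmail example might look like the event below, with the policy decision, reason, and issued scope alongside the call itself. The field names are illustrative, not a fixed schema.

```python
import json
import time

event = {
    "timestamp": time.time(),
    "user": "alice@example.com",
    "agent": "support-agent@1.4.2",
    "tool": "gmail",
    "action": "send",
    "resource": "thread/8231",
    "parameters": {"recipient_domain": "external.example"},
    "policy_version": "2026-05-01",
    "decision": "approval_required",
    "reason": "external recipient requires human approval",
    "credential_scope": None,   # no credential was issued for this call
}
line = json.dumps(event)        # one JSON line per decision
```

With records like this, an investigator can answer all three questions: whether policy evaluated the call, what scope (if any) was issued, and why the decision came out the way it did.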

How Kontext helps enforce least privilege for AI agents

Kontext is the runtime authorization and credential brokering layer for AI agents using external tools. For coding agents, Kontext CLI provides the documented operational path: Guard mode for local tool-call visibility, and hosted mode for scoped credentials, governed sessions, and team-visible traces.

In practice, Kontext helps teams enforce least privilege by:

  • replacing raw provider keys in project files with .env.kontext placeholders such as {{kontext:github}};
  • exchanging those placeholders for short-lived provider-scoped credentials during hosted sessions;
  • capturing PreToolUse, PostToolUse, and UserPromptSubmit events for governed sessions;
  • showing redacted tool-call traces, outcomes, user attribution, and session context in the dashboard;
  • keeping long-lived provider credentials out of the project and agent configuration; and
  • giving security teams evidence about what the agent attempted and which credentials were used.

If your agent touches GitHub, Linear, shell commands, local files, or other external tools from a coding environment, Kontext gives you a concrete starting point for reducing standing privilege: install the CLI, run Guard mode to observe tool use, then move credential-bearing workflows into hosted mode so short-lived scoped credentials replace hardcoded keys.

FAQ

How do I enforce least privilege for AI agents using external tools?

Route every external tool call through a runtime authorization gate, evaluate the current user, agent, tool, action, resource, parameters, task intent, and risk, then issue a short-lived scoped credential only if policy approves. Kontext provides a practical path for coding agents through Guard mode, hosted governed sessions, .env.kontext placeholders, and short-lived scoped credentials.

Is OAuth enough to enforce least privilege for AI agents?

No. OAuth is important for delegated access and token issuance, but OAuth scopes are usually too coarse to decide whether a specific agent action is safe. Agents also need runtime authorization before tool calls, exports, sends, writes, deletes, and credential requests.

Where should least privilege be enforced for MCP tools?

Enforce least privilege at the MCP tool-call boundary, inside the MCP server, in front of the MCP server through a gateway, or through a credential broker that issues scoped credentials after policy approval. The agent should not be able to bypass the enforcement point with a direct API key.

What external tool actions should require approval?

Require approval for high-impact actions such as deleting data, sending messages externally, exporting files, moving money, changing permissions, merging code, deploying production infrastructure, or invoking another agent with broader access.

How is least privilege different for AI agents than for normal apps?

Normal apps usually have predefined workflows and fixed backend calls. AI agents choose tools dynamically, chain actions across platforms, and may be influenced by untrusted content. That makes least privilege a runtime problem, not only a setup-time IAM configuration.
