TL;DR
The class of AI coding tools that runs as a CLI or as an autonomous agent - Gemini CLI, Cursor's agent mode, Claude Code, Aider, Continue, Copilot Workspace, and the rest - is now part of your CI/CD threat model whether you put it there or not. Anything an LLM reads is potentially an instruction. Anything an LLM is allowed to do, an attacker can ask it to do. This post is about what that means on a Monday morning, what the attack surface actually looks like, and the small set of changes worth making to your runners and dev environments this week.
The right reaction is not "ban AI from dev." The right reaction is fewer permissions, narrower scopes, better sandboxing. Same playbook as every other privileged actor on the network.
What actually changed
For most of the past decade, "AI in dev" meant code completion in an IDE. The tool produced text suggestions; a human accepted or rejected them; the human pressed Cmd-S; the human ran the tests. The AI's privileges were equivalent to a clipboard's.
The current generation of tools is different. They:
- Read arbitrary files on the developer's machine (or a CI runner's checkout)
- Execute shell commands directly
- Open editor sessions and write code without per-keystroke approval
- Resolve and install dependencies
- Commit, push, open pull requests, and merge them
In CI specifically, an agent invoked by npm run ai:fix-this-bug or by an Issues / on-create workflow has, by default:
- The repo at HEAD
- Whatever secrets the workflow has (deploy tokens, npm/PyPI publishing creds, cloud OIDC role)
- Network egress to whatever the runner can reach
- A shell
That's not "AI as a clipboard." That's a privileged actor with a job description that includes "follow instructions in arbitrary text inputs." Welcome to the threat model.
The threat model in three lines
- Anything an LLM reads is potentially an instruction. A README, a commit message, an issue title, a webpage it browses to look up an error, a stray comment in a vendored library - all are inputs the model will treat as text and, under the right framing, as guidance for what to do next.
- Untrusted input is everywhere. Source-control history isn't trusted. Package registries aren't trusted. Issue comments from strangers aren't trusted. The contents of a third-party API response your code logs aren't trusted. Once the agent reads any of that, it is now operating on attacker-controllable input.
- The agent has your permissions, not its own. When npm publish runs from inside an agent task, the registry doesn't see the agent. It sees you. The cloud doesn't see "an LLM tried to assume this role." It sees the role assumption. Audit trails attribute every action to the human or service account the agent ran as.
Three lines. Read them like a checklist: if you can't articulate which of (1), (2), or (3) you're protecting against on a given runner, you aren't protecting against any of them.
Where the blast radius actually is
When teams hear "supply chain" they think of npm or PyPI. The blast radius is wider than that.
Inside the source repo, an agent can:
- Modify any file
- Push to any branch the runner has push access to (often: all branches, including main if branch protection is misconfigured)
- Resolve and import new dependencies - including dependencies whose names look like typos of well-known ones
- Add or remove GitHub Actions workflows
- Read every secret exposed to the workflow
At the registry boundary, an agent can:
- Run npm publish / twine upload / cargo publish if creds are present
- Bump versions and ship them
- Mint new tokens via OIDC if your registry trusts your CI's OIDC issuer
At the deploy boundary, an agent can:
- Trigger production deploys via vercel --prod, gh workflow run deploy.yml, aws ecs update-service, kubectl apply - anything the runner already has credentials for
- Modify infrastructure-as-code in ways that aren't obviously malicious in a code review (a small terraform change to an IAM role can be a critical privilege escalation)
At the credential boundary, an agent can:
- Read environment variables (where most secrets live in CI)
- Read mounted secret files
- In the absence of core.sshCommand constraints or hardware token enforcement, sign git commits as you
The right mental model: an agent in CI has the same blast radius as a malicious commit by a developer with write access to that repo, except the malicious commit can be triggered by a stranger writing a sentence in an issue comment.
What to do Monday morning
A short list, in priority order. Each one is small, individually shippable, and reduces blast radius.
1. Audit which workflows invoke an agent
Grep for the obvious binaries. Examples:
```sh
git grep -E '(claude|cursor|gemini|aider|continue|copilot)' \
  -- '.github/workflows/*' '*.yml' '*.yaml' 'package.json' 'Makefile' \
  'justfile' 'taskfile.yml' 'docker*' 2>/dev/null
```

Anything that runs in CI and invokes an agent CLI is a workflow you need to look at. Do not let "we only run it on trusted PRs" be the answer - pull_request_target is a well-known footgun for exactly this reason.
2. Cut what the agent's runner can reach
For every workflow that runs an agent, ask:
- Does it need write access to the repo? Most "AI fix" or "AI explain" use cases need read access only. Use permissions: contents: read (or a scoped pull-requests: write for PR comment-only flows); permissions: write-all is the wrong default. A minimal sketch follows this list.
- Does it need any secrets? Most read-only flows need none. If yours does, list them; remove the rest.
- Does it need network egress beyond api.github.com and the model provider? If yes, list the destinations. If you can't list them, you don't know what you're trusting.
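As a rough sketch of what the scoped-down version can look like - the agent CLI and secret names are placeholders, and egress control still has to happen at the runner or proxy level, which workflow YAML alone can't express:

```yaml
name: ai-review
on: pull_request

permissions:
  contents: read           # read the checkout, nothing more
  pull-requests: write     # only for flows that post PR comments; drop otherwise

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx my-agent-cli review --diff origin/main...HEAD   # placeholder CLI
        env:
          MODEL_API_KEY: ${{ secrets.MODEL_API_KEY }}   # the only secret the job sees
```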
3. Pin and sandbox the agent itself
Pin to a specific version. Don't npm install -g an agent at runtime; install at image-build time so the version is auditable. Run inside a container with a read-only root filesystem and a non-root user. If your agent doesn't tolerate that, file a bug with its vendor.
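One way to get both properties on GitHub Actions is to bake the agent into your own image at build time and run the job in that container with Docker hardening flags. A sketch, with the image, digest, and CLI invocation as placeholders:

```yaml
jobs:
  ai-fix:
    runs-on: ubuntu-latest
    container:
      # Built in your own pipeline with the agent installed and pinned;
      # referenced by digest so the version is auditable.
      image: ghcr.io/your-org/agent-runner@sha256:<digest>
      options: --read-only --user 1000:1000 --tmpfs /tmp
    steps:
      - uses: actions/checkout@v4
      - run: agent-cli fix --issue "${{ github.event.issue.number }}"   # placeholder CLI
```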
4. Don't let the agent decide its own scope
If an agent task says "ah, I should bump the version and publish", that's a problem regardless of whether the bump is correct. Either the human decides what gets published, or there's a deterministic non-AI gate (a release CI job, a tag-triggered workflow) between the agent's output and the registry.
The general principle: AI proposes, humans dispose. A diff and a PR are fine things for an agent to produce. A git push origin main is not.
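A sketch of what that deterministic gate can look like: publishing happens only from a tag-triggered workflow behind a protected environment that requires a manual reviewer. The environment and secret names are placeholders, and --provenance assumes a registry that supports it:

```yaml
name: release
on:
  push:
    tags: ['v*']

jobs:
  publish:
    runs-on: ubuntu-latest
    environment: npm-publish     # configured in repo settings to require a reviewer
    permissions:
      contents: read
      id-token: write            # only if you use provenance / trusted publishing
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          registry-url: https://registry.npmjs.org
      - run: npm ci && npm publish --provenance
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
```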
5. Treat the agent like an untrusted contractor
You wouldn't give a freelancer's laptop your prod KMS keys for the duration of a one-day engagement. The same logic applies. Short-lived OIDC tokens scoped to one repo, one branch, one operation are the default to aim for. Long-lived PATs in repo secrets are the thing to retire.
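On GitHub Actions with AWS, for example, that means exchanging the job's OIDC token for a short-lived credential instead of reading a long-lived key out of repo secrets. A job fragment, with the role ARN as a placeholder; the real scoping lives in that role's trust and permission policies:

```yaml
jobs:
  agent-task:
    runs-on: ubuntu-latest
    permissions:
      id-token: write          # lets the job request an OIDC token
      contents: read
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/agent-task-role   # placeholder
          aws-region: us-east-1
          role-duration-seconds: 900    # short-lived by construction
      # ...agent steps run with a 15-minute credential scoped by the role's policy
```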
6. Keep an audit trail you'd actually read
Every agent action should be attributable. The minimum:
- Log the input prompt (with secrets redacted)
- Log the model and model version
- Log the diff or commands the agent emitted
- Log who triggered the run
Storing these for 90 days is enough to catch most incidents. Storing nothing is the failure mode that turns a small issue into a forensic disaster.
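A low-tech way to get that minimum on GitHub Actions is to write the facts to a file and keep it as a build artifact. In this fragment, MODEL_NAME and PROMPT_FILE are placeholders for whatever your agent wrapper actually exposes, and the sed-based redaction is deliberately crude - treat it as a starting point:

```yaml
# Fragment: two steps appended to the end of the agent's job.
- name: Record agent audit trail
  if: always()
  run: |
    {
      echo "trigger_actor=${{ github.actor }}"
      echo "run_id=${{ github.run_id }}"
      echo "model=${MODEL_NAME:-unknown}"
      echo "--- prompt (redacted) ---"
      sed -E 's/(token|key|secret)[^[:space:]]*/[REDACTED]/gI' "${PROMPT_FILE:-/dev/null}"
      echo "--- diff ---"
      git diff
    } > agent-audit.log
- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: agent-audit-${{ github.run_id }}
    path: agent-audit.log
    retention-days: 90
```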
The wrong reaction
A predictable response, especially from leadership-shaped people, is to ban AI tools in dev. Don't. It does three things, all bad:
- It stops a productivity gain that's already real for many teams.
- It pushes the use underground (developers will run agents on their personal machines and paste outputs into the work repo, which is worse from a supply-chain standpoint).
- It collapses the conversation about which permissions the agent should have into a binary "yes/no", when the actually-useful conversation is "which scopes."
The threat is not "AI exists in our toolchain." The threat is "we gave the agent too many permissions and didn't notice."
What we'd do differently next time
If you're standing this up from a clean slate today, the architecture worth aiming for is:
- One container image per agent, pinned, scanned, rebuilt nightly.
- One IAM role per agent task, scoped to the minimum it actually needs (read repo, write to branches matching agent/*, no production access) - a sketch of the trust-policy side follows this list.
- One audit-log destination that retains 90 days minimum and is read by a human at least weekly.
- Branch protection that requires a human review on every agent-authored PR, with required status checks that the agent cannot itself trigger.
- A monthly agent-permission review, run the same way you do a monthly access review for human contractors.
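For the "one IAM role per agent task" point, the branch scoping can be enforced in the role's trust policy rather than in the workflow. A sketch in CloudFormation form, assuming AWS and the GitHub OIDC provider; the account, org, and repo names are placeholders:

```yaml
AgentTaskRole:
  Type: AWS::IAM::Role
  Properties:
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Action: sts:AssumeRoleWithWebIdentity
          Principal:
            Federated: arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com
          Condition:
            StringEquals:
              "token.actions.githubusercontent.com:aud": sts.amazonaws.com
            StringLike:
              # Only workflows on agent/* branches of this one repo can assume the role.
              "token.actions.githubusercontent.com:sub": "repo:your-org/your-repo:ref:refs/heads/agent/*"
```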
None of this requires a vendor's "AI security" product. It's the same hygiene you should already have for any privileged automation - applied with the new actor in mind.
Coda
The supply chain doesn't care whether the actor making the changes is a human or an LLM. The repo, the registry, the production deploy - all see the same write. Treat your agent like the privileged actor it is, give it the smallest set of permissions it can do its job with, and audit what it does.
The tools aren't going away. The question is whether your runners are ready.