Headless and CI/CD
Run Claude Code in non-interactive workflows with clear inputs, outputs, and guardrails.
Key takeaways
- claude -p (--print) runs Claude Code non-interactively as the Agent SDK via the CLI; it works best for narrow tasks with machine-readable or easily reviewable output.
- Start scripted runs with --bare so they skip auto-discovery of hooks, skills, MCP, and CLAUDE.md and behave the same everywhere. This is Anthropic's recommended (and future-default) CI mode.
- Shape output with --output-format (text, json, stream-json) and force a schema with --json-schema; pipe input via stdin (capped at 10MB as of v2.1.128).
- Recent builds (v2.1.181 through v2.1.190) improved structured-output determinism and non-interactive fallback behavior, but CI should still validate the schema and preserve raw output on failure.
- Guard CI with least-privilege flags: --allowedTools, --permission-mode dontAsk, --max-turns, and --max-budget-usd; avoid bypassPermissions unless the environment is fully trusted.
- Background sessions (--bg) and agent view manage long-running jobs, but slash commands like /run and /verify are interactive-only and cannot be invoked in -p mode.
Headless Claude Code works best when the task is narrow and the expected output is machine-readable or easy to review.
In the official documentation this mode is now called "Run Claude Code programmatically." Adding the
-p (or --print) flag to any claude command runs it non-interactively, and Anthropic frames
this as using the Agent SDK via the CLI — the same agent loop, tools, and context management that
power interactive Claude Code. For full programmatic control with native message objects and tool
approval callbacks, use the Python or TypeScript Agent SDK packages instead.
Agent SDK credit (effective June 15, 2026)
Starting June 15, 2026, Agent SDK and claude -p usage on subscription plans draws from a separate
monthly Agent SDK credit, distinct from interactive usage limits. Plan CI budgets accordingly. API
key, Bedrock, Vertex, and Foundry usage is billed through those providers as usual.
Good Headless Tasks
- Generate a changelog summary from a known diff.
- Run a focused code review and return findings.
- Update a repetitive documentation section.
- Classify test failures with logs as input.
- Produce a patch in a temporary branch for human review.
Avoid headless execution for broad architectural work unless a human will review each step.
Invocation Contract
Define input, output, and permissions:
claude -p "Review this diff for locale routing regressions. Return JSON findings only."
For scripts, prefer stable flags and structured output. Record the model, version, and command in CI logs so failures can be reproduced.
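One way to act on the logging advice is to emit a single reproduction record per run. The following is a minimal sketch, not an official format: the helper name reproduction_record and the record's field names are hypothetical, and the model and version strings are placeholders for whatever your pipeline already knows.

```python
import json
import shlex


def reproduction_record(argv, model, cli_version):
    """Return one JSON log line describing how a headless run was invoked.

    The field names are hypothetical; use whatever your log pipeline expects.
    """
    return json.dumps(
        {
            # shlex.join gives a copy-pasteable shell form of the argv list.
            "command": shlex.join(argv),
            "model": model,
            "cli_version": cli_version,
        },
        sort_keys=True,
    )


# Placeholder values for illustration only.
argv = ["claude", "-p", "Review this diff for locale routing regressions. Return JSON findings only."]
print(reproduction_record(argv, model="example-model", cli_version="0.0.0"))
```

Logging the quoted command rather than the raw prompt keeps the record directly re-runnable from a shell.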
Start scripts in bare mode
By default claude -p loads the same context an interactive session would — hooks, skills, plugins,
MCP servers, auto memory, and CLAUDE.md from the working directory and ~/.claude. That makes
scripted results depend on whatever happens to be configured on the machine. Add --bare so the run
starts faster and produces the same result everywhere by skipping all of that auto-discovery:
claude --bare -p "Summarize this file" --allowedTools "Read"
In bare mode Claude only has the Bash, file read, and file edit tools, and only flags you pass
explicitly take effect. Load context deliberately with --append-system-prompt /
--append-system-prompt-file, --settings, --mcp-config, --agents, or --plugin-dir /
--plugin-url. Bare mode also skips OAuth and keychain reads, so authentication must come from
ANTHROPIC_API_KEY or an apiKeyHelper in the JSON passed to --settings (Bedrock, Vertex, and
Foundry use their usual provider credentials).
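Because bare mode skips OAuth and keychain reads, a CI wrapper may want to confirm that an explicit credential source exists before launching. The sketch below assumes the settings argument is the already-parsed JSON you intend to pass to --settings; the function name is hypothetical, and provider credentials for Bedrock, Vertex, or Foundry are not checked.

```python
import os


def bare_mode_auth_source(env, settings=None):
    """Return the explicit credential source a --bare run could use, or None.

    Only the two sources named in the text are checked. `settings` is assumed
    to be the parsed JSON object that will be passed to --settings.
    """
    if env.get("ANTHROPIC_API_KEY"):
        return "ANTHROPIC_API_KEY"
    if settings and settings.get("apiKeyHelper"):
        return "apiKeyHelper"
    return None


# Report what the current environment offers; a CI job could exit non-zero on None.
print(bare_mode_auth_source(os.environ))
```

Returning None instead of raising leaves the fail-closed decision to the caller, which is usually where the CI exit code is set.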
Bare mode is the recommended CI default
Anthropic recommends --bare for scripted and SDK calls, and it will become the default for -p in
a future release. Adopting it now keeps CI behavior stable across that change.
Use --safe-mode for troubleshooting broken customization rather than for reproducible CI. Unlike
--bare, safe mode keeps authentication, model selection, built-in tools, and permissions working,
but disables CLAUDE.md, skills, plugins, hooks, MCP servers, custom commands and agents, output
styles, workflows, custom themes, keybindings, status line, file-suggestion commands, LSP servers,
and auto-memory. It is useful when Fable 5 fallback, hook behavior, or plugin loading differs from a
clean session.
Correlate runs with a session id
Use --session-id (which must be a valid UUID) when CI needs to correlate Claude Code output with a
job, PR, or retry id:
claude -p "Summarize release risk as JSON." \
--session-id "00000000-0000-4000-8000-000000000123" \
--output-format json
For multi-step jobs, continue or resume instead of starting fresh: --continue (-c) loads the most
recent conversation in the current directory, and --resume (-r) resumes a specific session by ID
or name. Capture the id from a prior JSON result to resume it later:
session_id=$(claude -p "Start a review" --output-format json | jq -r '.session_id')
claude -p "Continue that review" --resume "$session_id"
Pipe data in, structure data out
Non-interactive mode reads stdin, so you can pipe input in and redirect output like any CLI tool. Piping a diff avoids needing Bash permission to read it:
git diff main | claude -p "you are a typo linter. report filename:line then the issue. return nothing else."
Piped stdin cap
As of Claude Code v2.1.128, piped stdin is capped at 10MB. Exceeding the cap exits with a clear error and a non-zero status. For larger inputs, write to a file and reference the file path in the prompt.
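A wrapper can choose between piping and a file reference before it launches the run. This is a sketch under stated assumptions: the exact byte threshold behind "10MB" is not given above, so the code uses the conservative reading of 10,000,000 bytes, and plan_input is a hypothetical helper name.

```python
import tempfile

# Assumption: "10MB" is read as 10,000,000 bytes, the more conservative
# interpretation; the precise threshold is not stated in the text.
STDIN_CAP_BYTES = 10_000_000


def plan_input(payload: bytes, prompt: str):
    """Decide whether to pipe payload on stdin or reference it by file path.

    Returns (prompt, stdin_bytes_or_None, file_path_or_None).
    """
    if len(payload) <= STDIN_CAP_BYTES:
        return prompt, payload, None
    # Too large to pipe: write to a temp file and name the path in the prompt.
    with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as handle:
        handle.write(payload)
        path = handle.name
    return f"{prompt}\n\nThe input is in the file: {path}", None, path


prompt, stdin_bytes, path = plan_input(b"small diff", "Lint this diff for typos.")
print(path is None)
```

When the file path branch is taken, the run needs permission to read that file, and the caller is responsible for deleting the temporary file afterwards.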
Control the output shape with --output-format:
- text (default): plain text.
- json: structured JSON with the text in result, plus session_id, usage, and total_cost_usd (with a per-model cost breakdown) so callers can track spend per invocation.
- stream-json: newline-delimited JSON events for real-time streaming (use with --verbose, and --include-partial-messages for token deltas).
To force schema-conforming output, add --json-schema with a JSON Schema
definition; the structured result lands in the structured_output field:
claude -p "Extract the main function names from auth.py" \
--output-format json \
--json-schema '{"type":"object","properties":{"functions":{"type":"array","items":{"type":"string"}}},"required":["functions"]}'
Current releases (v2.1.181 through v2.1.190) make schema output more deterministic in headless runs. Treat that as a
reliability improvement, not as a replacement for validation: check that structured_output exists,
validate it against your schema, and keep the raw JSON or stderr as a CI artifact when parsing fails.
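The validation steps above could look roughly like this in a CI script. It is a minimal sketch: it only checks that structured_output exists and contains the required top-level keys, so it is not a full JSON Schema validator, and the function and exception names are hypothetical. The field names structured_output, session_id, and total_cost_usd are the ones described in this section.

```python
import json


class StructuredOutputError(ValueError):
    """Raised when a headless JSON result is missing or malformed."""


def read_structured_output(raw, required_keys):
    """Parse --output-format json text; return (structured_output, session_id, cost).

    Only presence of required top-level keys is checked. Use a real JSON
    Schema validator if you need full schema conformance.
    """
    try:
        result = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise StructuredOutputError(f"result is not valid JSON: {exc}") from exc
    if not isinstance(result, dict):
        raise StructuredOutputError("result is not a JSON object")
    structured = result.get("structured_output")
    if not isinstance(structured, dict):
        raise StructuredOutputError("structured_output is missing or not an object")
    missing = [key for key in required_keys if key not in structured]
    if missing:
        raise StructuredOutputError(f"structured_output lacks required keys: {missing}")
    return structured, result.get("session_id"), result.get("total_cost_usd")


# Synthetic example payload, not real CLI output.
sample = '{"structured_output": {"functions": ["login"]}, "session_id": "abc", "total_cost_usd": 0.01}'
print(read_structured_output(sample, ["functions"]))
```

On StructuredOutputError, a caller would typically write the raw string to an artifact file before failing the job, so the failure can be inspected later.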
Background sessions
Use background sessions when a job is long-running and should be attachable. Pass --bg to start a
session that goes straight to the background and returns immediately, printing the session's short ID
and the commands for managing it:
claude --bg --name "flaky-test-fix" "Investigate SettingsChangeDetector flakes"
claude agents # open agent view (interactive terminal)
claude agents --json # print live sessions as JSON for scripting
claude logs <id> # print a session's recent output
claude attach <id> # attach to a session in this terminal
claude stop <id> # stop a session (also: claude kill)
--name (also -n) sets the display name shown in agent view; without it the name is generated from
the prompt. Manage sessions from the shell with claude agents, claude logs <id>,
claude attach <id>, claude stop <id>, claude respawn <id>, and claude rm <id>. Each session's
short ID is its directory name under ~/.claude/jobs/.
Claude Code v2.1.170 fixed a transcript persistence issue where sessions launched from the VS Code
integrated terminal, or shells inheriting Claude Code environment variables, could fail to appear in
--resume. If a teammate reports missing sessions from those environments, verify they are on
v2.1.170 or later before treating it as user error.
Use claude --bg --exec '<command>' when you want a shell command to appear in agent view as a
PTY-backed job without invoking a model. Its captured output stays in memory (not written to disk) and
cleans up about five minutes after the command exits, so read it before then:
claude --bg --exec 'pytest -x'
Agent view is a research preview
Agent view requires Claude Code v2.1.139 or later (claude agents --cwd requires v2.1.141). Before
editing files, a background session moves into an isolated git worktree under .claude/worktrees/;
set worktree.bgIsolation to "none" (v2.1.143+) to edit the working copy directly. Background
sessions run locally, consume your subscription quota per session, and are preserved across sleep but
stop on machine shutdown.
CI Guardrails
- Run in a clean checkout, and start with --bare so the run does not pick up machine-local config.
- Use least-privilege credentials. For unattended CI, generate a long-lived token with claude setup-token instead of relying on interactive OAuth.
- Scope tools explicitly. --allowedTools lists tools that run without prompting (using permission rule syntax, e.g. "Bash(git diff *)"), and --tools restricts which built-in tools exist at all.
- For a locked-down baseline, pass --permission-mode dontAsk, which denies anything not in your permissions.allow rules or the read-only command set. acceptEdits auto-approves file writes plus common filesystem commands but still aborts on other shell or network calls unless allowed. Avoid bypassPermissions (--dangerously-skip-permissions) in CI unless the environment is fully trusted.
- Bound the run with --max-turns N (exits with an error at the limit) and --max-budget-usd N (stops spending past the cap). Both are print-mode only.
- Block production writes unless explicitly approved.
- Store generated patches as artifacts.
- Require human review before merge.
- Fail closed when configuration cannot be loaded.
- Pin the model for reproducible automation instead of relying on moving aliases, and consider --fallback-model so a retired or overloaded default does not break the pipeline.
- Set CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 in locked-down environments where nonessential network traffic is not allowed. This is equivalent to setting DISABLE_AUTOUPDATER, DISABLE_BUG_COMMAND, DISABLE_ERROR_REPORTING, and DISABLE_TELEMETRY together.
- For sandboxed CI, manage the newer sandbox.credentials policy alongside sandbox.failIfUnavailable and network allow/deny rules. Current releases (v2.1.181 through v2.1.190) also improve destructive-command approval and native Windows PowerShell sandbox behavior, so test both Bash and PowerShell paths when Windows runners are in scope.
- On self-hosted runners, use the v2.1.169 post-session lifecycle hook when you need to snapshot uncommitted work or export logs after a Claude session ends but before the workspace is deleted. The child-process SIGTERM-to-SIGKILL window is configurable; the default remains five seconds.
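Several of the guardrails above can be combined in one place by building the argument list in code. The sketch below uses only flags named in this section; build_ci_command is a hypothetical helper, and joining multiple permission rules with commas into a single --allowedTools value is an assumption you should confirm against your CLI version.

```python
def build_ci_command(prompt, allowed_tools, max_turns, max_budget_usd):
    """Assemble an argv list for a locked-down headless run.

    Assumption: multiple permission rules are joined with commas into one
    --allowedTools value; confirm the accepted form for your CLI version.
    """
    if not allowed_tools:
        raise ValueError("list at least one allowed tool explicitly")
    if max_turns < 1 or max_budget_usd <= 0:
        raise ValueError("max_turns and max_budget_usd must be positive")
    return [
        "claude", "--bare", "-p", prompt,
        "--allowedTools", ",".join(allowed_tools),
        "--permission-mode", "dontAsk",
        "--max-turns", str(max_turns),
        "--max-budget-usd", str(max_budget_usd),
        "--output-format", "json",
    ]


cmd = build_ci_command("Review the piped diff.", ["Read", "Bash(git diff *)"], 5, 2.5)
print(cmd)
```

Passing the result to subprocess.run as a list (rather than a shell string) avoids quoting problems with rules such as Bash(git diff *).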
When streaming, the system/init event reports the model, tools, MCP servers, and loaded plugins; its
plugin_errors field lets CI fail when a plugin did not load. A system/api_retry event is emitted
before a retryable request is retried, so you can surface retry progress.
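A CI step could scan the stream for the init event and fail when plugins did not load. This is a sketch, assuming each stream-json line is a JSON object in which the system/init event carries "type": "system" and "subtype": "init"; that key layout and the helper name are assumptions, so adjust them to what your CLI version actually emits.

```python
import json


def plugin_errors_from_stream(lines):
    """Return plugin_errors from the first system/init event, or None if absent.

    Assumption: the init event is a JSON object with "type" == "system" and
    "subtype" == "init". None means no init event was seen, which a caller
    can treat as a failure in order to fail closed.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON noise instead of crashing the check
        if (
            isinstance(event, dict)
            and event.get("type") == "system"
            and event.get("subtype") == "init"
        ):
            return list(event.get("plugin_errors") or [])
    return None


# Synthetic events for illustration, not captured CLI output.
stream = ['{"type": "system", "subtype": "init", "plugin_errors": []}', '{"type": "result"}']
print(plugin_errors_from_stream(stream))
```

Distinguishing an empty list from None lets the job tell "plugins loaded cleanly" apart from "the init event never arrived".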
Long-running MCP tools in headless jobs should set CLAUDE_CODE_MCP_TOOL_IDLE_TIMEOUT deliberately
instead of relying on the default. Recent releases also reduced misleading MCP status reporting, but
automation should still judge MCP health from command exit codes and logs rather than the interactive
/mcp view.
Run And Verify Skills
Current Claude Code includes the /run and /verify bundled skills, plus /run-skill-generator, for
app-level checks. /run and /verify infer the launch from your project type (CLI, server, TUI,
browser-driven) and from your README, package.json, or Makefile. They are useful when a change must
be observed in a running app rather than inferred from tests alone.
/run-skill-generator runs once per project (and again when the build or launch process changes): it
gets the app running from a clean environment, captures what worked, and commits it as a per-project
skill under .claude/skills/run-<name>/ so later runs follow the recorded recipe instead of
rediscovering it.
Slash commands are interactive-only
User-invoked skills and built-in commands such as /run, /verify, and /code-review are only
available in interactive mode. In -p (headless) mode they cannot be called as slash commands —
describe the task you want accomplished in the prompt instead. In CI, only rely on these after the
project-specific run skill can build and launch from a clean checkout.
Output Review
Headless output should answer:
- What files or commits were inspected?
- What exact issue was found?
- What evidence supports it?
- What command or test confirms the result?
If the output cannot support those questions, narrow the prompt or add structured reporting.
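Structured reporting can mirror the four questions directly. The schema below is an illustrative sketch: every field name (inspected, issue, evidence, confirm_command) is hypothetical and chosen here only to map one key to each question. The serialized schema is the kind of value you could pass to --json-schema.

```python
import json

# Hypothetical field names: one required key per review question.
FINDING_KEYS = ["inspected", "issue", "evidence", "confirm_command"]

FINDINGS_SCHEMA = {
    "type": "object",
    "properties": {
        "findings": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "inspected": {"type": "array", "items": {"type": "string"}},
                    "issue": {"type": "string"},
                    "evidence": {"type": "string"},
                    "confirm_command": {"type": "string"},
                },
                "required": FINDING_KEYS,
            },
        }
    },
    "required": ["findings"],
}


def unanswered_questions(finding):
    """Return the keys a finding leaves missing or empty."""
    return [key for key in FINDING_KEYS if not finding.get(key)]


# Compact JSON string suitable for a single command-line argument.
print(json.dumps(FINDINGS_SCHEMA, separators=(",", ":")))
```

Checking for empty values as well as missing keys catches findings that satisfy the schema on paper but give a reviewer nothing to verify.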