Ch13. Observability and Deployment

Operate Eve with OpenTelemetry, Workflow tags, deployment checklists, health checks, and production runbooks.

Key Takeaways

Eve observability needs session, turn, step, tool, sandbox, subagent, and model-usage views.
OpenTelemetry, Workflow tags, and runtime hooks are complementary surfaces.
Vercel and self-host deployments both need artifact, health, session, and stream verification.

Durable agents cannot be operated with HTTP request logs alone. You need to see where a run is parked, what step failed, which tool ran, which sandbox backend was used, and how much model usage accumulated.

Three Observability Surfaces

Surface	Location	Use
Workflow run tags	framework emitted	Agent Runs and session tree
OpenTelemetry	`agent/instrumentation.ts`	export spans to observability backend
Runtime hooks	`agent/hooks/**`	audit, metrics, warehouse ingestion

instrumentation.ts

agent/instrumentation.ts

import { defineInstrumentation } from "eve/instrumentation";
import { registerOTel } from "@vercel/otel";

export default defineInstrumentation({
  // Register OpenTelemetry, using the agent name as the service name.
  setup: ({ agentName }) =>
    registerOTel({
      serviceName: agentName,
    }),
  // Explicitly opt out of recording inputs and outputs.
  recordInputs: false,
  recordOutputs: false,
});

The official docs note that inputs and outputs can be recorded by default. In sensitive environments, set `recordInputs` and `recordOutputs` explicitly, and review the exporter destination, retention period, and access controls.

Runtime Context Enrichment

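import { defineInstrumentation } from "eve/instrumentation";

export default defineInstrumentation({
  events: {
    // Return extra runtime context when a step starts.
    "step.started"(input) {
      return {
        runtimeContext: {
          // Falls back to "unknown" when there is no current auth entry or tenant id.
          "tenant.id": input.session.auth.current?.attributes.tenantId ?? "unknown",
          "channel.kind": input.channel.kind,
        },
      };
    },
  },
});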

Keep runtime context free of secrets and PII, and avoid values with unbounded cardinality, such as raw request IDs or free-form text.
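One way to enforce this rule is to pass candidate attributes through an allowlist before returning them as runtime context. The sketch below is a hypothetical helper, not part of Eve: the name `sanitizeRuntimeContext`, the allowed keys, the length cap, and the sensitive-value pattern are all assumptions chosen for illustration.

```typescript
// Hypothetical helper, not part of Eve: filters candidate runtime-context
// attributes before they are attached to telemetry.
type RuntimeContext = Record<string, string>;

// Assumption: only these keys are worth exporting; everything else is dropped.
const ALLOWED_KEYS: ReadonlySet<string> = new Set(["tenant.id", "channel.kind"]);

// Assumption: longer values are probably free-form text, so they are truncated.
const MAX_VALUE_LENGTH = 64;

// Assumption: a rough pattern for values that look like credentials or emails.
const SENSITIVE_PATTERN = /(bearer\s|sk-|password|secret|@)/i;

function sanitizeRuntimeContext(raw: Record<string, unknown>): RuntimeContext {
  const out: RuntimeContext = {};
  for (const [key, value] of Object.entries(raw)) {
    if (!ALLOWED_KEYS.has(key)) continue; // drop unlisted keys entirely
    if (typeof value !== "string" || value.length === 0) {
      out[key] = "unknown"; // keep the key so every span has the same shape
    } else if (SENSITIVE_PATTERN.test(value)) {
      out[key] = "redacted";
    } else {
      out[key] = value.slice(0, MAX_VALUE_LENGTH);
    }
  }
  return out;
}
```

An allowlist fails safe: a new attribute added elsewhere in the code is not exported until someone adds its key deliberately.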

Workflow Tags

Eve emits reserved $eve.* attributes for workflow runs.

Tag	Meaning
`$eve.type`	session, turn, subagent
`$eve.parent`	immediate parent session
`$eve.root`	root session
`$eve.subagent`	subagent node id
`$eve.trigger`	channel kind
`$eve.title`	first-message derived title
`$eve.model`	turn model id
`$eve.input_tokens`	cumulative input tokens
`$eve.output_tokens`	cumulative output tokens
`$eve.tool_count`	tool count

These tags are useful for browsing runs, but they should not be the only audit source. Use hooks when an audit ledger is mandatory.
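To show how the tags could feed a cost view, the sketch below rolls up token and tool counts per root session. It is illustrative only: the `TaggedRun` shape is hypothetical, tag values are assumed to arrive as strings, and only runs whose `$eve.type` is `turn` are summed, on the assumption that token tags on turn runs do not overlap with each other.

```typescript
// Hypothetical record shape: one workflow run with its tags as string values.
interface TaggedRun {
  tags: Record<string, string | undefined>;
}

interface UsageTotals {
  inputTokens: number;
  outputTokens: number;
  toolCount: number;
  turns: number;
}

// Parse a tag value as a non-negative number; anything else counts as 0.
function toCount(value: string | undefined): number {
  const n = Number(value);
  return Number.isFinite(n) && n >= 0 ? n : 0;
}

function usageByRootSession(runs: TaggedRun[]): Map<string, UsageTotals> {
  const totals = new Map<string, UsageTotals>();
  for (const { tags } of runs) {
    // Assumption: summing only turn runs avoids double counting.
    if (tags["$eve.type"] !== "turn") continue;
    const root = tags["$eve.root"] ?? "unknown";
    const current =
      totals.get(root) ?? { inputTokens: 0, outputTokens: 0, toolCount: 0, turns: 0 };
    current.inputTokens += toCount(tags["$eve.input_tokens"]);
    current.outputTokens += toCount(tags["$eve.output_tokens"]);
    current.toolCount += toCount(tags["$eve.tool_count"]);
    current.turns += 1;
    totals.set(root, current);
  }
  return totals;
}
```

A rollup like this suits dashboards. It is not a substitute for a hook-based ledger, because it depends on whatever tags happen to be present on each run.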

Vercel Deployment

eve build
vercel deploy

On Vercel, Eve emits Vercel Build Output, Workflow runs on Vercel Workflow, and defaultBackend() selects Vercel Sandbox.

Smoke checks:

curl https://<deployment>/eve/v1/health
curl -X POST https://<deployment>/eve/v1/session \
  -H 'content-type: application/json' \
  -d '{"message":"Hello from production"}'
curl https://<deployment>/eve/v1/session/<sessionId>/stream
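The same three checks can be scripted so that a deploy pipeline fails on a non-2xx response. The sketch below only builds the request descriptions and runs them through an injected `send` function. All names here (`SmokeRequest`, `healthAndSessionChecks`, `streamCheck`, `runChecks`) are hypothetical. The session id is passed in by the caller because the shape of the session response is not described here.

```typescript
// Hypothetical description of one smoke-check request.
interface SmokeRequest {
  name: string;
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Join a base URL and a path without producing a double slash.
function joinUrl(baseUrl: string, path: string): string {
  return baseUrl.replace(/\/+$/, "") + path;
}

function healthAndSessionChecks(baseUrl: string): SmokeRequest[] {
  return [
    { name: "health", method: "GET", url: joinUrl(baseUrl, "/eve/v1/health"), headers: {} },
    {
      name: "session",
      method: "POST",
      url: joinUrl(baseUrl, "/eve/v1/session"),
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ message: "Hello from production" }),
    },
  ];
}

function streamCheck(baseUrl: string, sessionId: string): SmokeRequest {
  const path = `/eve/v1/session/${encodeURIComponent(sessionId)}/stream`;
  return { name: "stream", method: "GET", url: joinUrl(baseUrl, path), headers: {} };
}

// The transport is injected so the runner can be exercised without a network.
type Send = (request: SmokeRequest) => Promise<{ status: number }>;

// Returns one failure message per check that threw or returned a non-2xx status.
async function runChecks(requests: SmokeRequest[], send: Send): Promise<string[]> {
  const failures: string[] = [];
  for (const request of requests) {
    try {
      const { status } = await send(request);
      if (status < 200 || status >= 300) failures.push(`${request.name}: HTTP ${status}`);
    } catch (error) {
      failures.push(`${request.name}: ${String(error)}`);
    }
  }
  return failures;
}
```

In a real pipeline, `send` would wrap an HTTP client and the script would exit non-zero when `runChecks` returns any failures.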

Self-host Deployment

eve build
PORT=3000 eve start --host 0.0.0.0

Concern	Standard
workflow state	persistent `.workflow-data`
model auth	AI Gateway key or direct provider key
route auth	replace Vercel OIDC with host-valid auth
sandbox	Docker, microsandbox, or custom backend
schedules	ensure Nitro scheduled tasks run
logs	process manager and log collector
TLS/routing	reverse proxy or platform

Build Artifact Review

Artifact	Check
`.eve/diagnostics.json`	no unexpected warnings/errors
`agent-discovery-manifest.json`	expected files only
`compiled-agent-manifest.json`	tools, connections, channels, schedules, subagents
`module-map.mjs`	compiled module resolution
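Checks such as "expected files only" amount to comparing a reviewed list against what the build produced. The sketch below shows that comparison on plain string lists. `diffAgainstExpected` is a hypothetical helper, and extracting the names from each manifest is left to the caller because the manifest schemas are not described here.

```typescript
interface ManifestDiff {
  missing: string[]; // expected but absent from the build output
  unexpected: string[]; // present in the build output but never reviewed
}

// Compare a reviewed list of names with the names found in a build artifact.
function diffAgainstExpected(
  expected: readonly string[],
  actual: readonly string[],
): ManifestDiff {
  const expectedSet = new Set(expected);
  const actualSet = new Set(actual);
  return {
    // Sets remove duplicates; sorting keeps the output stable for review.
    missing: Array.from(expectedSet).filter((name) => !actualSet.has(name)).sort(),
    unexpected: Array.from(actualSet).filter((name) => !expectedSet.has(name)).sort(),
  };
}
```

Treating any non-empty `unexpected` list as a release blocker catches files or tools that were added without review.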

Runtime Runbook

Symptom	Check
production 401	route auth and placeholder removal
stuck in `session.waiting`	approval, question, or OAuth pending
tool not visible	`eve info`, dynamic resolver event, disabled default
missing subagent result	child stream and parent proxy input request
sandbox command failed	backend, network policy, bootstrap
cost spike	token tags, tool count, compaction
trace missing	instrumentation setup and exporter

Production SLO Candidates

SLO	Measurement
session start success	`POST /session` 2xx ratio
turn completion	`turn.completed` count / turns started
no failed step	`step.failed` rate
approval latency	`input.requested` to answer
model latency	`step.started` to `step.completed`
cost per task	tokens + tool infrastructure
sandbox setup latency	first sandbox use
eval pass rate	CI and scheduled evals
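As a sketch of how two of these measurements could be computed, the code below derives model latency and step failure rate from a list of step events. The `StepEvent` shape (with `stepId` and `atMs` fields) is an assumption for illustration, and events are assumed to be ordered so that each `step.started` precedes its matching `step.completed`.

```typescript
// Hypothetical event shape; only the event names come from the table above.
interface StepEvent {
  type: "step.started" | "step.completed" | "step.failed";
  stepId: string;
  atMs: number; // timestamp in milliseconds
}

// Latency of every step that has both a start and a completion event.
function stepLatenciesMs(events: StepEvent[]): number[] {
  const startedAt = new Map<string, number>();
  const latencies: number[] = [];
  for (const event of events) {
    if (event.type === "step.started") {
      startedAt.set(event.stepId, event.atMs);
    } else if (event.type === "step.completed") {
      const start = startedAt.get(event.stepId);
      if (start !== undefined) latencies.push(event.atMs - start);
    }
  }
  return latencies;
}

// Nearest-rank percentile; returns undefined for an empty sample.
function percentile(values: number[], p: number): number | undefined {
  if (values.length === 0) return undefined;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[Math.min(rank, sorted.length) - 1];
}

// Failed steps divided by started steps; 0 when nothing has started.
function stepFailureRate(events: StepEvent[]): number {
  let started = 0;
  let failed = 0;
  for (const event of events) {
    if (event.type === "step.started") started += 1;
    if (event.type === "step.failed") failed += 1;
  }
  return started === 0 ? 0 : failed / started;
}
```

Percentiles are usually a better SLO target than averages here, since a few slow model calls can hide behind a healthy mean.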

Deployment Checklist

Gate	Standard
build	`eve build` succeeds
diagnostics	`.eve/diagnostics.json` clean
auth	production route auth fail-closed
secrets	not present in artifacts or workspace
sandbox	backend and network policy explicit
eval	`eve eval --strict` passes
smoke	health, session, and stream checks pass
observability	OTel or hook audit works
rollback	model/prompt/tool rollback documented
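The checklist works best when it fails closed: a gate that was never evaluated should block the release just like a gate that failed. A minimal sketch of that rule follows. The gate names come from the table above, and `blockedGates` and `canDeploy` are hypothetical helpers.

```typescript
// Gate names taken from the checklist table.
const GATES = [
  "build",
  "diagnostics",
  "auth",
  "secrets",
  "sandbox",
  "eval",
  "smoke",
  "observability",
  "rollback",
] as const;

type Gate = (typeof GATES)[number];

// A gate may be absent if its check never ran.
type GateResults = Partial<Record<Gate, boolean>>;

// Fail closed: anything other than an explicit `true` blocks the release.
function blockedGates(results: GateResults): Gate[] {
  return GATES.filter((gate) => results[gate] !== true);
}

function canDeploy(results: GateResults): boolean {
  return blockedGates(results).length === 0;
}
```

For example, a pipeline that skipped the eval step produces a result with no `eval` entry, and `canDeploy` returns `false` instead of silently passing.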
