Verification Report

Link, consistency, source, and static validation report for the Harness Engineering handbook.

This document records structure, source, cross-link, and handbook app validation for the English Harness Engineering handbook.

Verification Baseline

2026-05-23

Scope

Item	Standard
Document structure	`meta.json` matches actual MDX files
Content consistency	Core claims and chapter flow do not conflict
External evidence	OpenAI, Anthropic, Toss, gstack, and revfactory/harness claims are checked
Cross-links	Related handbook links are valid
App validation	handbook registry, typecheck, and build pass

Method

Compared `apps/handbook/content/books/en/harness-engineering/meta.json` with the MDX file list.
Compared chapter claims against the source material.
Checked related links to LLMOps/AgentOps, Codex, Claude Code, orchestration, and documentation books.
Used OpenAI developer docs MCP and official OpenAI sources for OpenAI items.
Cross-checked Anthropic engineering/news sources and Claude Code / Managed Agents official docs.
Checked gstack and revfactory/harness README state and GitHub metadata during the Korean baseline update.
Ran static validation.
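
The first step above, comparing `meta.json` with the MDX file list, could be scripted roughly as follows. This is a minimal sketch, not the check the handbook app actually runs: it assumes `meta.json` carries a top-level `pages` list of slugs and that each page is a flat `<slug>.mdx` file in the same directory. The function names `compare_pages` and `check_book` are hypothetical.

```python
import json
from pathlib import Path


def compare_pages(declared, mdx_files):
    """Return (missing, orphaned) between declared page slugs and MDX files.

    missing:  slugs that are declared but have no matching .mdx file.
    orphaned: .mdx files that are not declared.
    """
    declared_set = set(declared)
    # Drop the .mdx suffix so file names can be compared with slugs.
    actual_set = {Path(name).stem for name in mdx_files if name.endswith(".mdx")}
    missing = sorted(declared_set - actual_set)
    orphaned = sorted(actual_set - declared_set)
    return missing, orphaned


def check_book(book_dir):
    """Load meta.json from book_dir and compare it with the .mdx files there."""
    book = Path(book_dir)
    meta = json.loads((book / "meta.json").read_text(encoding="utf-8"))
    # Assumption: the page order lives in a top-level "pages" list of strings.
    declared = [p for p in meta.get("pages", []) if isinstance(p, str)]
    mdx_files = [f.name for f in book.glob("*.mdx")]
    return compare_pages(declared, mdx_files)
```

Calling `check_book("apps/handbook/content/books/en/harness-engineering")` would return two empty lists when the declared pages and the files are aligned. If the real `meta.json` uses separators, nested folders, or a different key, the parsing step needs to change accordingly.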

Result Summary

Item	Result
`meta.json` and MDX files	23 pages aligned
Structure flow	Pass
External evidence connection	Pass
Cross-links	Pass
`pnpm --filter handbook run check:books-registry`	Pass
`pnpm --filter handbook run typecheck`	Pass
`pnpm --filter handbook run build`	Pass
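
The three command rows in the table can be reproduced with a small runner like the one below. It is a sketch under stated assumptions: the commands are copied from the table, while the `run_checks` helper, the injectable `runner` parameter, and the rule that exit code 0 means "Pass" are illustrative choices, not part of the handbook tooling.

```python
import subprocess

# The three static validation commands listed in the Result Summary table.
CHECKS = [
    ["pnpm", "--filter", "handbook", "run", "check:books-registry"],
    ["pnpm", "--filter", "handbook", "run", "typecheck"],
    ["pnpm", "--filter", "handbook", "run", "build"],
]


def run_checks(commands, runner=subprocess.run):
    """Run each command in order and return (command string, "Pass" or "Fail") pairs.

    runner is injectable so the function can be exercised without pnpm installed;
    it must return an object with a returncode attribute.
    """
    results = []
    for cmd in commands:
        completed = runner(cmd)
        # Assumption: a zero exit code is treated as a passing check.
        status = "Pass" if completed.returncode == 0 else "Fail"
        results.append((" ".join(cmd), status))
    return results
```

Running `run_checks(CHECKS)` from the repository root executes all three commands even if an earlier one fails, so a single run reports every failing check rather than stopping at the first.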

Core Sources

Source	Date	Use in this book
OpenAI, Harness Engineering	2026-02-11	agent-readable repo, short AGENTS.md, structured docs, observability, garbage collection
OpenAI, The next evolution of the Agents SDK	2026-04-15	model-native harness, native sandbox execution, MCP/skills/AGENTS.md/shell/apply_patch primitives
OpenAI API Changelog	2026-05-06 / 2026-05-19	TypeScript sandbox agents, open-source harness, Secure MCP Tunnel
OpenAI Developers plugin for Codex	2026-05-07	OpenAI Platform access, API key setup, troubleshooting as plugin surface
OpenAI, Work with Codex from anywhere	2026-05-14	mobile/remote connection, approvals, hooks, enterprise environment
OpenAI Agents SDK / Sandbox / Codex docs	Read baseline 2026-05-23	sandbox capability, hooks lifecycle, remote connections
Anthropic, Harness design for long-running application development	2026-03-24	planner/generator/evaluator and load-bearing scaffolding
Anthropic, Claude Code auto mode	2026-03-25	prompt-injection probe, transcript classifier, trust boundary, denial fallback
Claude Code permission / auto mode docs	Read baseline 2026-05-23	permission modes, protected paths, classifier order, trusted infrastructure
Anthropic, Scaling Managed Agents	2026-04-08	session/harness/sandbox split, durable event log, credential vault, MCP proxy
Claude Managed Agents overview / MCP connector docs	Read baseline 2026-05-23	agents, environments, sessions, events, MCP auth/vault separation
Anthropic, Agents for financial services	2026-05-05	domain templates, skills/connectors/subagents, per-tool permissions, audit log
Toss harness article	2026-02-26	frictionless harness, executable SSOT, domain layer, HITL
gstack README	Read baseline 2026-05-23	specialists, power tools, agent hosts, team mode, QA, checkpoint, learning
revfactory/harness README	Read baseline 2026-05-23	L3 Meta-Factory, Team-Architecture Factory, architecture patterns, A/B caveat

Synthesized Claims

Claim	Evidence basis
Harnesses are work-system design, not prompt tricks	OpenAI repo/observability + Anthropic evaluation + Toss system rollout
Generic harnesses are starting points	Toss domain layers + gstack make-it-yours posture + revfactory domain teams
Operations and cleanup are part of the harness	OpenAI entropy and doc gardening view
Harness primitives are being productized, but domain design remains	OpenAI primitives + gstack/revfactory domain workflows
Auto approval is a policy layer, not a human-review replacement	Anthropic auto mode classifier/trust-boundary structure
Long-running agent runtime should use durable session logs	Anthropic Managed Agents session/harness/sandbox split

Limitations

Scope: This report reflects the 2026-05-23 source baseline. External tools and docs can change quickly; interpretation changes should be recorded in `updates.mdx`.
