Domain Playbooks

Translate harness principles into playbooks for frontend, platform, payments, and AI product teams.

Key takeaways

This chapter translates the five external cases into domain playbooks for frontend, platform, payments, and AI product teams.
Each domain has a different load-bearing element: browser QA for frontend, invariants for platform, audit trails for payments, eval sets for AI product.
Each team must answer the same questions differently: what counts as done, key verification, human approval zones, and long-lived artifacts.
Keep common foundations (entry doc, default verification commands, update logs, approval vocabulary) shared across teams.
A good team harness is not one universal template; it is a common foundation with domain playbooks layered on top.

After studying external examples, the practical question is:

What should our team design differently?

This chapter translates the OpenAI, Anthropic, Toss, gstack, and revfactory cases into domain-specific harness designs.

Observation-based model

These scenarios are not direct copies of one company's documents. They are application models reconstructed from the five cases and each domain's failure modes.

Domain Differences

Team type	Biggest risk	Load-bearing harness element	Start page
Frontend	"The code is correct, but the screen is broken"	Browser QA, design rules, accessibility gate	`scenario-frontend-team`
Platform	Shared module or release-rule violation	Invariants, impact analysis, release gate	`scenario-platform-team`
Payments / settlement	Money, correctness, audit failure	Approval, reconciliation, audit trail	`scenario-payments-team`
AI product	Model drift and missing evaluation	Eval set, safety policy, canary loop	`scenario-ai-product-team`

What to Take from Each Case

Frontend teams need OpenAI-style observability and Anthropic-style QA separation.
Platform teams need OpenAI-style repo readability and Toss-style global/domain layers.
Payments teams need Toss-style human-in-the-loop (HITL) approval and operating gates.
AI product teams need Anthropic-style evaluation and revfactory-style domain-first design.

Anthropic's 2026 financial agent templates also show the domain view in packaged form. A domain template is not just a prompt bundle. It can include skills, connectors, subagents, per-tool permissions, credential vaults, audit logs, and approval flows.

Questions Each Team Must Answer

Question	Frontend	Platform	Payments	AI product
What is done?	Screen and interaction	Shared invariants	Correctness and audit	Offline and online eval pass
Key verification	Browser, a11y	Contract, release gate	Reconciliation, approval	Eval suite, telemetry
Human approval	User-impacting UI	Shared modules, release rules	Most risky changes	Model, policy, tool permission
Long-lived artifact	QA report, screenshots	ADR, invariants, release notes	Audit log, rollback plan	Eval report, prompt spec
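
The four questions above can be written down as a small typed record, so that a team cannot ship a playbook with a question left blank. The following is a minimal sketch, not a prescribed format: the `DomainPlaybook` class, its field names, and the `unanswered` helper are hypothetical names chosen for illustration, and the values are copied from the table.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DomainPlaybook:
    """One column of the table: a team's answers to the four shared questions."""
    done: str
    key_verification: tuple[str, ...]
    human_approval: str
    long_lived_artifacts: tuple[str, ...]


# Values taken from the table above; the dictionary keys are illustrative.
PLAYBOOKS = {
    "frontend": DomainPlaybook(
        done="Screen and interaction",
        key_verification=("Browser", "a11y"),
        human_approval="User-impacting UI",
        long_lived_artifacts=("QA report", "screenshots"),
    ),
    "platform": DomainPlaybook(
        done="Shared invariants",
        key_verification=("Contract", "release gate"),
        human_approval="Shared modules, release rules",
        long_lived_artifacts=("ADR", "invariants", "release notes"),
    ),
    "payments": DomainPlaybook(
        done="Correctness and audit",
        key_verification=("Reconciliation", "approval"),
        human_approval="Most risky changes",
        long_lived_artifacts=("Audit log", "rollback plan"),
    ),
    "ai_product": DomainPlaybook(
        done="Offline and online eval pass",
        key_verification=("Eval suite", "telemetry"),
        human_approval="Model, policy, tool permission",
        long_lived_artifacts=("Eval report", "prompt spec"),
    ),
}


def unanswered(playbook: DomainPlaybook) -> list[str]:
    """Return the names of questions whose answer is empty."""
    return [f.name for f in fields(playbook) if not getattr(playbook, f.name)]
```

A check such as `unanswered(PLAYBOOKS["payments"])` could run as part of a team's default verification commands, so a missing answer is caught before the playbook is adopted.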

Common vs Domain-Specific

Keep common:

AGENTS.md or equivalent entry doc;
default verification commands;
update and verification logs;
shared approval vocabulary.

Split by domain:

definition of done;
evaluator and QA shape;
human gate conditions;
operating metrics;
connector, MCP, and credential-vault boundaries;
whether domain templates ship as plugin, skill, or cookbook.
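
The split above can be sketched as two layers that are merged into one harness configuration. This is an illustrative sketch under stated assumptions: the key names, the example values marked as hypothetical, and the `build_harness` function are not taken from any of the five cases. The rule that a domain playbook may not override a common key is also an assumption, made here so that the shared foundation stays identical across teams.

```python
# Common foundation: the four shared items from the "Keep common" list.
COMMON_FOUNDATION = {
    "entry_doc": "AGENTS.md",
    "verification_commands": ["make test"],              # hypothetical command
    "log_path": "docs/updates.md",                       # hypothetical path
    "approval_vocabulary": ["auto", "review", "block"],  # hypothetical terms
}

# Domain layer: the six items from the "Split by domain" list.
PAYMENTS_PLAYBOOK = {
    "done_definition": "Correctness and audit",
    "evaluator": "Reconciliation, approval",
    "human_gates": "Most risky changes",
    "operating_metrics": ["reconciliation_mismatches"],  # hypothetical metric
    "connector_boundaries": ["ledger-read-only"],        # hypothetical boundary
    "packaging": "skill",                                # plugin, skill, or cookbook
}


def build_harness(common: dict, domain: dict) -> dict:
    """Layer a domain playbook on top of the common foundation.

    Raises ValueError if the domain layer tries to redefine a common key.
    """
    overlap = sorted(common.keys() & domain.keys())
    if overlap:
        raise ValueError(f"domain playbook overrides common keys: {overlap}")
    return {**common, **domain}


payments_harness = build_harness(COMMON_FOUNDATION, PAYMENTS_PLAYBOOK)
```

Rejecting overlaps is one possible design choice. A team that wants domain-specific verification commands could instead extend the common list rather than replace it.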

Conclusion

Harnesses differ not only by company, but by domain. A good team harness is not one universal template. It is a common foundation with domain playbooks on top.
