Verification Report

Structure, link, metric, and logic verification for LLMOps and AgentOps in Production

Verification Baseline

2026-05-18 (English edition translated from the Korean 4th verification baseline dated 2026-05-17)

Verification Scope

Page structure: meta.json declarations match actual files.
Formula and metric consistency: Unit cost, Error budget, Burn rate, SLI/SLO.
Cross-chapter flow: evaluation to release to observability to incident response.
External reference link reachability.

History Archive

The 2026-03-13 and 2026-03-26 verification records have been moved to the Verification Archive. This report covers only the latest operating baseline and the 4th verification.

Structure Verification

Item	Result
meta.json pages	12
MDX file count	12
Internal link errors	0
Missing/duplicate chapters	0
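
The structure check above can be reproduced with a short script. The following is a minimal sketch, not the tooling actually used for this report. It assumes that meta.json holds a top-level "pages" array of page slugs and that each page is an .mdx file named after its slug in the same directory. The function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def compare_structure(declared, files):
    """Compare declared page slugs against the slugs of files on disk."""
    counts = Counter(declared)
    declared_set = set(declared)
    file_set = set(files)
    return {
        # Declared in meta.json but no matching file exists.
        "missing": sorted(declared_set - file_set),
        # File exists but meta.json does not declare it.
        "undeclared": sorted(file_set - declared_set),
        # Slug listed more than once in meta.json.
        "duplicates": sorted(s for s, n in counts.items() if n > 1),
    }


def check_directory(docs_dir):
    """Load meta.json and the *.mdx files from docs_dir, then compare them."""
    root = Path(docs_dir)
    meta = json.loads((root / "meta.json").read_text(encoding="utf-8"))
    declared = meta["pages"]  # assumed key; adjust to the real schema
    files = [p.stem for p in root.glob("*.mdx")]
    return compare_structure(declared, files)
```

A clean result, matching the table above, is three empty lists. Keeping the comparison separate from file loading lets the same logic run against a fixture in CI without touching the file system.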

Logic Consistency

Check	Standard	Result
Cost formula consistency	Unit cost definition is consistent between Index and Ch6	Pass
Gate linkage	Ch3 evaluation criteria feed into Ch2 release gates	Pass
SLO to incident linkage	Ch5 Error budget links to Ch8 incident classification	Pass
Experiment safety	Ch7 decision formula does not conflict with Ch4 guardrails	Pass
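
For readers who want to spot-check the metric definitions behind these results, the sketch below uses the standard SRE definitions: the error budget is 1 minus the SLO target, and the burn rate is the observed error rate divided by that budget. The unit cost definition (total cost divided by successful tasks) is an assumption made for illustration. The definition in the Index and Ch6 is authoritative.

```python
def error_budget(slo_target):
    """Allowed failure fraction for an SLO target such as 0.995."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def burn_rate(bad_events, total_events, slo_target):
    """Observed error rate divided by the error budget.

    A value of 1.0 consumes the budget exactly over the SLO window;
    values above 1.0 exhaust it early.
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return (bad_events / total_events) / error_budget(slo_target)


def unit_cost(total_cost, successful_tasks):
    """Cost per successful task (assumed definition, see lead-in)."""
    if successful_tasks <= 0:
        raise ValueError("successful_tasks must be positive")
    return total_cost / successful_tasks
```

As a worked case, 10 failed requests out of 1,000 against a 99.5% SLO give an error rate of 1% against a 0.5% budget, so the burn rate is roughly 2: the budget would be gone in about half the SLO window.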

Counterexample Scenarios

Scenario	Expected Behavior	Result
Model upgrade improves quality by +2% but raises cost by +20%	Cost gate holds release	Pass
Latency is normal but policy violation rate rises	Safety gate blocks first	Pass
SLO passes overall but a specific tenant fails more often	Tenant-segment metrics detect anomaly	Pass
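
The three scenarios can be expressed as executable checks. The sketch below is illustrative only: the thresholds, field names, and return labels are hypothetical and are not taken from the chapters. It shows the intended ordering (the safety gate is evaluated before the cost gate) and how a per-tenant breakdown surfaces a failure that the aggregate SLI hides.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only to illustrate the scenarios.
MAX_VIOLATION_DELTA = 0.0    # any rise in policy violation rate blocks
MAX_COST_PER_QUALITY = 5.0   # allowed cost increase per unit of quality gain


@dataclass
class Candidate:
    quality_delta: float    # relative change vs. baseline, 0.02 = +2%
    cost_delta: float       # relative change in unit cost, 0.20 = +20%
    violation_delta: float  # absolute change in policy violation rate


def evaluate_release(c):
    """Apply the safety gate first, then the cost gate."""
    if c.violation_delta > MAX_VIOLATION_DELTA:
        return "block:safety"
    if c.cost_delta > 0:
        no_gain = c.quality_delta <= 0
        if no_gain or c.cost_delta / c.quality_delta > MAX_COST_PER_QUALITY:
            return "hold:cost"
    return "release"


def overall_sli(per_tenant):
    """Aggregate success ratio; per_tenant maps tenant -> (good, total)."""
    good = sum(g for g, _ in per_tenant.values())
    total = sum(t for _, t in per_tenant.values())
    return good / total


def tenants_breaching(per_tenant, slo_target):
    """Tenants whose own success ratio falls below the SLO target."""
    return sorted(
        tenant
        for tenant, (good, total) in per_tenant.items()
        if total > 0 and good / total < slo_target
    )
```

With these assumed thresholds, a +2% quality gain at +20% cost yields a ratio of 10 and is held by the cost gate. A small tenant with 180 good requests out of 200 breaches a 99.5% target even when a large healthy tenant keeps the aggregate above it.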

External Link Check

Category	Link	Status
Google SRE Workbook	https://sre.google/workbook/table-of-contents/	Checked
OpenAI Agents SDK	https://developers.openai.com/api/docs/guides/agents	Checked
OpenAI Pricing	https://openai.com/api/pricing/	Checked
MCP Specification	https://modelcontextprotocol.io/specification/2025-11-25	Checked
A2A Specification	https://a2a-protocol.org/latest/specification/	Checked
OpenTelemetry GenAI	https://opentelemetry.io/docs/specs/semconv/gen-ai/	Checked
OWASP AOS	https://aos.owasp.org/aos/	Checked
OWASP MCP Top 10	https://owasp.org/www-project-mcp-top-10/	Checked
OWASP Agentic Skills Top 10	https://owasp.org/www-project-agentic-skills-top-10/	Checked
Claude Guardrails	https://docs.claude.com/en/docs/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals	Checked
Anthropic Pricing	https://platform.claude.com/docs/en/about-claude/pricing	Checked
DeepSeek Pricing	https://api-docs.deepseek.com/quick_start/pricing/	Checked
LangSmith Fleet	https://www.langchain.com/blog/introducing-langsmith-fleet	Checked
Braintrust Loop	https://www.braintrust.dev/docs/loop	Checked
PagerDuty AI Ecosystem	https://www.pagerduty.com/newsroom/pagerduty-expands-ai-ecosystem-to-supercharge-ai-agents/	Checked
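
Link reachability can be rechecked before each release with a small script. The following is a sketch under stated assumptions, not the checker used for this report. It sends a HEAD request with the Python standard library and treats any 2xx or 3xx response as reachable. The "Failed" label and the function names are hypothetical, since the table above records only "Checked".

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    request = urllib.request.Request(
        url,
        method="HEAD",
        headers={"User-Agent": "link-check-sketch"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx or 5xx status.
        return exc.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar errors.
        return None


def check_links(links, fetch=fetch_status):
    """Map each (name, url) pair to 'Checked' or 'Failed'."""
    results = {}
    for name, url in links:
        status = fetch(url)
        reachable = status is not None and 200 <= status < 400
        results[name] = "Checked" if reachable else "Failed"
    return results
```

The fetch parameter is injectable so the classification logic can be tested without network access. Some servers reject HEAD requests, so a "Failed" result is a prompt for a manual check rather than proof that a link is dead.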

4th Verification Details

Freshness Corrections

Item	Verification	Result
A2A latest	Official specification identifies latest released version as v1.0.0	Corrected in text
MCP 2025-11-25	OAuth 2.1, Protected Resource Metadata, Client ID Metadata Documents, token audience binding, token passthrough prohibition verified	Ch1 expanded
OpenAI pricing	GPT-5.5, GPT-5.4, GPT-5.4 mini pricing verified. Removed older GPT-5.4 nano framing	Ch6 corrected
Anthropic pricing	Opus 4.7/4.6/4.5, Sonnet 4.6/4.5, Haiku 4.5 pricing and prompt caching multiplier verified	Ch6 corrected
DeepSeek pricing	Current official pricing centers on DeepSeek V4 Flash/Pro. Removed V3.2-centered pricing table	Ch6 corrected
OpenTelemetry GenAI	GenAI semantic conventions status verified as Development	Ch5 corrected
OWASP AOS	AOS verified as work-in-progress public project	Ch5 corrected
OpenAI Agents SDK	Guardrails, human review, resumable state, MCP, tracing, agent evals, and voice-agent operating surfaces verified	Ch3-Ch5 expanded
OWASP MCP/Skills	MCP Top 10 and Agentic Skills Top 10 controls for supply chain, permissions, and telemetry verified	Ch4/Ch8 expanded
Link status	LangSmith Fleet, Braintrust Loop, and PagerDuty source links replaced with current official URLs	Updates corrected

4th Verification Sources

Source	Checked Area
OpenAI API Pricing	GPT-5.5/GPT-5.4/GPT-5.4 mini, Batch, tool/container pricing
OpenAI Agents SDK docs	guardrails, human review, MCP, tracing, agent evals, voice agents
Model Context Protocol	current specification 2025-11-25, authorization security
A2A Protocol	latest v1.0.0, task/streaming/push notification/security considerations
OpenTelemetry	GenAI semantic conventions Development status
OWASP	AOS, MCP Top 10, Agentic Skills Top 10
Anthropic Claude docs	model pricing, prompt caching, long context pricing
Claude guardrails docs	streaming refusal handling
DeepSeek API Docs	current models/pricing and V3.2 release context
LangChain/Braintrust/PagerDuty	Fleet, Loop, AI operations ecosystem

Verification Limits

This verification focuses on document structure and operating-framework consistency. Vendor features and API signatures can change. Recheck model prices, discounts, deprecations, and benchmark names against official pages before a release.
