Ch1. System Architecture
Separate the control plane and data plane to improve both reliability and change velocity
Key takeaways
- Separate the control plane (policy, version registry, eval store) from the data plane (orchestrator, model/tool runtime, cache) to shrink blast radius while preserving iteration speed.
- Distinguish tool connectivity (MCP) from agent-to-agent communication (A2A); the 2026 pattern converges on MCP for tool/data boundaries and A2A for delegation between independent agents.
- Define trust boundaries per surface: hosted vs private MCP, and public vs extended A2A Agent Cards that hide internal URLs and secrets.
- Manage MCP servers in a registry with owner, transport, token audience, scopes, allowed egress, and approval-required tools.
- Default to safe behavior on failure: deny on policy-lookup failure, fall back to a lower-tier model, and use durable state and checkpoints for approval waits.
Many recurring LLM system failures happen because operational policy and execution logic are tangled together.
A cleaner architecture reduces blast radius while preserving the speed of iteration.
Recommended Structure
Separation Principles
| Area | Control Plane | Data Plane |
|---|---|---|
| Responsibility | Policy, versions, evaluation criteria | Request handling and response generation |
| Change cadence | Weekly/monthly | Real-time/daily |
| Failure impact | Slower decisions | Direct user impact |
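One way to picture the split in the table above is a data plane that only ever reads an immutable, versioned policy snapshot published by the control plane. The sketch below is a minimal illustration under that assumption; `PolicySnapshot`, `PolicyStore`, and `authorize_tool_call` are hypothetical names, not part of any framework mentioned in this chapter.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class PolicySnapshot:
    """Immutable policy bundle published by the control plane (hypothetical shape)."""
    version: str
    allowed_tools: FrozenSet[str]


class PolicyStore:
    """Data-plane view of the control plane: holds the last published snapshot."""

    def __init__(self) -> None:
        self._snapshot: Optional[PolicySnapshot] = None

    def publish(self, snapshot: PolicySnapshot) -> None:
        # Control-plane side: replace the whole snapshot in one assignment,
        # so request handlers never observe a half-updated policy.
        self._snapshot = snapshot

    def current(self) -> Optional[PolicySnapshot]:
        return self._snapshot


def authorize_tool_call(store: PolicyStore, tool: str) -> Tuple[bool, Optional[str]]:
    """Data-plane check. Returns (allowed, policy_version)."""
    snapshot = store.current()
    if snapshot is None:
        # No policy available: refuse rather than guess.
        return (False, None)
    return (tool in snapshot.allowed_tools, snapshot.version)
```

Returning the policy version alongside the decision lets each request log which policy it ran under, which helps when a control-plane change needs to be traced or rolled back.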
2026 Agent Communication Protocols
In multi-agent systems, separate tool connectivity from agent-to-agent communication.
| Protocol | Role | Current Baseline | Operating Point |
|---|---|---|---|
| MCP (Model Context Protocol) | Agent to tools/data | 2025-11-25 | JSON-RPC based. For HTTP transport, validate OAuth 2.1, Protected Resource Metadata, and token audience binding |
| A2A (Agent2Agent) | Agent to agent | latest v1.0.0 | Provides tasks, streaming, push notifications, and extended Agent Cards. Avoid leaking resource existence before authentication |
| ACP (Agent Communication Protocol) | Agent to agent | Vendor ecosystem | REST/HTTP messaging. Validate security and interoperability before adopting it as an organizational standard |
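To make the "JSON-RPC based" note for MCP concrete, the sketch below builds a JSON-RPC 2.0 request for the MCP `tools/call` method. The helper name `build_tool_call` and the tool name `search_issues` are assumptions for illustration; transport, authentication, and response handling are left out.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC uses the id to match a response to its request.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope for an MCP tool invocation."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "search_issues" is a hypothetical tool exposed by some MCP server.
request = build_tool_call("search_issues", {"query": "label:bug"})
wire_payload = json.dumps(request)
```

Keeping envelope construction in one place gives the runtime a single point to attach tracing ids or enforce an allowlist before anything reaches the wire.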
Convergence Pattern
The 2026 operating pattern is converging toward MCP for tool/data boundaries and A2A for delegation between independent agents. They are complementary trust boundaries, not substitutes.
Trust Boundary Design
| Boundary | Operating Standard |
|---|---|
| Hosted/public MCP | Connect only public servers that fit the provider trust model; require approval for high-risk tools |
| Private/local MCP | Let the runtime own connectivity, filtering, approvals, scope limits, and network egress |
| A2A public Agent Card | Expose only public capabilities; exclude internal URLs, secrets, and detailed rate limits |
| A2A extended Agent Card | Serve only to authenticated clients and vary capabilities by client permission |
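The two Agent Card rows above can be read as one rendering function with two audiences. The following is a minimal sketch, assuming an internal record whose field names (`internal_url`, `rate_limit_per_min`, `requires`) are purely illustrative and are not the A2A Agent Card schema.

```python
from typing import Optional, Set

# Hypothetical internal record; field names are illustrative, not the A2A schema.
FULL_CARD = {
    "name": "billing-agent",
    "description": "Answers billing questions",
    "internal_url": "http://10.0.0.12:8080",
    "rate_limit_per_min": 600,
    "skills": [
        {"id": "invoice_lookup", "requires": None},
        {"id": "issue_refund", "requires": "billing:write"},
    ],
}

# Allowlist of top-level fields that may ever leave the service.
PUBLIC_FIELDS = ("name", "description")


def render_agent_card(full: dict, permissions: Optional[Set[str]] = None) -> dict:
    """Build the card for one caller.

    permissions=None models an unauthenticated caller, who gets the public card.
    An authenticated caller passes its granted permissions and gets the extended
    card, limited to the skills those permissions unlock.
    """
    granted = permissions or set()
    card = {key: full[key] for key in PUBLIC_FIELDS}
    card["skills"] = [
        skill["id"]
        for skill in full["skills"]
        if skill["requires"] is None or skill["requires"] in granted
    ]
    return card
```

Building the card from an allowlist, rather than deleting sensitive keys from the full record, means a newly added internal field stays hidden by default.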
MCP Server Registry Example
mcp_servers:
  - id: github-readonly-prod
    owner: platform-ai
    transport: streamable_http
    server_url: https://mcp.example.com/github
    token_audience: mcp://github-readonly-prod
    scopes:
      - repo:read
      - issue:read
    allowed_egress:
      - api.github.com
    approval_required_tools:
      - create_issue
      - write_file
    expires_at: '2026-06-16'
Design Checkpoints
- Default to safe behavior when policy lookup fails: deny the request or return a restricted response.
- Fall back to a lower-tier model when model routing fails.
- Enforce transaction boundaries and idempotency keys around tool calls.
- Keep a graceful degradation path when an MCP server is unavailable.
- MCP servers should accept only tokens issued for themselves and must not pass those tokens through to upstream APIs.
- Reject A2A push notification URLs that target private IPs, localhost, or link-local addresses.
- Use durable state and checkpoints for approval waits, long-running work, and external event resumption.
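Three of the checkpoints above (token audience binding, approval-required tools, and push notification URL filtering) can be sketched as small guard functions. This is an illustration under stated assumptions: the `SERVER` dict mirrors two fields of the registry example, the function names are hypothetical, token claims are assumed to be already signature-verified, and requiring `https` is an extra assumption not stated in the checklist.

```python
import ipaddress
from urllib.parse import urlsplit

# Subset of the registry entry shown earlier, loaded as a plain dict.
SERVER = {
    "token_audience": "mcp://github-readonly-prod",
    "approval_required_tools": ["create_issue", "write_file"],
}


def token_audience_ok(claims: dict, server: dict) -> bool:
    """Accept only tokens whose audience names this server.

    The JWT `aud` claim may be a single string or a list of strings.
    """
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return server["token_audience"] in audiences


def needs_approval(server: dict, tool: str) -> bool:
    """True when the registry marks the tool as approval-required."""
    return tool in server["approval_required_tools"]


def is_safe_push_url(url: str) -> bool:
    """Reject push notification targets that point at internal addresses.

    Only literal hosts are checked here. A real deployment would also resolve
    DNS names and re-check every resolved address before connecting.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme != "https" or not host:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so it is a DNS name; resolution is out of scope here.
        return True
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)
```

Each guard returns a plain boolean so the caller decides how to fail; in line with the first checkpoint, a `False` or an unexpected exception would normally be treated as a denial.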
Baseline and Sources
| Item | Baseline Date | Recheck By | Primary Source |
|---|---|---|---|
| MCP 2025-11-25 | 2026-05-17 | 2026-06-16 | https://modelcontextprotocol.io/specification/2025-11-25 |
| A2A latest v1.0.0 | 2026-05-17 | 2026-06-16 | https://a2a-protocol.org/latest/specification/ |
| OpenAI Agents SDK MCP/tracing | 2026-05-17 | 2026-06-16 | https://developers.openai.com/api/docs/guides/agents/integrations-observability |