Spatial and Multimodal Interfaces

Key takeaways

Design systems are expanding beyond 2D screens, so they need tokens and contracts for spatial, voice, gesture, vision, and haptic modalities.
Each modality has distinct needs: spatial requires depth and anchors, voice needs confirmation and fallback, gesture needs error recovery and fatigue limits.
Spatial tokens encode depth layers and readable distances so that agents place UI at a legible scale.
Core rules: always provide a non-spatial fallback, never use voice-only confirmation for destructive actions, and respect camera and environment privacy.
Verify by testing text readability at distance, voice fallback paths, and gesture cancellation and undo, and by reviewing privacy copy and permission prompts.

Design systems are expanding beyond 2D screens. Spatial computing and multimodal AI add depth, voice, gesture, gaze, and environment context. The system needs tokens and contracts for these new modalities.

New Modalities

Modality	Design-system need
Spatial	Depth, distance, scale, occlusion, anchors.
Voice	Intent, confirmation, interruption, fallback text.
Gesture	Discoverability, error recovery, fatigue limits.
Vision	Object grounding, camera privacy, visual feedback.
Haptics	Intensity, duration, semantic meaning.
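
The table above could be captured as a typed contract so that a component spec can be checked against it. This is a minimal sketch, not part of the system described here: the names `Modality`, `modalityNeeds`, and `missingNeeds` are hypothetical, and the need identifiers are simply the table entries written in camelCase.

```typescript
// Hypothetical contract: each modality lists the needs from the table
// that a component spec is expected to address.
type Modality = 'spatial' | 'voice' | 'gesture' | 'vision' | 'haptics';

const modalityNeeds: Record<Modality, readonly string[]> = {
  spatial: ['depth', 'distance', 'scale', 'occlusion', 'anchors'],
  voice: ['intent', 'confirmation', 'interruption', 'fallbackText'],
  gesture: ['discoverability', 'errorRecovery', 'fatigueLimits'],
  vision: ['objectGrounding', 'cameraPrivacy', 'visualFeedback'],
  haptics: ['intensity', 'duration', 'semanticMeaning'],
};

// Returns the needs a spec has not yet addressed for the given modality.
function missingNeeds(modality: Modality, addressed: readonly string[]): string[] {
  return modalityNeeds[modality].filter((need) => addressed.indexOf(need) === -1);
}
```

For example, `missingNeeds('voice', ['intent', 'confirmation'])` reports that interruption handling and fallback text are still open.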

Spatial Tokens

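export const spatial = {
  // Depth layers as offsets from the workspace plane (0).
  // Positive values sit in front of the workspace, negative values behind it.
  depth: {
    foreground: 0.2,
    workspace: 0,
    background: -0.4,
  },
  // Viewing distances in metres, stored as strings with an 'm' suffix.
  distance: {
    readable: '0.8m',
    comfortable: '1.2m',
  },
} as const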
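
One way an agent might use the distance tokens is to derive a minimum physical text height from a target visual angle, using the standard relation h = 2 · d · tan(θ / 2). The sketch below makes several assumptions: `parseMetres` and `heightForAngle` are hypothetical helpers, and `ASSUMED_MIN_TEXT_ANGLE_DEG` is a placeholder value rather than a threshold taken from this guide.

```typescript
// Mirrors the distance tokens above so the sketch is self-contained.
const distance = { readable: '0.8m', comfortable: '1.2m' } as const;

// Placeholder threshold (assumption): replace with a value from your own
// legibility testing.
const ASSUMED_MIN_TEXT_ANGLE_DEG = 0.4;

// Parses a metre token such as '0.8m' into a number of metres.
function parseMetres(token: string): number {
  const match = /^(\d+(?:\.\d+)?)m$/.exec(token);
  if (!match) throw new Error('Not a metre token: ' + token);
  return Number(match[1]);
}

// Physical height (in metres) that subtends angleDeg at distanceM.
function heightForAngle(distanceM: number, angleDeg: number): number {
  const angleRad = (angleDeg * Math.PI) / 180;
  return 2 * distanceM * Math.tan(angleRad / 2);
}

// Minimum text height at the 'readable' distance under the assumed angle.
const minTextHeightM = heightForAngle(
  parseMetres(distance.readable),
  ASSUMED_MIN_TEXT_ANGLE_DEG,
);
```

Under these assumptions `minTextHeightM` comes out at roughly 5.6 mm. Because the height scales linearly with distance, text placed at the `comfortable` distance needs to be 1.5 times taller to subtend the same angle.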

Agent Rules

Always provide a non-spatial fallback for core tasks.
Do not rely on voice-only confirmation for destructive actions.
Respect user privacy for camera and environment context.
Keep spatial UI legible at target distance.
Define recovery when gesture or speech recognition fails.
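
Two of these rules lend themselves to a small executable policy: destructive actions need a confirmation from something other than voice, and repeated recognition failures should route to a fallback. The following is a sketch under stated assumptions. `InputModality`, `ActionRequest`, `mayExecute`, and `nextStep` are hypothetical names, and the retry limit of 2 is an illustrative placeholder.

```typescript
type InputModality = 'voice' | 'gesture' | 'touch' | 'keyboard';

interface ActionRequest {
  destructive: boolean;
  // Modalities through which the user has confirmed the action so far.
  confirmedBy: InputModality[];
}

// A destructive action requires at least one non-voice confirmation.
// Non-destructive actions are allowed through without confirmation here.
function mayExecute(req: ActionRequest): boolean {
  if (!req.destructive) return true;
  return req.confirmedBy.some((m) => m !== 'voice');
}

// Assumed limit (placeholder): how many failed recognitions to tolerate
// before switching to the non-spatial or text fallback.
const ASSUMED_MAX_RECOGNITION_RETRIES = 2;

// Decides whether to prompt again or hand over to the fallback path.
function nextStep(failedAttempts: number): 'retry' | 'fallback' {
  return failedAttempts < ASSUMED_MAX_RECOGNITION_RETRIES ? 'retry' : 'fallback';
}
```

With this policy, a spoken "yes, delete it" alone leaves `mayExecute` false until the user also confirms by touch, gesture, or keyboard.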

Verification

Test text readability at distance.
Test voice fallback paths.
Test gesture cancellation and undo.
Review privacy copy and permission prompts.
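
The gesture cancellation and undo check can be exercised against a small model before testing on a device. The class below is a hypothetical stand-in for a real gesture handler, assumed only for illustration: a gesture holds a pending value, cancelling discards it, committing records it, and undo removes the most recent commit.

```typescript
// Hypothetical model of a gesture that can be cancelled mid-way or undone
// after it has been committed.
class GestureSession {
  private committed: number[] = [];
  private pending: number | null = null;

  // Starts a gesture carrying a candidate value (for example, a new position).
  begin(value: number): void {
    this.pending = value;
  }

  // Abandons the in-progress gesture without changing committed state.
  cancel(): void {
    this.pending = null;
  }

  // Applies the in-progress gesture, if there is one.
  commit(): void {
    if (this.pending !== null) {
      this.committed.push(this.pending);
      this.pending = null;
    }
  }

  // Reverts the most recently committed gesture, if any.
  undo(): void {
    this.committed.pop();
  }

  // Returns a copy of the committed values, oldest first.
  history(): number[] {
    return this.committed.slice();
  }
}
```

A test then asserts that `begin` followed by `cancel` and `commit` leaves the history empty, and that `undo` after a commit restores the previous state.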

