completed

Mission Run

Codex Mission Control: MCP Server + Next.js Control Plane + Antigravity Integration

Design an MCP server that exposes project, task, documentation, and execution-context tools, then integrate it with Antigravity so users can inspect repositories, trigger mission workflows, retrieve generated PRDs and technical designs, and preserve execution memory across AI coding sessions.

Created: 14 Jun 2026, 10:32 am

Updated: 14 Jun 2026, 10:44 am

Repository Context

New project from scratch. Build a Next.js control plane with an MCP server layer, local persistence, OpenAI-powered document generation, and a clear integration guide for Antigravity-compatible MCP clients.

Constraints

Do not generate code directly for users. Focus on MCP tool design, secure API-key handling, traceable mission documents, local caching, and a polished demo flow for judges.

Execution Stepper

The mission run has finished. Completed steps remain as a visible execution trace.

Mission kickoff: scope, success criteria, and demo storyboard

Completed

Select MCP feature set and tool taxonomy

Completed

Define data model and local persistence strategy

Completed

Security model: API keys, secrets, and multi-tenant boundaries

Completed

MCP server transport and compatibility plan (Antigravity-ready)

Completed

Workflow engine design for mission runs

Completed

Repository inspection tool design (safe and performant)

Completed

OpenAI-powered document generation pipeline (traceable outputs)

Completed

AI Software Execution Operating System

Primary users

Founders, product leads, architects, delivery teams, and AI-native engineering teams that need a system of record between idea and execution.

Problem solved

It transforms mission intelligence into PRDs, technical designs, engineering plans, AI execution packs, architecture maps, risk models, and traceable workflows.

Product Flow Diagram

Idea to execution intelligence

User

Submits a software idea and constraints

Mission Plan

Tasks, dependencies, owners, risks

Document Studio

PRD, TDD, engineering plan, AI pack

Mission Memory

Reusable organizational knowledge

Architecture Diagram

System components for this idea

Users

Founders + product teams

Edge

Route 53 / CDN / WAF boundary

Application VPC

Web App

Node.js

API Server

TypeScript

Worker

Mission/document generation jobs

Cache

Saved docs + fast reloads

Data Plane

Primary DB

SQLite for canonical metadata/relationships/provenance pointers, paired with a content-addressed filesystem (CAS) artifact store for large blobs (documents/logs/json) referenced as artifact://sha256/<hash>.

Object Storage

Generated docs, exports, artifacts

projects · repos · missions · tasks · runs · run_events · documents · document_revisions

AI + Operations

OpenAI API

PRD, TDD, plan, AI pack

Observability

Logs, risks, decision trail

IAM / Secrets

Server-side keys + access control

Alerts

Execution and risk signals

Cloud Diagram

Deployment-ready shape

Channel · Experience · Middleware · Resources

Browser

User session

Mission UI

Dashboard + document studio

API Routes

Validate + orchestrate

Storage

Missions + cached docs

AI Client

Codex / external tools

AI Pack

Portable execution context

Doc Engine

Timeout + local fallback

OpenAI

Optional enrichment

Security posture

Keep API keys server-side, validate payloads, and preserve audit logs.

9 risk signals

Risks become visible before execution moves to tools.

Mission Document Studio

Export mission intelligence

Turn the mission into production-ready documents for executives, engineers, delivery teams, and external AI execution tools.

Product Requirements Document

Loaded from local mission memory · 6/14/2026, 10:34:53 AM

# Product Requirements Document: Codex Mission Control: MCP Server + Next.js Control Plane + Antigravity Integration

## Product Vision
Design an MCP server that exposes project, task, documentation, and execution-context tools, then integrate it with Antigravity so users can inspect repositories, trigger mission workflows, retrieve generated PRDs and technical designs, and preserve execution memory across AI coding sessions.

## Problem Statement
Teams lose execution context between idea, architecture, planning, and handoff. This mission turns the idea into a shared system of record.

## Target Users
- Founders validating an MVP
- Product managers preparing requirements
- Engineering leads preparing implementation plans
- AI-native teams handing context to Codex or other tools

## User Personas
- Product Lead: needs a crisp PRD and success criteria.
- Engineering Lead: needs architecture, risks, and task sequencing.
- Builder: needs enough context to execute without re-asking basic questions.

## User Stories
- As a team member, I need a mission kickoff covering scope, success criteria, and a demo storyboard, so that the product scope and non-goals are defined, the success criteria for judges are locked (end-to-end demo steps, expected outputs, latency targets), and the storyboard shows an Antigravity client connecting to the MCP server, inspecting a repo, generating a PRD and tech design, triggering a workflow, and resuming with preserved execution memory.
- As a team member, I need a selected MCP feature set and tool taxonomy, so that the tool inventory is grouped into project tools (repository introspection, metadata), task tools (workflow orchestration, status), documentation tools (PRD, tech design, architecture decision records), and execution-context tools (memory, run context, artifacts), with defined naming conventions, input/output JSON schemas, error shapes, pagination, and idempotency expectations.
- As a team member, I need a defined data model and local persistence strategy, so that there is a practical local-first data model (projects, repos, missions, tasks, runs, documents, artifacts, and memory entries), a chosen persistence layer (e.g., SQLite via Prisma or better-sqlite3) with a filesystem artifact store, and defined indexing for retrieval (by project/run/task), retention policies, and export/import for judge review.
- As a team member, I need a security model for API keys, secrets, and multi-tenant boundaries, so that OpenAI and optional VCS tokens are handled securely (environment variables, server-side-only usage, never returning secrets via MCP, redaction in logs, and a secrets-check tool that validates configuration without leaking values), the control-plane auth strategy is defined (local dev: single-user, token in .env; optional basic auth), and a threat model covers filesystem access, prompt injection in docs, and repo content exfiltration.
- As a team member, I need an MCP server transport and compatibility plan (Antigravity-ready), so that the MCP transport is chosen and documented (stdio and/or HTTP/SSE depending on Antigravity expectations), tool discovery and invocation by Antigravity clients are defined, a compatibility checklist covers tool metadata, JSON schema strictness, streaming support, and error semantics, and judges have a minimal connection recipe.
- As a team member, I need a workflow engine design for mission runs, so that a lightweight runner can create a mission, plan tasks, execute steps, record run logs, and attach artifacts, with each step traceable and replayable through run IDs and deterministic prompt/config snapshots, exposed via the tools start_run, update_run_status, append_run_log, attach_artifact, list_runs, and get_run.
- As a team member, I need a safe and performant repository inspection tool design, so that repositories can be inspected from a local path (list files with allow/deny patterns, read files with size limits, search text with safeguards, summarize structure, and compute lightweight stats), backed by hash-based caching rules and guardrails (no reading outside the repo root, path traversal protection).
- As a team member, I need an OpenAI-powered document generation pipeline with traceable outputs, so that PRDs and technical designs are produced with provenance (versioned prompt templates, stored model settings, input context references such as files and memory IDs) and stored as immutable document revisions, exposed via the tools generate_prd, generate_tech_design, list_documents, get_document, and diff_document_versions.

## Functional Requirements
1. Mission kickoff: scope, success criteria, and demo storyboard: Define the product scope and non-goals; lock success criteria for judges (end-to-end demo steps, expected outputs, latency targets); draft a demo storyboard showing Antigravity client connecting to MCP server, inspecting a repo, generating PRD/tech design, triggering a workflow, and resuming with preserved execution memory.
2. Select MCP feature set and tool taxonomy: Design the MCP server tool inventory grouped into: project tools (repository introspection, metadata), task tools (workflow orchestration, status), documentation tools (PRD, tech design, architecture decision records), and execution-context tools (memory, run context, artifacts). Define naming conventions, input/output JSON schemas, error shapes, pagination, and idempotency expectations.
3. Define data model and local persistence strategy: Create a practical local-first data model: projects, repos, missions, tasks, runs, documents, artifacts, and memory entries. Choose persistence (e.g., SQLite via Prisma or better-sqlite3) and a filesystem artifact store. Define indexing for retrieval (by project/run/task), retention policies, and export/import for judge review.
4. Security model: API keys, secrets, and multi-tenant boundaries: Specify secure handling for OpenAI and optional VCS tokens: environment variables, server-side only usage, never returning secrets via MCP, redaction in logs, and a secrets-check tool that validates configuration without leaking values. Define auth strategy for the control plane (local dev: single-user, token in .env; optional basic auth). Produce a threat model focused on filesystem access, prompt injection in docs, and repo content exfiltration.
5. MCP server transport and compatibility plan (Antigravity-ready): Pick and document the MCP transport (stdio and/or HTTP/SSE depending on Antigravity expectations). Define how Antigravity clients discover tools and call them. Produce a compatibility checklist: tool metadata, JSON schema strictness, streaming support, and error semantics. Provide a minimal connection recipe for judges.
6. Workflow engine design for mission runs: Design a lightweight mission workflow runner: create mission, plan tasks, execute steps, record run logs, and attach artifacts. Ensure each step is traceable and replayable, with run IDs and deterministic prompts/config snapshots. Define tools: start_run, update_run_status, append_run_log, attach_artifact, list_runs, get_run.
7. Repository inspection tool design (safe and performant): Design tools to inspect repositories from a local path: list files with allow/deny patterns, read file with size limits, search text with safeguards, summarize structure, and compute lightweight stats. Add caching rules (hash-based) and guardrails (no reading outside repo root, path traversal protection).
8. OpenAI-powered document generation pipeline (traceable outputs): Define a document generation service that produces PRDs and technical designs with provenance: prompt templates versioned, model settings stored, input context references (files, memory IDs), and output stored as immutable document revisions. Include tools: generate_prd, generate_tech_design, list_documents, get_document, diff_document_versions.

## Non Functional Requirements
- Reliable loading and error states
- Server-side secret handling
- Clear audit trail
- Reusable mission context

## Success Metrics
- Mission can be understood in under 2 minutes
- PRD/TDD/plan/AI pack can be exported
- Risks and dependencies are visible
- Next execution steps are obvious

## Risks
- Scope can expand quickly if integrations, auth, or real-time collaboration are added too early.
- Demo quality depends on seeded data and deterministic empty/loading states.
- AI features need graceful fallback when provider latency or configuration fails.

## Assumptions
- MVP scope is prioritized over integrations
- Seeded/demo data is acceptable for hackathon validation
- Execution happens in external tools after context is prepared

## Future Scope
- Team collaboration
- Mission comparison
- Reusable architecture templates
- Deeper integrations with project management tools

Traceability Map

Why every task exists

This replaces vague “AI said so” planning. Each path shows which goal, requirement, task, architecture choice, or risk explains the work.

goal

Design an MCP server that exposes project, task, documentation, and execution-context tools, then integrate it with Antigravity so users can inspect repositories, trigger mission workflows, retrieve generated PRDs and technical designs, and preserve execution memory across AI coding sessions.

requirement

Mission kickoff: scope, success criteria, and demo storyboard

task

Mission kickoff: scope, success criteria, and demo storyboard

requirement

Select MCP feature set and tool taxonomy

task

Select MCP feature set and tool taxonomy

requirement

Define data model and local persistence strategy

task

Define data model and local persistence strategy

requirement

Security model: API keys, secrets, and multi-tenant boundaries

task

Security model: API keys, secrets, and multi-tenant boundaries

requirement

MCP server transport and compatibility plan (Antigravity-ready)

task

MCP server transport and compatibility plan (Antigravity-ready)

requirement

Workflow engine design for mission runs

task

Workflow engine design for mission runs

requirement

Repository inspection tool design (safe and performant)

Trace paths

Kept because traceability is the product moat; renamed from relationships for clarity.

Goal (stated above) → drives → Mission kickoff: scope, success criteria, and demo storyboard
Mission kickoff: scope, success criteria, and demo storyboard → satisfied by → Mission kickoff: scope, success criteria, and demo storyboard
Goal (stated above) → drives → Select MCP feature set and tool taxonomy
Select MCP feature set and tool taxonomy → satisfied by → Select MCP feature set and tool taxonomy
Mission kickoff: scope, success criteria, and demo storyboard → unblocks → Select MCP feature set and tool taxonomy
Goal (stated above) → drives → Define data model and local persistence strategy
Define data model and local persistence strategy → satisfied by → Define data model and local persistence strategy
Select MCP feature set and tool taxonomy → unblocks → Define data model and local persistence strategy
Goal (stated above) → drives → Security model: API keys, secrets, and multi-tenant boundaries
Security model: API keys, secrets, and multi-tenant boundaries → satisfied by → Security model: API keys, secrets, and multi-tenant boundaries

Mission Decision Log

Explain the important choices

Use Node.js

Selected because it appears in the mission architecture or implementation summary.

Tradeoffs

Keeps the plan coherent, but may need validation against team skills, cost, and deployment constraints.

Alternatives

Comparable frameworks, managed services, or lower-level custom implementation.

Use TypeScript

Selected because it appears in the mission architecture or implementation summary.

Tradeoffs

Keeps the plan coherent, but may need validation against team skills, cost, and deployment constraints.

Alternatives

Comparable frameworks, managed services, or lower-level custom implementation.

Use MCP server (stdio-first transport; optional HTTP + SSE)

Selected because it appears in the mission architecture or implementation summary.

Tradeoffs

Keeps the plan coherent, but may need validation against team skills, cost, and deployment constraints.

Alternatives

Comparable frameworks, managed services, or lower-level custom implementation.

Use Next.js (Control Plane UI)

Selected because it appears in the mission architecture or implementation summary.

Tradeoffs

Keeps the plan coherent, but may need validation against team skills, cost, and deployment constraints.

Alternatives

Comparable frameworks, managed services, or lower-level custom implementation.

Use SQLite

Selected because it appears in the mission architecture or implementation summary.

Tradeoffs

Keeps the plan coherent, but may need validation against team skills, cost, and deployment constraints.

Alternatives

Comparable frameworks, managed services, or lower-level custom implementation.

Use SQLite for canonical metadata/relationships/provenance pointers, paired with a content-addressed filesystem (CAS) artifact store for large blobs (documents/logs/json) referenced as artifact://sha256/<hash>.

The mission requires a persistence model for reusable execution knowledge and structured planning outputs.

Tradeoffs

Adds schema discipline and traceability, but increases setup and migration responsibility.

Alternatives

Local JSON, SQLite, Supabase, PostgreSQL, or document storage depending on scale.

Mission Memory

Reuse organizational knowledge

Compare Missions: planned capability, not yet shown as an executable action. The only executable memory action at present is Duplicate.

Mission Steps

Task Timeline

8 tasks

Task 1

Mission kickoff: scope, success criteria, and demo storyboard

completed

Define the product scope and non-goals; lock success criteria for judges (end-to-end demo steps, expected outputs, latency targets); draft a demo storyboard showing Antigravity client connecting to MCP server, inspecting a repo, generating PRD/tech design, triggering a workflow, and resuming with preserved execution memory.

Owner: planner · Dependencies: 0
Mission kickoff completed conceptually: defined scope, success criteria, and a 6-minute demo storyboard for a local-first Next.js Control Plane with an MCP server (repo inspection, mission/task management, PRD + Technical Design generation, workflow runs, and execution-memory persistence) plus Antigravity client integration guidance.

Task 2

Select MCP feature set and tool taxonomy

completed

Design the MCP server tool inventory grouped into: project tools (repository introspection, metadata), task tools (workflow orchestration, status), documentation tools (PRD, tech design, architecture decision records), and execution-context tools (memory, run context, artifacts). Define naming conventions, input/output JSON schemas, error shapes, pagination, and idempotency expectations.

Owner: builder · Dependencies: 1
Completed selection of MCP feature set and defined a v1 tool taxonomy (project/task/docs/context) with consistent naming/IDs, standard envelope+errors, cursor pagination, and idempotent mutating tools. Included security rules (path sandboxing, server-side OpenAI key only, redaction) plus local-first persistence/caching and provenance for traceable docs and runs.
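The envelope, error shape, and cursor pagination mentioned in this summary could look roughly like the sketch below. All type and field names (`Envelope`, `ToolError`, `nextCursor`, the `INVALID_CURSOR` code) and the cursor encoding are illustrative assumptions, not the schema the mission actually produced.

```typescript
// Hypothetical error and envelope shapes; field names are assumptions for illustration.
type ToolError = { code: string; message: string; traceId: string };
type Envelope<T> =
  | { ok: true; data: T; nextCursor: string | null }
  | { ok: false; error: ToolError };

// Opaque cursor: an "o" prefix plus the base-36 offset into a stably ordered list.
function encodeCursor(offset: number): string {
  return "o" + offset.toString(36);
}

function decodeCursor(cursor: string | null): number {
  if (cursor === null) return 0; // no cursor means "start from the beginning"
  if (!/^o[0-9a-z]+$/.test(cursor)) throw new Error("invalid cursor");
  const offset = parseInt(cursor.slice(1), 36);
  if (!Number.isSafeInteger(offset) || offset < 0) throw new Error("invalid cursor");
  return offset;
}

// Returns one page plus the cursor for the next page (null when the list is exhausted).
function paginate<T>(
  items: readonly T[],
  cursor: string | null,
  limit: number,
  traceId: string,
): Envelope<T[]> {
  let offset: number;
  try {
    offset = decodeCursor(cursor);
  } catch {
    return {
      ok: false,
      error: { code: "INVALID_CURSOR", message: "Cursor could not be decoded.", traceId },
    };
  }
  const page = items.slice(offset, offset + limit);
  const nextCursor = offset + limit < items.length ? encodeCursor(offset + limit) : null;
  return { ok: true, data: page, nextCursor };
}
```

A client would call `paginate(items, null, limit, traceId)` first and pass each returned `nextCursor` back until it is `null`.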

Task 3

Define data model and local persistence strategy

completed

Create a practical local-first data model: projects, repos, missions, tasks, runs, documents, artifacts, and memory entries. Choose persistence (e.g., SQLite via Prisma or better-sqlite3) and a filesystem artifact store. Define indexing for retrieval (by project/run/task), retention policies, and export/import for judge review.

Owner: builder · Dependencies: 1
Completed conceptual design for a local-first persistence system: SQLite holds canonical metadata/relationships/provenance pointers; a content-addressed filesystem artifact store holds large blobs (documents, logs, large JSON). Defined entities (Project, Repo, Mission, Task, Run, RunEvent, Document, Artifact, MemoryEntry, Embedding), key relationships/enums, indexing for project/mission/task/run-centric queries, retention/GC policies with judge-mode to preserve trace, and a deterministic export/import zip bundle format with checksums and path sandboxing.
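As a minimal sketch of the content-addressed reference format, the helper below derives an `artifact://sha256/<hash>` URI from a blob's bytes. The on-disk layout in `artifactPath` (a two-character fan-out directory) and both function names are assumptions made for illustration; the source only specifies the URI scheme.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

const ARTIFACT_PREFIX = "artifact://sha256/";

// Builds the artifact://sha256/<hash> reference for a blob's contents.
function artifactUri(content: string | Uint8Array): string {
  const hash = createHash("sha256").update(content).digest("hex");
  return ARTIFACT_PREFIX + hash;
}

// Hypothetical layout: <root>/sha256/<first two hex chars>/<full hash>.
function artifactPath(root: string, uri: string): string {
  const hash = uri.startsWith(ARTIFACT_PREFIX) ? uri.slice(ARTIFACT_PREFIX.length) : "";
  if (!/^[0-9a-f]{64}$/.test(hash)) {
    throw new Error(`not a sha256 artifact URI: ${uri}`);
  }
  return path.join(root, "sha256", hash.slice(0, 2), hash);
}
```

Because the URI depends only on the bytes, storing the same document twice yields the same reference, which is what lets SQLite rows hold small pointers while large blobs stay deduplicated on disk.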

Task 4

Security model: API keys, secrets, and multi-tenant boundaries

completed

Specify secure handling for OpenAI and optional VCS tokens: environment variables, server-side only usage, never returning secrets via MCP, redaction in logs, and a secrets-check tool that validates configuration without leaking values. Define auth strategy for the control plane (local dev: single-user, token in .env; optional basic auth). Produce a threat model focused on filesystem access, prompt injection in docs, and repo content exfiltration.

Owner: builder · Dependencies: 2
Completed security model for API keys/secrets and tenant boundaries. Defined server-only secret handling, redaction rules, default-deny filesystem sandboxing under MCP_ALLOWED_ROOT, default-local network posture (127.0.0.1, auth on by default, egress off unless allowlisted), MCP non-leak guarantees (response filters + schemas without secret fields), control-plane auth modes (bearer/basic/reverse-proxy trust), single-user local tenant boundary v1, and a focused threat model (filesystem access, prompt injection, repo exfiltration). Added MCP tool specs: security.secrets.check (non-leaking config validation) and security.auth.describe (auth/tenant mode report).
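A non-leaking configuration check in the spirit of security.secrets.check might be sketched as follows. The result shape, the function names, and the `VCS_TOKEN` variable name are hypothetical; the only behavior taken from the summary is that the check reports whether a secret is configured without ever returning its value, and that log lines are redacted.

```typescript
// Hypothetical status record: reports presence only, never the value or its length.
type SecretStatus = { name: string; present: boolean };

type Env = Record<string, string | undefined>;

// Validates that each named secret is configured, without echoing any value.
function checkSecrets(env: Env, names: readonly string[]): SecretStatus[] {
  return names.map((name) => {
    const value = env[name];
    return { name, present: typeof value === "string" && value.length > 0 };
  });
}

// Replaces any occurrence of a configured secret value in a log line.
function redact(line: string, env: Env, names: readonly string[]): string {
  let out = line;
  for (const name of names) {
    const value = env[name];
    if (value) out = out.split(value).join(`[REDACTED:${name}]`);
  }
  return out;
}
```

In a real server `env` would typically be `process.env`; passing it as a parameter keeps the sketch testable.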

Task 5

MCP server transport and compatibility plan (Antigravity-ready)

completed

Pick and document the MCP transport (stdio and/or HTTP/SSE depending on Antigravity expectations). Define how Antigravity clients discover tools and call them. Produce a compatibility checklist: tool metadata, JSON schema strictness, streaming support, and error semantics. Provide a minimal connection recipe for judges.

Owner: builder · Dependencies: 2
Completed MCP transport + Antigravity compatibility plan: stdio is the canonical transport; optional HTTP + SSE for browser/control-plane and judge demos. Defined deterministic tool discovery, strict JSON Schema (additionalProperties=false), invocation/response contracts with idempotency via clientRequestId, streaming progress/log/result events with non-stream fallback, canonical error format/codes with traceId, and a minimal judge connection recipe + compatibility checklist.
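Idempotency via clientRequestId can be illustrated with a small in-memory store that replays the first successful result for a repeated request. This is a sketch under assumptions: the `IdempotencyStore` class, the `ToolResult` shape, and the `INTERNAL` error code are invented here, and the choice to cache only successes (so failed calls can be retried) is one possible policy, not something the plan specifies.

```typescript
// Hypothetical result shape carrying the traceId the summary mentions.
type ToolResult<T> =
  | { ok: true; data: T; traceId: string }
  | { ok: false; error: { code: string; message: string }; traceId: string };

class IdempotencyStore<T> {
  private readonly completed = new Map<string, ToolResult<T>>();

  // Runs `action` at most once per clientRequestId; repeats get the stored result.
  run(clientRequestId: string, traceId: string, action: () => T): ToolResult<T> {
    const previous = this.completed.get(clientRequestId);
    if (previous !== undefined) return previous;
    try {
      const result: ToolResult<T> = { ok: true, data: action(), traceId };
      this.completed.set(clientRequestId, result); // only successes are remembered
      return result;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { ok: false, error: { code: "INTERNAL", message }, traceId };
    }
  }
}
```

A replayed response carries the traceId of the original call, which keeps the audit trail pointing at the execution that actually happened.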

Task 6

Workflow engine design for mission runs

completed

Design a lightweight mission workflow runner: create mission, plan tasks, execute steps, record run logs, and attach artifacts. Ensure each step is traceable and replayable, with run IDs and deterministic prompts/config snapshots. Define tools: start_run, update_run_status, append_run_log, attach_artifact, list_runs, get_run.

Owner: builder · Dependencies: 1
Designed a lightweight, traceable, replayable mission workflow runner: immutable run/step snapshots, append-only RunEvent log, artifact attachment with provenance, idempotent MCP tools (runs.start_run, runs.update_run_status, runs.append_run_log, runs.attach_artifact, runs.list_runs, runs.get_run), status transition rules, optimistic concurrency via version/expectedVersion, and CAS-backed storage for snapshots/log payloads/artifacts.
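The status transition rules and optimistic concurrency check could be combined as in the sketch below. The status names and the transition table are assumptions (the source does not list them), as are the `Run` shape and the error strings; only the version/expectedVersion comparison comes from the summary.

```typescript
// Hypothetical status set and transition table.
type RunStatus = "queued" | "running" | "succeeded" | "failed" | "cancelled";

const ALLOWED_TRANSITIONS: Record<RunStatus, readonly RunStatus[]> = {
  queued: ["running", "cancelled"],
  running: ["succeeded", "failed", "cancelled"],
  succeeded: [],
  failed: [],
  cancelled: [],
};

interface Run {
  id: string;
  status: RunStatus;
  version: number;
}

// Returns a new run record; the input is never mutated, matching immutable snapshots.
function updateRunStatus(run: Run, next: RunStatus, expectedVersion: number): Run {
  if (run.version !== expectedVersion) {
    throw new Error(`VERSION_CONFLICT: expected ${expectedVersion}, found ${run.version}`);
  }
  if (!ALLOWED_TRANSITIONS[run.status].includes(next)) {
    throw new Error(`INVALID_TRANSITION: ${run.status} -> ${next}`);
  }
  return { ...run, status: next, version: run.version + 1 };
}
```

A caller that read the run at version 3 sends `expectedVersion: 3`; if another writer got there first, the call fails with a conflict instead of silently overwriting the newer state.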

Task 7

Repository inspection tool design (safe and performant)

completed

Design tools to inspect repositories from a local path: list files with allow/deny patterns, read file with size limits, search text with safeguards, summarize structure, and compute lightweight stats. Add caching rules (hash-based) and guardrails (no reading outside repo root, path traversal protection).

Owner: builder · Dependencies: 2
Completed design of a safe, performant repository inspection toolset: repo.register plus read-only tools (list_files, read_file, search_text, summarize_structure, compute_stats) with strict sandboxing (MCP_ALLOWED_ROOT, traversal + symlink-escape protection), resource limits (time/bytes/files), default deny globs for secrets/heavy dirs, deterministic outputs (stable ordering/cursors), and a fingerprint-based caching layer (LRU + disk cache with TTL and stale-while-revalidate).
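The traversal guard behind MCP_ALLOWED_ROOT can be sketched with Node's `path` module as shown here. The function name and error string are hypothetical. This sketch covers lexical traversal only (`..` segments and absolute paths); symlink-escape protection would additionally need the real paths of both root and target resolved (for example with `fs.realpath`) before comparing, which is omitted.

```typescript
import * as path from "node:path";

// Resolves `requested` against the allowed root and rejects anything that lands outside it.
function resolveInsideRoot(allowedRoot: string, requested: string): string {
  const root = path.resolve(allowedRoot);
  const target = path.resolve(root, requested);
  const relative = path.relative(root, target);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error("PATH_OUTSIDE_ROOT");
  }
  return target;
}
```

Comparing via `path.relative` rather than a string prefix avoids accepting sibling directories such as `/srv/repo-secrets` when the root is `/srv/repo`.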

Task 8

OpenAI-powered document generation pipeline (traceable outputs)

completed

Define a document generation service that produces PRDs and technical designs with provenance: prompt templates versioned, model settings stored, input context references (files, memory IDs), and output stored as immutable document revisions. Include tools: generate_prd, generate_tech_design, list_documents, get_document, diff_document_versions.

Owner: builder · Dependencies: 2
Designed an OpenAI-powered document generation pipeline for PRDs and technical designs with immutable revisions and end-to-end provenance (template version/digest, model params, input context refs/hashes, CAS artifacts). Includes MCP tools for generate/list/get/diff, idempotency and concurrency controls, storage/export approach, and security defaults (server-only keys, prompt redaction, denylisted repo paths).
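One way to picture the provenance record is a pair of digests: one over the prompt template text and one over a canonical serialization of everything that influenced the output. The sketch below assumes JSON-compatible inputs; the `GenerationInputs` fields, the function names, and the canonicalization scheme (recursively sorted keys) are illustrative choices, not the pipeline's actual format.

```typescript
import { createHash } from "node:crypto";

// Hypothetical description of what went into one document generation.
interface GenerationInputs {
  templateId: string;
  templateText: string;
  model: string;
  params: Record<string, number | string>;
  contextRefs: string[]; // e.g. file paths or memory IDs
}

function sha256Hex(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Serializes JSON-compatible values with object keys sorted, so key order never changes the hash.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([key, item]) => JSON.stringify(key) + ":" + canonicalize(item));
    return "{" + entries.join(",") + "}";
  }
  return JSON.stringify(value);
}

function provenanceOf(inputs: GenerationInputs): { templateDigest: string; inputsHash: string } {
  return {
    templateDigest: sha256Hex(inputs.templateText),
    inputsHash: sha256Hex(canonicalize(inputs)),
  };
}
```

Two requests with identical templates, model settings, and context references produce the same `inputsHash`, so it can double as an idempotency or cache key for a document revision.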

Audit Trail

Execution Log

43 logs

Mission plan generated successfully.

success

14 Jun 2026, 10:32 am

Mission execution started.

info

14 Jun 2026, 10:35 am

Starting task 1: Mission kickoff: scope, success criteria, and demo storyboard

info

14 Jun 2026, 10:35 am

Mission kickoff artifacts drafted: scope, success criteria, demo storyboard.

info

14 Jun 2026, 10:36 am

Completed task 1: Mission kickoff: scope, success criteria, and demo storyboard

success

14 Jun 2026, 10:36 am

Starting task 2: Select MCP feature set and tool taxonomy

info

14 Jun 2026, 10:36 am

Defined MCP tool taxonomy with 4 groups (project/task/docs/context) and a consistent naming + ID scheme.

info

14 Jun 2026, 10:37 am

Standardized envelope response, error shapes, cursor pagination, and idempotency requirements for mutating tools.

info

14 Jun 2026, 10:37 am

Included security constraints: path sandboxing, server-side OpenAI key only, and redaction rules.

info

14 Jun 2026, 10:37 am

Task 1 'Select MCP feature set and tool taxonomy' marked completed (2026-06-14T00:00:00Z).

success

14 Jun 2026, 10:37 am

Completed task 2: Select MCP feature set and tool taxonomy

success

14 Jun 2026, 10:37 am

Starting task 3: Define data model and local persistence strategy

info

14 Jun 2026, 10:37 am

Selected SQLite + filesystem artifact store; defined core entities and relationships.

info

14 Jun 2026, 10:38 am

Defined indexing and retrieval patterns for project/mission/task/run centric queries.

info

14 Jun 2026, 10:38 am

Specified retention/GC and judge-mode to preserve full trace during evaluation.

info

14 Jun 2026, 10:38 am

Completed task 3: Define data model and local persistence strategy

success

14 Jun 2026, 10:38 am

Starting task 4: Security model: API keys, secrets, and multi-tenant boundaries

info

14 Jun 2026, 10:38 am

Mission stop requested. The run will halt after the current task finishes.

warning

14 Jun 2026, 10:39 am

Created artifacts doc_security_model_v1 and doc_threat_model_table_v1; added tools security.secrets.check and security.auth.describe.

info

14 Jun 2026, 10:39 am

Policy defaults set: bind 127.0.0.1, auth bearer_token_env, network egress disabled, filesystem sandboxed (readonly repo + artifact write-only).

info

14 Jun 2026, 10:39 am

Completed task 4: Security model: API keys, secrets, and multi-tenant boundaries

success

14 Jun 2026, 10:39 am

Starting task 5: MCP server transport and compatibility plan (Antigravity-ready)

info

14 Jun 2026, 10:39 am

Selected stdio as canonical MCP transport; defined HTTP/SSE as optional secondary transport for control-plane UI and judge demo.

info

14 Jun 2026, 10:40 am

Defined strict tool metadata + JSON schema requirements, streaming event contract, and canonical error semantics with traceId.

info

14 Jun 2026, 10:40 am

Auth posture: stdio local trust + sandbox/redaction; HTTP requires bearer by default, optional judge-only bypass under explicit flag.

info

14 Jun 2026, 10:40 am

Produced mcp-transport-compatibility-plan.v1 covering transport selection, contracts, checklist, and judge recipe.

success

14 Jun 2026, 10:40 am

Completed task 5: MCP server transport and compatibility plan (Antigravity-ready)

success

14 Jun 2026, 10:40 am

Starting task 6: Workflow engine design for mission runs

info

14 Jun 2026, 10:40 am

Defined run/step/event/artifact model for traceable, replayable mission runs with deterministic plan/config snapshots.

info

14 Jun 2026, 10:41 am

Specified MCP tools for run lifecycle: start_run, update_run_status, append_run_log, attach_artifact, list_runs, get_run (schemas + idempotency).

info

14 Jun 2026, 10:41 am

Documented runner behavior for client-driven execution (v1) with future server-driven background runner (v2).

info

14 Jun 2026, 10:41 am

Completed task 6: Workflow engine design for mission runs

success

14 Jun 2026, 10:41 am

Starting task 7: Repository inspection tool design (safe and performant)

info

14 Jun 2026, 10:41 am

Designed repo inspection MCP tools: register, list_files, read_file, search_text, summarize_structure, compute_stats with sandboxing, limits, caching and default denies.

info

14 Jun 2026, 10:42 am

Added path traversal and symlink-escape protections, denylist for secrets, and caps for time/bytes/files to prevent exfiltration and DoS.

info

14 Jun 2026, 10:42 am

Defined hash-based fingerprints and LRU+disk cache with TTL and stale-while-revalidate; avoided expensive hashing by default.

info

14 Jun 2026, 10:42 am

Completed task 7: Repository inspection tool design (safe and performant)

success

14 Jun 2026, 10:42 am

Starting task 8: OpenAI-powered document generation pipeline (traceable outputs)

info

14 Jun 2026, 10:42 am

Defined doc generation pipeline with immutable revisions + provenance (template version/digest, model params, input context refs, and artifact hashes).

info

14 Jun 2026, 10:43 am

Specified MCP tools for generating PRD/Tech Design and for listing/getting/diffing document versions; added idempotency + concurrency controls.

info

14 Jun 2026, 10:43 am

Outlined secure OpenAI key handling (server-only), redaction of prompts/inputs, and provenance guarantees suitable for judges (traceable, exportable).

info

14 Jun 2026, 10:43 am

Completed task 8: OpenAI-powered document generation pipeline (traceable outputs)

success

14 Jun 2026, 10:43 am

Mission completed successfully.

success

14 Jun 2026, 10:44 am

Mission Outputs

Artifacts

27 artifacts

Mission Plan

plan

This mission delivers a from-scratch Codex Mission Control experience: an MCP server layer plus a Next.js control plane that lets an Antigravity-compatible client inspect repositories, trigger mission workflows, generate PRDs/technical designs, and resume work with preserved execution memory. Execution starts by locking success criteria and a judge-centric demo storyboard (Task 0) so every subsequent decision supports a reliable end-to-end flow. Next, the MCP tool taxonomy is defined (Task 1) with strict schemas and predictable errors to minimize client integration friction. With tools defined, the project’s data model and local persistence are specified (Task 2) to ensure traceability: every mission run, tool call, generated document revision, and memory entry is durable and exportable. Security is treated as a first-class deliverable (Task 3): server-side-only API key usage, redaction policies, and explicit boundaries for filesystem/repository access. Transport and compatibility decisions (Task 4) align the MCP server with Antigravity expectations, including discovery, error semantics, and any streaming requirements. The operational core is a lightweight workflow engine (Task 5) that records run IDs, statuses, logs, and artifacts for replayability. Repository inspection tools (Task 6) are designed with guardrails (repo-root confinement, size limits, caching) so clients can safely query real codebases. On top of that foundation, the OpenAI document pipeline (Task 7) generates PRDs and technical designs with provenance: prompt template versions, model settings, and referenced inputs are stored alongside immutable document revisions. The execution memory system (Task 8) preserves context across sessions with scoping, redaction, and retrieval tools. 
The Next.js control plane (Task 9) is built around the demo flow: a project can be created, a repo inspected, a mission run initiated, documents generated and reviewed with provenance, and memory browsed to show continuity across sessions. Observability and auditability (Task 10) ensure judge confidence: structured logs, a timeline, and a downloadable demo bundle that contains all artifacts. Finally, the Antigravity integration guide and a deterministic demo script (Task 11) make the project easy to run and evaluate. The mission closes with quality gates (Task 12): schema validation, security checks, caching correctness, and UX polish to ensure the system performs reliably under live judging conditions without exposing secrets or generating direct code for users.

Scope (artifact.scope.v1)

summary

Goal: local-first Next.js Control Plane hosting a local MCP server with tools for repo inspection, mission/task management, doc generation (PRD + Tech Design), workflow execution, and execution-memory persistence; includes Antigravity integration guide and polished demo flow. In-scope: UI (Projects/Missions/Docs/Runs/Settings), MCP tools (project/task/doc/run/context/memory), SQLite persistence, secure OpenAI key handling, traceable/cached doc generation keyed by commit+params. Non-goals: SaaS/multitenant, CI/CD/GitHub App, default shell/code execution, mandatory file editing/PR automation, complex vector stack.

Success Criteria (artifact.success_criteria.v1)

analysis

Judge-facing: Antigravity connects and lists tools; create/select project and repo path; repo inspection summary with commit hash + cache status + trace_id; generate PRD and Tech Design with stable IDs and input references; trigger workflow run producing run log + task_plan.json; reconnect and retrieve preserved execution memory; UI shows artifacts/runs with trace IDs. Targets: tool listing p95 200ms; repo inspect p95 1500ms; PRD p95 12s; Tech Design p95 15s; memory get p95 300ms; cache hit >=60% faster. Reliability: >=95% happy-path across 5 runs; 30 min crash-free; missing key returns actionable error while non-LLM tools work. Security: never return API key; refuse LLM tools without key; repo access constrained to project root. Traceability: every artifact includes id/created_at/tool/inputs hash+refs/trace_id/repo_commit_hash; run logs persisted.

Demo Storyboard (artifact.demo_storyboard.v1)

execution

6-minute flow: (1) Connect Antigravity to MCP; list tools + server info + key status. (2) Create/select project and attach repo path; confirm root scope. (3) Repo inspect (read-only); show summary JSON, commit hash, cold cache; optionally view in UI. (4) Generate PRD; show doc preview + document_id + trace_id + input refs. (5) Generate Tech Design referencing PRD + repo summary; show architecture/MCP tools/persistence/security/caching + linked metadata. (6) Trigger workflow demo.bootstrap_plan; show status transitions, run log with durations, produced task_plan.json. (7) Reconnect; fetch memory/context; show recent artifacts, key decisions, last run pointers. Fallbacks: stub/cached artifacts on OpenAI failure; reduce depth/use cache if repo scan slow; use UI as secondary invocation path if transport mismatch.

Selected MCP feature set

analysis

Repository/project introspection with safe local path scoping; task/workflow orchestration with runs/events/status; traceable doc generation (PRD/Tech Design/ADR) with provenance links; execution-context memory + artifacts store; local-first persistence + caching with pagination and stable identifiers; consistent MCP tool naming, schemas, and error shapes.

Design principles

review

Tools are side-effect explicit; mutating tools are idempotent via idempotency_key; all entities return stable IDs + created_at/updated_at; cursor pagination with deterministic ordering; typed machine-readable errors with request_id; project path access sandboxed via allowlisted roots.
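
As an illustration of the idempotency principle above, the following is a minimal sketch only. It assumes an in-memory map standing in for a persisted idempotency table, and the names runIdempotent, IdempotencyRecord, and hashPayload are hypothetical, not part of the mission artifacts.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of a stored idempotency record.
interface IdempotencyRecord {
  payloadHash: string;
  response: unknown;
}

// In-memory stand-in for a persisted idempotency table (assumption).
const idempotencyRecords = new Map<string, IdempotencyRecord>();

function hashPayload(payload: unknown): string {
  // JSON.stringify is key-order sensitive; a real implementation would
  // canonicalize the payload before hashing.
  return createHash("sha256").update(JSON.stringify(payload)).digest("hex");
}

// Runs `mutate` at most once per (tool, idempotencyKey) pair.
// A replay with the same payload returns the stored response;
// a replay with a different payload is rejected as a conflict.
function runIdempotent<T>(
  tool: string,
  idempotencyKey: string,
  payload: unknown,
  mutate: () => T,
): T {
  const recordKey = `${tool}:${idempotencyKey}`;
  const payloadHash = hashPayload(payload);
  const existing = idempotencyRecords.get(recordKey);
  if (existing) {
    if (existing.payloadHash !== payloadHash) {
      throw new Error("conflict: idempotency_key reused with a different payload");
    }
    return existing.response as T;
  }
  const response = mutate();
  idempotencyRecords.set(recordKey, { payloadHash, response });
  return response;
}
```

Keying the record by tool name plus key mirrors the "scope tool|project" idea in the tool inventory; a project-scoped variant would add the project ID to the record key.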

MCP Tool Inventory + Schemas (v1)

summary

Artifact artifact:mcp_tool_taxonomy_v1 (application/json). Defines naming conventions (group.action), ID prefixes (proj_/repo_/task_/wf_/run_/doc_/adr_/mem_/art_/evt_), RFC3339 UTC timestamps, schema_versioning in responses. Common types: Envelope {schema_version, request_id, ok, data, error, meta{cursor_next, has_more, warnings}}, ErrorShape {type, message, details, retryable, request_id, status_code}, Pagination {cursor, limit 1-200 default 50, order asc|desc default desc}, Idempotency {idempotency_key required for mutating tools, scope tool|project}. Error catalog includes invalid_request/unauthorized/forbidden/not_found/conflict/rate_limited/timeout/provider_error/io_error/db_error/precondition_failed. Tool groups: project (list/get/create/repo.status/repo.tree/repo.read_file/repo.search), task (list/create/update/workflow.list/run.create/run.get/run.events/run.cancel), docs (list/get/generate/adr.record), context (session.open/memory.put/memory.search/artifact.put/artifact.get/run_context.get). Security: server-side OpenAI key only, allowlisted-root sandboxing with path normalization, secret redaction. Persistence/caching: SQLite entities + file cache; repo tree/file read caching keyed by sha256(path+mtime+size) with TTL and invalidation on dirty/commit change; provenance via run_id + inputs_hash and stored tool invocations/event pointers.
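
The common types listed above could be expressed roughly as follows. This is a sketch under assumptions: the field names come from the taxonomy summary, but the schema_version value, the set of retryable error types, and the helper names (okEnvelope, errorEnvelope, clampLimit) are hypothetical.

```typescript
type ErrorType =
  | "invalid_request" | "unauthorized" | "forbidden" | "not_found" | "conflict"
  | "rate_limited" | "timeout" | "provider_error" | "io_error" | "db_error"
  | "precondition_failed";

interface ErrorShape {
  type: ErrorType;
  message: string;
  details: Record<string, unknown>;
  retryable: boolean;
  request_id: string;
  status_code: number;
}

interface EnvelopeMeta {
  cursor_next: string | null;
  has_more: boolean;
  warnings: string[];
}

interface Envelope<T> {
  schema_version: string;
  request_id: string;
  ok: boolean;
  data: T | null;
  error: ErrorShape | null;
  meta: EnvelopeMeta;
}

const SCHEMA_VERSION = "1"; // assumption: the real version string is not given
// Assumption: the taxonomy does not say which error types are transient.
const RETRYABLE = new Set<ErrorType>(["rate_limited", "timeout", "provider_error"]);
const EMPTY_META: EnvelopeMeta = { cursor_next: null, has_more: false, warnings: [] };

// Pagination limit: 1-200, default 50, as stated in the taxonomy.
function clampLimit(limit?: number): number {
  if (limit === undefined || !Number.isFinite(limit)) return 50;
  return Math.min(200, Math.max(1, Math.trunc(limit)));
}

function okEnvelope<T>(requestId: string, data: T, meta: EnvelopeMeta = EMPTY_META): Envelope<T> {
  return { schema_version: SCHEMA_VERSION, request_id: requestId, ok: true, data, error: null, meta };
}

function errorEnvelope(
  requestId: string,
  type: ErrorType,
  message: string,
  statusCode: number,
): Envelope<never> {
  const error: ErrorShape = {
    type,
    message,
    details: {},
    retryable: RETRYABLE.has(type),
    request_id: requestId,
    status_code: statusCode,
  };
  return { schema_version: SCHEMA_VERSION, request_id: requestId, ok: false, data: null, error, meta: EMPTY_META };
}
```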

Local data model v1 (conceptual schema)

analysis

Entities: Project, Repo, Mission, Task, Run, RunEvent, Document, Artifact, MemoryEntry, Embedding. ID format ULID; slugs kebab-case. Document versioning is immutable (new row per regen; latest = max(version) per scope/type). RunEvents support payload offload via dataRef to artifact store.

SQLite + artifact-store persistence strategy

execution

SQLite at ./.mcp/controlplane.db stores metadata/relationships/state/indexes/provenance pointers. Artifact store at ./.mcp/artifacts/ uses content addressing sha256/<aa>/<hash> with refs like artifact://sha256/<hash>. Writes sandboxed to ./.mcp; repo roots read-only by default.
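
The content-addressing layout can be sketched as a pure function. One assumption is made here: that <aa> stands for the first two hex characters of the hash, which is a common sharding convention but is not stated explicitly above. The name casLocate is hypothetical.

```typescript
import { createHash } from "node:crypto";

interface CasLocation {
  hash: string;         // hex-encoded sha256 of the content
  ref: string;          // artifact://sha256/<hash>
  relativePath: string; // path below ./.mcp/artifacts/
}

// Derives the artifact ref and on-disk location from the content alone,
// so identical content always maps to the same ref and file.
function casLocate(content: string | Uint8Array): CasLocation {
  const hash = createHash("sha256").update(content).digest("hex");
  return {
    hash,
    ref: `artifact://sha256/${hash}`,
    // Assumption: <aa> is the first two hex characters of the hash.
    relativePath: `sha256/${hash.slice(0, 2)}/${hash}`,
  };
}
```

Because the ref depends only on the bytes, writing the same artifact twice is naturally deduplicated, which is what makes merge-by-sha256 in the export bundle possible.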

Judge export/import bundle format

summary

Deterministic zip: db.sqlite, artifacts/ (referenced only unless --full), index.json (entities + checksums), README.txt, documents/ (materialized latest PRD/TechDesign). Import validates checksums, refuses paths outside .mcp, supports new namespace or merge-by-ulid with dedup by sha256.

Security Model v1 (Secrets, Auth, Tenant Boundaries, Threat Model)

summary

Principles: server-side secrets only; least privilege; filesystem sandbox under MCP_ALLOWED_ROOT (normalize/realpath, prevent symlink escape); default-local binding; default-deny network egress; structured redacted logs. Secret handling: env vars only; no DB/artifact/public config storage; server-side OpenAI/VCS calls; sanitize provider errors. Redaction: regex detectors + header allow/deny lists + truncation. MCP non-leak: response redaction filter; schemas avoid secret fields; TRACE_PROMPTS gated and redacted. Auth: default bearer_token_env; optional basic_auth_env and reverse_proxy_trust; no long-lived cookies; future cookie hardening noted. Multi-tenant: v1 single_user_local; project->subdir mapping; future tenant_id + per-tenant roots/keys. Threat focus: filesystem traversal/symlink/secret files; prompt injection; repo exfiltration; mitigations and residual risks documented. Demo notes for judges included.
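
A rough sketch of the redaction step (regex detectors, a header deny list, and truncation) might look like the following. The specific patterns, the denied header names, and the 2000-character cap are illustrative assumptions, not values from the security model.

```typescript
// Hypothetical detectors; a real deployment would tune and extend these.
const DETECTORS: { name: string; pattern: RegExp }[] = [
  { name: "openai_key", pattern: /sk-[A-Za-z0-9_-]{16,}/g },
  { name: "bearer_token", pattern: /Bearer\s+[A-Za-z0-9._~+\/-]+=*/gi },
];

// Assumption: headers that must never appear in logs or responses.
const DENIED_HEADERS = new Set(["authorization", "cookie", "x-api-key"]);

// Replaces detected secrets with a marker, then truncates long output.
function redactText(text: string, maxLength = 2000): string {
  let out = text;
  for (const detector of DETECTORS) {
    out = out.replace(detector.pattern, `[REDACTED:${detector.name}]`);
  }
  return out.length > maxLength ? `${out.slice(0, maxLength)}...[truncated]` : out;
}

// Drops the value of denied headers entirely and scans the rest.
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const result: Record<string, string> = {};
  for (const key of Object.keys(headers)) {
    result[key] = DENIED_HEADERS.has(key.toLowerCase()) ? "[REDACTED]" : redactText(headers[key]);
  }
  return result;
}
```

Applying the same functions to tool responses, logs, and sanitized provider errors keeps the redaction policy in one place.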

Threat Model Table v1

analysis

Structured threat/mitigation/residual-risk set covering: filesystem access (path traversal, symlink escape, sensitive dotfiles, repo hook writes), prompt injection in docs (instructions to reveal secrets/exfiltrate/copy content), and repo content exfiltration (broad reads forwarded to providers, accidental large uploads, unbounded egress).

MCP Tool Spec: security.secrets.check (v1)

review

Validates required/optional secrets and runtime constraints without disclosing values (no values/hashes/prefixes/lengths; no external provider checks unless explicitly configured). Input: scopes [openai|vcs_github|vcs_gitlab|control_plane_auth], strict boolean. Output: ok, checks[] (scope/name/status/reason/remediation), runtime flags (server_side_only_enforced, redaction_enabled, mcp_response_filters_enabled).

MCP Tool Spec: security.auth.describe (v1)

review

Reports configured auth mode and tenant boundary mode without disclosing tokens. Output: mode (none_localhost_only|bearer_token_env|basic_auth_env|reverse_proxy_trust), tenant_boundary (single_user_local|multi_tenant_soft|multi_tenant_hard), notes[].

Key decisions

analysis

Primary transport: MCP over stdio (spawned subprocess). Secondary (optional): MCP over HTTP with SSE streaming. Strict JSON Schema for all tools (additionalProperties=false). Streaming supported via progress/log events with guaranteed final non-stream result.

Minimal judge connection recipe

summary

Stdio: set MCP_ALLOWED_ROOT and server-only OPENAI_API_KEY; run `node <entrypoint>` via MCP client; refresh tools; call repo.inspect {repoPath:'.', depth:2} and verify artifactRef; call docs.generate.prd and verify documentId, artifactRef, provenance (runId/sourceRefs). Optional HTTP/SSE: bind localhost, use bearer (or explicit JUDGE_MODE bypass), GET /mcp/manifest, POST /mcp/call, watch /mcp/stream for progress -> terminal result.

Compatibility checklist highlights

review

Transport: stdio handshake; optional HTTP+SSE. Tool metadata: stable discoverable tool list with input/output schemas. Invocation: schema validation with INVALID_ARGUMENT + validationErrors; mutations accept clientRequestId for dedupe. Streaming: progress/log events with runId; terminal result includes artifact refs. Errors: canonical format with traceId; no secrets; retryable set for transient. Security: sandbox paths under MCP_ALLOWED_ROOT (PathOutsideSandbox), HTTP bearer by default, egress allowlist (EgressBlocked).
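
To make the strict-schema and INVALID_ARGUMENT behavior concrete, here is a hand-rolled sketch. It assumes a simplified schema subset (flat objects with primitive properties); the repoInspectInput schema is a hypothetical rendering of the repo.inspect arguments from the connection recipe, and a real server would more likely delegate to a full JSON Schema validator.

```typescript
type PrimitiveType = "string" | "number" | "boolean";

interface ToolInputSchema {
  type: "object";
  properties: Record<string, { type: PrimitiveType }>;
  required: string[];
  additionalProperties: false;
}

// Hypothetical schema for repo.inspect {repoPath, depth}.
const repoInspectInput: ToolInputSchema = {
  type: "object",
  properties: { repoPath: { type: "string" }, depth: { type: "number" } },
  required: ["repoPath"],
  additionalProperties: false,
};

interface ValidationFailure {
  code: "INVALID_ARGUMENT";
  validationErrors: string[];
}

// Returns null when the input is valid, otherwise an INVALID_ARGUMENT failure.
function validateInput(
  schema: ToolInputSchema,
  input: Record<string, unknown>,
): ValidationFailure | null {
  const validationErrors: string[] = [];
  for (const key of schema.required) {
    if (!Object.prototype.hasOwnProperty.call(input, key)) {
      validationErrors.push(`missing required property: ${key}`);
    }
  }
  for (const key of Object.keys(input)) {
    if (!Object.prototype.hasOwnProperty.call(schema.properties, key)) {
      // additionalProperties=false: unknown keys are rejected, not ignored.
      validationErrors.push(`unknown property: ${key}`);
      continue;
    }
    const expected = schema.properties[key].type;
    if (typeof input[key] !== expected) {
      validationErrors.push(`property ${key} must be of type ${expected}`);
    }
  }
  return validationErrors.length > 0 ? { code: "INVALID_ARGUMENT", validationErrors } : null;
}
```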

Workflow runner v1 spec (excerpt)

summary

Core entities: Run (runId ULID/UUIDv7, missionId/projectId, status, timestamps, planSnapshot/configSnapshot/repoStateSnapshot hashes, promptSnapshotRefs, idempotencyKey/clientRequestId, traceId); Step (stepId, dependsOn, status, determinism fields); RunEvent append-only (types: RUN_CREATED, RUN_STATUS_CHANGED, STEP_STARTED/PROGRESS/COMPLETED, LOG_APPENDED, ARTIFACT_ATTACHED, ERROR_RECORDED; fields include eventId, ts, level, message, dataRef, optional hashPrev); Artifact (name/kind/mimeType/contentRef + provenance). Runner: orchestration-only, records intent/snapshots/events; supports replay via snapshots + CAS refs; optimistic concurrency with version/expectedVersion; strict status transition rules.
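
The strict status transitions and the version/expectedVersion check can be sketched together. The status names and the transition table below are assumptions chosen for illustration, since the excerpt does not enumerate them; updateRunStatus is a hypothetical pure function, not the tool itself.

```typescript
// Assumption: a plausible status set; the spec excerpt does not list one.
type RunStatus = "queued" | "running" | "succeeded" | "failed" | "cancelled";

// Terminal statuses have no outgoing transitions.
const ALLOWED_TRANSITIONS: Record<RunStatus, RunStatus[]> = {
  queued: ["running", "cancelled"],
  running: ["succeeded", "failed", "cancelled"],
  succeeded: [],
  failed: [],
  cancelled: [],
};

interface RunRecord {
  runId: string;
  status: RunStatus;
  version: number;
}

type UpdateResult =
  | { ok: true; run: RunRecord }
  | { ok: false; error: "CONFLICT" | "INVALID_TRANSITION" };

// Optimistic concurrency: the caller states which version it last saw.
// A stale version yields CONFLICT; a disallowed move yields INVALID_TRANSITION.
function updateRunStatus(run: RunRecord, next: RunStatus, expectedVersion: number): UpdateResult {
  if (run.version !== expectedVersion) {
    return { ok: false, error: "CONFLICT" };
  }
  if (ALLOWED_TRANSITIONS[run.status].indexOf(next) === -1) {
    return { ok: false, error: "INVALID_TRANSITION" };
  }
  // Return a new record rather than mutating, and bump the version.
  return { ok: true, run: { runId: run.runId, status: next, version: run.version + 1 } };
}
```

In the described design each successful transition would also append a RUN_STATUS_CHANGED event, so the event log remains the audit source of truth.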

MCP tools and persistence notes

analysis

Tools (idempotent): runs.start_run (create run with plan/config/repo snapshots, returns runId + snapshot hashes + eventsCursor); runs.update_run_status (run + step updates, expectedVersion, returns version); runs.append_run_log (step-scoped optional, CAS spill for large data, returns eventId/ts/dataRef); runs.attach_artifact (CAS content, provenance links); runs.list_runs (cursor pagination + status filter); runs.get_run (include plan/config/events/artifacts, eventsCursor/eventsLimit). Persistence: tables Run/RunStep/RunEvent/Artifact; RunEvent is append-only and is the audit source of truth; status fields can be derived/updated for fast queries; CAS stores canonical JSON snapshots and large payloads/artifact content by hash.

Produced files (as referenced by task)

review

workflow-runner-v1-spec.json: Tool specs, status transitions, event taxonomy, snapshot requirements, idempotency/concurrency guidance. run-event-taxonomy-v1.json: Canonical RunEvent types/fields; UI timeline guidance and CAS spillover approach.

Repository inspection MCP toolset (safe + performant)

summary

Tools: repo.register (sandboxed repoId + persisted policy), repo.list_files (allow/deny + pagination + stable cursors + optional stats/hashes), repo.read_file (size/encoding caps + optional base64), repo.search_text (ripgrep-like with caps/timeout/regex safeguards), repo.summarize_structure (tree + highlights, no content reads), repo.compute_stats (counts/top extensions/bytes). Guardrails: allowed-root realpath enforcement, traversal + NUL rejection, symlink-escape prevention, denylist for secrets/heavy dirs, deterministic ordering, additionalProperties=false in schemas.

Caching + fingerprinting design

analysis

Goals: avoid repeated directory walks; stay correct under repo changes without git. Keys: RepoSnapshotKey=sha256(realRepoPath+policyHash+snapshotSalt). ListCacheKey/ReadCacheKey/SearchCacheKey derived from snapshot key + op params + fingerprints. Fingerprints: fileFingerprint=sha256(path+size+mtime); dirFingerprint=sha256(dirPath+stable child metadata for first N + childCount + dirMtime); scopeFingerprint=sha256(concat(dirFingerprints)). Storage: in-memory LRU (maxEntries=500, ~32MB, ttl=30s) + disk cache (CAS blobs + SQLite index, ttl=5m, SWR=10m). Rules: never cache large file contents; policy hash invalidates; search caches store limited match snippets only.
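
The fingerprint formulas above can be sketched as follows. Two details are assumptions: fields are joined with a NUL separator before hashing (the design only says the values are combined), and the "first N" sample size defaults to 64.

```typescript
import { createHash } from "node:crypto";

interface FileMeta {
  path: string;
  size: number;
  mtimeMs: number;
}

function sha256Hex(parts: string[]): string {
  // Assumption: a NUL separator, so ("ab", "c") and ("a", "bc") hash differently.
  return createHash("sha256").update(parts.join("\0")).digest("hex");
}

// fileFingerprint = sha256(path + size + mtime)
function fileFingerprint(file: FileMeta): string {
  return sha256Hex([file.path, String(file.size), String(file.mtimeMs)]);
}

// dirFingerprint = sha256(dirPath + metadata of first N children + childCount + dirMtime)
function dirFingerprint(
  dirPath: string,
  children: FileMeta[],
  dirMtimeMs: number,
  firstN = 64, // assumption: the design leaves N unspecified
): string {
  // Sort a copy so the result does not depend on directory read order.
  const sorted = children.slice().sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
  const sampled = sorted.slice(0, firstN).map(fileFingerprint);
  return sha256Hex([dirPath, ...sampled, String(children.length), String(dirMtimeMs)]);
}

// scopeFingerprint = sha256(concat(dirFingerprints))
function scopeFingerprint(dirFingerprints: string[]): string {
  return sha256Hex(dirFingerprints);
}
```

Since only metadata is hashed, a change that preserves path, size, and mtime would go undetected; the TTL on the cache bounds how long such a stale entry can live.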

Security + safety safeguards

review

Path sandbox: repo must be registered under MCP_ALLOWED_ROOT; normalize repo-relative paths and reject absolute paths, .. segments, UNC paths, and NUL bytes; join then realpath and enforce under repo root. Resource limits: per-request time budgets (search ~1500ms; list/summary ~2000ms), file read caps (256KB default; 1MB hard), walk caps (depth/entries/files). Sensitive data: default deny globs for secrets; refused reads for denied paths; search never enumerates/returns denied paths. Output stability: deterministic ordering + stable cursors; strict schemas to reduce injection surface.
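
The path sandbox steps (reject suspicious input, join, realpath, confine to the repo root) could look roughly like this. It is a sketch: resolveInSandbox and PathOutsideSandboxError are hypothetical names (the error name echoes the PathOutsideSandbox code mentioned in the compatibility checklist), and the realpath function is injectable so the logic can be exercised without touching the filesystem.

```typescript
import * as path from "node:path";
import { realpathSync } from "node:fs";

class PathOutsideSandboxError extends Error {
  constructor(reason: string) {
    super(`PathOutsideSandbox: ${reason}`);
    this.name = "PathOutsideSandboxError";
  }
}

// Resolves a repo-relative path to a real path guaranteed to lie under repoRoot.
function resolveInSandbox(
  repoRoot: string,
  relPath: string,
  realpath: (p: string) => string = realpathSync,
): string {
  if (relPath.indexOf("\0") !== -1) {
    throw new PathOutsideSandboxError("NUL byte in path");
  }
  if (path.isAbsolute(relPath) || relPath.startsWith("\\\\") || /^[A-Za-z]:/.test(relPath)) {
    throw new PathOutsideSandboxError("absolute, drive-letter, or UNC path");
  }
  // Reject any ".." segment, treating both slash styles as separators.
  if (relPath.split(/[\\/]+/).indexOf("..") !== -1) {
    throw new PathOutsideSandboxError("parent-directory segment");
  }
  // Join first, then resolve symlinks, then check containment on the real paths.
  const realRoot = realpath(repoRoot);
  const candidate = realpath(path.join(realRoot, relPath));
  const relative = path.relative(realRoot, candidate);
  if (relative === ".." || relative.startsWith(`..${path.sep}`) || path.isAbsolute(relative)) {
    throw new PathOutsideSandboxError("resolved path escapes the repo root");
  }
  return candidate;
}
```

The containment check runs on the resolved path rather than the input string, which is what catches a symlink inside the repo that points outside it.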

Pipeline overview & immutability/provenance design

analysis

Goal: generate PRDs/Tech Designs with traceable provenance and immutable revisions for repeatability, diffing, and audit trails. Steps: resolve context refs (repo/memory/doc/run, capture hashes); assemble prompt from pinned template -> digest; call OpenAI server-only (record model/params); post-process & validate (optionally formatting pass as separate modelCall); persist immutable DocumentRevision with CAS blob refs. Immutability: content and provenance stored as CAS (sha256) and never altered; corrections create new revision with parentRevisionId.
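
The immutability rule (corrections create a new revision pointing at its parent, and nothing is altered in place) can be sketched as below. The revisionId format, the field names, and appendRevision itself are hypothetical; the design calls for ULIDs, which are replaced here by a simple derived ID to keep the example self-contained.

```typescript
import { createHash } from "node:crypto";

const sha256 = (text: string): string => createHash("sha256").update(text).digest("hex");

// Provenance records model settings and input hashes, never credentials or headers.
interface Provenance {
  templateId: string;
  templateVersion: string;
  promptDigest: string;
  model: string;
  params: Record<string, unknown>;
  inputHashes: string[];
}

interface DocumentRevision {
  revisionId: string;
  index: number; // monotonic, starting at 1
  parentRevisionId: string | null;
  contentRef: string;
  provenanceRef: string;
}

// Returns a new history with one more revision; the input history is not modified.
function appendRevision(
  history: readonly DocumentRevision[],
  content: string,
  renderedPrompt: string,
  base: Omit<Provenance, "promptDigest">,
): DocumentRevision[] {
  const parent = history.length > 0 ? history[history.length - 1] : null;
  const provenance: Provenance = { ...base, promptDigest: sha256(renderedPrompt) };
  const contentHash = sha256(content);
  const index = parent ? parent.index + 1 : 1;
  const revision: DocumentRevision = {
    // Assumption: a derived ID instead of the ULID the design specifies.
    revisionId: `rev_${index}_${contentHash.slice(0, 12)}`,
    index,
    parentRevisionId: parent ? parent.revisionId : null,
    contentRef: `artifact://sha256/${contentHash}`,
    provenanceRef: `artifact://sha256/${sha256(JSON.stringify(provenance))}`,
  };
  return [...history, revision];
}
```

Storing only the prompt digest in provenance, with the rendered prompt kept as a separate (optionally redacted) artifact, matches the redact-by-default stance described for the document tools.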

MCP tools (generate/list/get/diff) conventions

execution

Tools: docs.generate_prd, docs.generate_tech_design, docs.list_documents, docs.get_document, docs.diff_document_versions. Conventions: idempotency via clientRequestId (optionally forceNewRevision); concurrency via expectedDocumentLatestRevisionId returning CONFLICT on mismatch; pagination via stable cursor. get_document supports includeContent/includeProvenance and redactPrompt default true; diff supports unified/summary/json_patch.

Security, storage, and export expectations

review

Security rules: OpenAI API key server-only and never stored in provenance; prompt redaction by default; repo ingestion uses sandboxed read-only tools with denylist (.env/keys) and size caps; provenance stores model params not credentials/headers; optional secret-scanning heuristics emit warnings. Storage: SQLite metadata tables for Document and DocumentRevision + indexes; CAS artifacts store prompts (redacted optional), content, provenance JSON, optional metadata. Export: zip bundle with sha256 manifest including all referenced artifacts to verify provenance offline.

Key data model additions

summary

Entities: Document (metadata, latestRevisionId, status/tags) and DocumentRevision (monotonic index, parentRevisionId, content/provenance artifact IDs + sha256, createdBy, clientRequestId, summary, optional sectionIndex). Provenance schema links template (id/version/digest + rendered prompt artifacts), modelCalls (provider/model/endpoint/params/timestamps), inputs (repo/memory/document/run refs with hashes), and outputs (revision id + artifact list).

Final Summary

Mission intelligence cockpit

A compact command-center view of what was learned, what is risky, and what should happen next.

Ready for handoff

9 Risks

6 Next Steps

8 Stack Items

14 Tables

Outcome

Completed a conceptual v1 design for a local-first MCP server and Next.js Control Plane integrated with Antigravity, covering secure repo inspection, workflow/run execution with durable memory, and OpenAI-backed PRD/tech design generation with immutable revisions and provenance.

Risk Radar

Risks and mitigations

  • Antigravity MCP transport/client convention mismatches (stdio vs HTTP/SSE expectations) could break integration.
  • Accidental secret/API key exposure via tool outputs, logs, stored artifacts, or persisted context.
  • Filesystem sandbox escape via path traversal or symlink tricks during repo inspection could leak host files.
  • Incorrect idempotency or concurrency handling could create duplicated runs/docs or inconsistent status.
  • Non-deterministic LLM outputs and latency variance can undermine replayability and demo timing.
  • Schema/streaming model mismatches (JSON Schema subset support, event streaming expectations) can reduce compatibility and observability.
  • Artifact store and event/log volume can grow unbounded, impacting disk usage and performance if retention/GC is weak.
  • Repo search/stat operations can be slow on large repos (regex complexity, large file sets), causing timeouts or UI lag.
  • Cache staleness may return outdated repo inspection results on rapidly changing filesystems.

Execution Path

Recommended next steps

  • Finalize concrete MCP tool contracts and strict JSON Schemas (additionalProperties=false), including canonical error envelope, traceId, pagination, and idempotency keys.
  • Implement the local-first persistence stack (SQLite metadata + CAS artifact store) including idempotency records, optimistic concurrency, retention/GC, and export/import bundles with checksums (judge-mode support).
  • Build the MVP tool subset for the demo: repo.register/tree/read_file/search/stats, runs.start/get/list/events/log/attach_artifact, docs.generate/get/diff, context.session + memory.put/search.
  • Implement centralized redaction middleware for responses, logs, stored artifacts, and error payloads; add security regression tests (sandbox escape, secret leakage, denylisted paths).
  • Produce the Antigravity integration guide and a scripted, reproducible demo call sequence with fixed expected IDs/hashes and clear transport setup (stdio-first; HTTP/SSE optional).
  • Implement basic Control Plane UI flows: run timeline with event pagination, artifact viewer, documents list/revision/provenance/diff views, and repo browser panels with cache indicators.

Architecture

Technical foundation


Tech stack

  • Node.js
  • TypeScript
  • MCP server (stdio-first transport; optional HTTP + SSE)
  • Next.js (Control Plane UI)
  • SQLite
  • CAS filesystem artifact store (sha256-addressed artifacts)
  • OpenAI API (document generation pipeline)
  • JSON Schema (strict tool contracts)

Database

SQLite for canonical metadata/relationships/provenance pointers, paired with a content-addressed filesystem (CAS) artifact store for large blobs (documents/logs/json) referenced as artifact://sha256/<hash>.

Tables

  • projects
  • repos
  • missions
  • tasks
  • runs
  • run_events
  • documents
  • document_revisions
  • artifacts
  • memory_entries
  • embeddings
  • idempotency_records
  • export_bundles
  • auth_tenants

Project Shape

Suggested file structure

  • server/
  • server/src/tools/
  • server/src/tools/projects/
  • server/src/tools/repos/
  • server/src/tools/runs/
  • server/src/tools/docs/
  • server/src/tools/context/
  • server/src/tools/security/
  • server/src/persistence/sqlite/
  • server/src/persistence/cas/
  • server/src/transport/stdio/
  • server/src/transport/http-sse/
  • server/src/middleware/redaction/
  • server/src/schemas/
  • server/src/export-import/
  • control-plane/
  • control-plane/app/
  • control-plane/components/
  • control-plane/lib/mcp-client/
  • control-plane/pages/api/

Operating Model

Best practices and handoff path


Best practices

  • Default-deny security posture for filesystem access (MCP_ALLOWED_ROOT), network exposure (local-only), and secret handling (server-only keys).
  • Centralized redaction applied consistently to tool responses, logs, stored artifacts, and error payloads; never serialize env vars or auth headers.
  • Strict, deterministic tool schemas and contracts (JSON Schema, additionalProperties=false) with a canonical error envelope and traceId.
  • Idempotent mutations using persisted idempotency records keyed by tool/scope/clientRequestId (and payload hash), returning prior responses on replays.
  • Replayable, traceable runs using immutable snapshots and append-only RunEvent logs; every state transition emits an event.
  • Durable provenance for documents and runs (template digest, model params, input hashes, referenced artifacts).
  • Use CAS for large payloads to keep SQLite lean and enable immutable, verifiable references.
  • Resource limits and deterministic outputs for repo inspection (stable ordering, pagination/cursors, time/bytes/files caps).
  • Safe path handling: normalize relative paths, resolve real paths, block traversal and symlink escapes, and return clear sandbox errors.
  • Retention/GC with judge-mode exemptions, plus cursor pagination and CAS spill for high-volume logs/events.

How to use this context

  • mcp://server/describe
  • mcp://tools/list
  • mcp://tools/projects/*
  • mcp://tools/repos/register
  • mcp://tools/repos/list_files
  • mcp://tools/repos/read_file
  • mcp://tools/repos/search_text
  • mcp://tools/repos/summarize_structure
  • mcp://tools/repos/compute_stats
  • mcp://tools/runs/start_run
  • mcp://tools/runs/update_run_status
  • mcp://tools/runs/append_run_log
  • mcp://tools/runs/attach_artifact
  • mcp://tools/runs/list_runs
  • mcp://tools/runs/get_run
  • mcp://tools/docs/generate_prd
  • mcp://tools/docs/generate_tech_design
  • mcp://tools/docs/list_documents
  • mcp://tools/docs/get_document
  • mcp://tools/docs/diff_document_versions
  • mcp://tools/context/session_open
  • mcp://tools/context/memory_put
  • mcp://tools/context/memory_search
  • mcp://tools/security/secrets_check
  • mcp://tools/security/auth_describe
  • artifact://sha256/<hash>