📚 Zap Platform Documentation

Complete documentation for the Zap platform, including development guides, testing procedures, and infrastructure details.


Zap Family Roadmap

Vision

Unify the three separate Zap projects (Zap, Zap-Mail, Zap-Cal) into a single integrated platform for personal communications, calendar, and writing assistance. The end state is a self-hosted "personal command centre" that handles email, calendar, document management, and AI-assisted writing from one interface.

Current Architecture (Feb 2026)

/var/www/zap/          Main Zap project (dashboard, shared infra)
/var/www/zap-mail/     Gmail client: focus inbox, sync, contacts, rules
/var/www/zap-cal/      Google Calendar client: event display, sync

All three share the same Google Cloud project and OAuth client ID but use separate ports (Zap-Mail: 8080, Zap-Cal: 8081) and separate token files. They are deployed on orcus.lan (Ubuntu server) behind Apache virtual hosts.
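As a sketch, the port-per-app vhost split might look like the following. This is illustrative only: the ports are as documented above, but the ServerName and DocumentRoot values are assumptions.

```apache
# Illustrative sketch only -- not the live orcus.lan config.
# Ports are documented; document roots and ServerName are assumed.
Listen 8080
Listen 8081

<VirtualHost *:8080>
    ServerName orcus.lan
    DocumentRoot /var/www/zap-mail/web
</VirtualHost>

<VirtualHost *:8081>
    ServerName orcus.lan
    DocumentRoot /var/www/zap-cal/web
</VirtualHost>
```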

Pain Points

  • Three separate codebases with overlapping dependencies (Google PHP client, SQLite, config patterns)
  • OAuth confusion: same client ID, different redirect URIs and scopes. Re-authorising one app can break the others if ports are mixed up. Documented extensively in zap-mail/README.md.
  • No shared authentication layer: Each project manages its own tokens
  • No unified UI: Switching between mail and calendar means different URLs
  • Writing tools scattered: Google Docs accessed via browser, Word docs require conversion, no integrated drafting/revision support

---

    Phase 1: Immediate Tools (Q1 2026) -- IN PROGRESS

    1a. Historical Email Backfill (DONE)

  • zap-mail/bin/backfill-history.php -- import older emails with custom Gmail queries
  • Supports --query, --after, --before, --dry-run, retry with backoff
  • Used to backfill all Charlie Hall emails (833) and Mexico/oil-related messages (4000+)
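An illustrative invocation of the backfill tool, using the flags listed above (the query values here are examples, not the actual queries used):

```shell
# Dry-run first to preview what a query would import
php bin/backfill-history.php --query='from:charlie' --after=2020/01/01 --dry-run

# Then run for real; transient Gmail API errors are retried with backoff
php bin/backfill-history.php --query='mexico OR oil' --before=2024/01/01
```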

    1b. Google Drive Integration (DONE)

  • Added drive.readonly OAuth scope to Zap-Mail
  • zap-mail/bin/search-drive.php -- search Drive by name/content, list revisions, export Google Docs as text
  • Supports --query, --type, --shared, --revisions=ID, --export=ID

    1c. Project-Specific Timelines (DONE)

  • zap-mail/bin/mexico-chapter-timeline.php -- generates chronological timeline combining emails, Drive revisions, and Word doc uploads
  • Produces both console output and markdown file
  • Template can be adapted for other research projects

---

    Phase 1d: Zap-Projects -- ACTIVE

    A project management and AI chat history hub for the Zap platform. Lives at /var/www/zap/apps/zap-projects/.

    Chat History & Deep Analysis (DONE)

  • 23 chat transcripts ingested with full metadata (apps, topics, files, message counts)
  • 348 plan files cross-referenced to chats
  • FTS5 full-text search across all transcripts
  • LLM summaries generated for all chats via delphi.lan (qwen3:30b)
  • Deep structured analysis (artifacts, decisions, unfinished work) for all chats
  • Live file watcher (zap-chat-watcher.service) auto-ingests new/modified transcripts via inotify
  • Cron job every 30 minutes as safety net for deep analysis
  • Web UI: chat history browser, transcript viewer, AI Q&A endpoint
  • Energystats asset inventory (charts, scripts, data files)
  • Link on orcus.lan home page
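The FTS5 search can be exercised directly from sqlite3. A sketch of the kind of query involved; the table name and column layout are assumptions for illustration, but MATCH, rank, and snippet() are standard FTS5:

```sql
-- Hypothetical FTS5 table name; -1 lets snippet() auto-pick the column
SELECT chat_id, snippet(chats_fts, -1, '[', ']', '…', 12) AS hit
FROM chats_fts
WHERE chats_fts MATCH 'mexico AND timeline'
ORDER BY rank
LIMIT 10;
```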

    Project Hub (IN PROGRESS -- Feb 2026)

    Major expansion: project-centric organisation of all Cursor chats and plans.
  • Database migration: SQLite to PostgreSQL (zap DB, cursor_ prefixed tables)
  • Project registry: First-class projects (auto-discovered from chats, manually enrichable)
  • Many-to-many: Chats and plans can belong to multiple projects
  • Web pages: Project index (card/list toggle), Cursor hub, individual project detail pages
  • REST JSON API: Full API with API key auth, ready for external access
  • MCP server: Python MCP server for Cursor/AI agent integration (local + remote modes)
  • Global Cursor rule: AI assistant proactively uses MCP tools across all workspaces
  • Docs viewer: Rendered markdown docs at /projects/docs/
  • Manual project creation: Create projects via UI modal with slug, aliases, match patterns
  • Project-chat discovery: Configurable aliases and match_patterns per project; rescan button + automatic matching during ingestion
  • Cursor title sync from paris.lan (DONE): sync-cursor-titles.php SSHes to paris.lan (where Cursor runs on Windows 11), reads state.vscdb per workspace, syncs actual Cursor sidebar titles and metadata (lines added/removed, files changed, mode, archive status) to Postgres. Runs every 5 minutes via cron. Detects title renames and logs history to cursor_chat_title_history.
  • 4-tier title priority: display_title (user override) > cursor_title (synced from paris.lan) > llm_title (from deep analysis) > first_query_short (fallback)
  • Inline rename: Pencil icon on Cursor Hub and project detail pages to set display_title via PATCH API
  • LLM title drift detection: deep-analyse-chat.php detects when the LLM-generated title changes and logs it
  • MCP deployment verified (DONE): Project-level .cursor/mcp.json deployed to all 6 orcus workspaces. Global config on paris.lan at %USERPROFILE%\.cursor\mcp.json. Discovered that for SSH remote workspaces, the project-level config is the one that actually provides tools to the agent (global config only shows in UI). Full setup guide, troubleshooting, and architecture documented in apps/zap-projects/README.md.
  • Cross-project documentation (DONE): All project READMEs (philoenic, philanthropy-planner, prospecta, prospecta.cc, quickstep) updated with standardised cross-chat context block pointing to zap-projects MCP tools and web UI.
  • Plan auto-indexing (DONE): File watcher now monitors /home/jd/.cursor/plans/ and auto-indexes .plan.md files via --plans-only mode. Plans have their own FTS vector and GIN index. Chat FTS expanded to include full transcript text.
  • Cursor Hub UI polish (DONE): Plans list view working, datetime display with times (DD MMM HH:MM) for both chats and plans in card and list views.
  • Chat/plan links fixed (DONE): Chat transcript viewer migrated from SQLite to PostgreSQL. Plan viewer page created with markdown rendering. Both chats and plans are now clickable from the hub.
  • Lazy-load plans with pagination (DONE): Plans load 50 at a time with "Load more" button, matching chat pagination pattern. Tab selection (Chats/Plans) persists to localStorage. URL query param ?tab=plans supports deep-linking from plan-viewer back link.
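In SQL terms the 4-tier title priority above is a straightforward COALESCE. The table name cursor_chats is an assumption (the roadmap only specifies a cursor_ prefix); the column names are the tiers as listed:

```sql
-- Falls through in priority order; NULL means "not set at this tier"
SELECT id,
       COALESCE(display_title, cursor_title, llm_title, first_query_short) AS title
FROM cursor_chats;
```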

    Database Backup System (DONE -- Feb 2026)

    Centralized backup infrastructure for all databases across the LAN.

  • PostgreSQL backups (cron/backup-pg.sh): Daily automated dumps with 7 daily + 4 weekly + 3 monthly rotation
  • Dual-channel notifications: Telegram + ntfy.sh on backup failure, credentials from ~/.credentials
  • Backup history tracking: backup_history table records all events (success/failure) with paths, sizes, errors
  • SQLite backup integration: Philoenic and zap-mail scripts updated to use shared notification library and log to database
  • Status pages: orcus.lan/databases.php (local inventory) and zap.orcus.lan/projects/databases.php (LAN-wide view with backup history)
  • Clickable backup folders: file:// protocol links to open NAS folders directly from browser (Windows registry file provided)
  • See plans: zap_project_hub_a03d3fc4.plan.md, project_hub_enhancements_17edd6b1.plan.md, plans_lazy_load_sticky_dc372c8e.plan.md, fix_chat_and_plan_links_cc7a4a3d.plan.md
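A minimal sketch of the daily-dump-plus-rotation pattern described above. This is not the real cron/backup-pg.sh; paths, database name, and the retention mechanism are illustrative assumptions:

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Assumes passwordless local pg_dump access
# and a /backups/pg target; the real script also handles weekly/monthly
# tiers and failure notifications.
set -euo pipefail
ts=$(date +%F)
dest=/backups/pg
pg_dump -Fc zap > "$dest/daily/zap-$ts.dump"
# Keep 7 daily dumps; weekly and monthly tiers would prune similarly
find "$dest/daily" -name 'zap-*.dump' -mtime +7 -delete
```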

    Task Management (forthcoming)

  • Central task list queryable by: "What are my most important outstanding tasks?"
  • Tasks linked to apps (zap-mail, philoenic, etc.), chats, and plan files
  • Priority levels, due dates, dependencies
  • Initially CLI + simple web UI; grows into full project management

    Why Zap-Projects Matters

  • Currently tasks are scattered across roadmaps, plan files, handover docs, and memory
  • No single view of "what should I work on next?"
  • Zap-Projects is the connective tissue between all other Zap apps
  • AI agents need to interrogate project history, chats, and plans across all workspaces

---

    Phase 2: Zap-Writer Module -- ACTIVE (Feb 2026)

    AI-assisted writing tool with document ingestion, RAG search, and editorial engines. Lives at /var/www/zap/apps/zap-writer/.

    Completed (Feb 2026)

  • Google Docs ingestion: multi-tabbed doc fetching via Docs API v1, text extraction from paragraphs and tables, hierarchical path metadata
  • File ingestion: PDF, DOCX, TXT, MD, HTML, RTF support with text extraction and configurable chunking
  • pgvector embeddings: nomic-embed-text embeddings on titan (RTX 3060), HNSW index for cosine similarity search
  • Multi-source RAG search: 4-part synthesised answers (project docs with citations, LLM knowledge, SearXNG web + LLM, external library placeholder)
  • Editorial engines: edit harmonisation, fact-check, currency scanner, bibliography checker, chart manager (built for Mexico chapter)
  • LLM gateway integration: auto-routes embedding and chat requests to the best available Ollama server (delphi/phoebe/titan)
  • Projects: Hidden Money (philanthropy book, 424 chunks embedded), Mexico oil chapter (78 edits, 528 bibliography entries)

    Forthcoming

  • External document ingestion (HIGH PRIORITY): Ingest books, reports, PDFs into the chunking + embedding pipeline. The CLI tool (ingest-file.php) already supports this -- needs to be run against the user's document collection.
  • Streaming responses: SSE-based streaming for long LLM synthesis calls (currently ~30-60s blocking).
  • Document version tracking: Track versions across Google Docs, Word uploads, and email attachments.
  • Writing assistant: LLM-powered drafting, summarisation, and revision suggestion within project context.
  • Cross-format support: Unified revision history across Google Docs, Word, and PDF.

    Architecture

  • Backend: PHP 8.3 + Apache 2.4, PostgreSQL 17 with pgvector, Ollama LLMs via LLM gateway
  • Embedding: nomic-embed-text on titan (dedicated, avoids delphi model swapping)
  • Search: Tiered -- SearXNG -> Brave -> Google via SearchClient
  • LLM routing: Centralised gateway at llm.orcus.lan (delphi 2x3090, phoebe 3060, titan 3060)
  • Frontend: Vanilla JS, dark theme, tabbed workspace UI
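The pgvector retrieval step can be sketched as follows. The table and column names are assumptions, but the `<=>` cosine-distance operator and the HNSW operator class are standard pgvector:

```sql
-- HNSW index for cosine distance (pgvector)
CREATE INDEX ON chunks USING hnsw (embedding vector_cosine_ops);

-- Top-5 nearest chunks to a query embedding ($1 is the query vector)
SELECT id, source, 1 - (embedding <=> $1) AS cosine_similarity
FROM chunks
ORDER BY embedding <=> $1
LIMIT 5;
```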

---

    Phase 2b: Zap-RAG Shared Service -- PLANNED (Feb 2026)

Standalone RAG service extracting and improving the retrieval-augmented generation from zap-writer. Any Zap app can consume it via REST API. All local LLMs, privacy-first with LAN-only mode. Design phase complete, informed by 15+ arXiv papers and three frontier-model critiques.

    Code: apps/zap-rag/ Full plan: apps/zap-rag/design/MASTER-PLAN.md Web UI (planned): https://rag.orcus.lan/

    Build Phases

  • Phase 0: Scaffold service (vhost, bootstrap, database with pgvectorscale, extract RAG from zap-writer)
  • Phase 1: Ingestion quality -- late chunking (arXiv:2409.04701), RAPTOR semantic tree (arXiv:2401.18059), Chain-of-Density summaries, document tree/ToC, contextual retrieval, variable-granularity chunking, opt-in proposition/question indexing
  • Phase 2: Privacy / LAN-only mode (per-collection toggle)
  • Phase 3: Retrieval stack -- HyDE+Rocchio, query expansion, hybrid RRF, MMR, cross-encoder reranking (bge-reranker-v2-m3), CRAG with strip refinement, adaptive retrieval cutoff (CAR)
  • Phase 4: Synthesis quality -- chain-of-thought, citation verification, confidence scoring, source highlighting
  • Phase 5: SSE streaming
  • Phase 6: Conversation memory + multi-hop retrieval
  • Phase 7: Delphi data extraction integration (Marker PDF, web upload, ColPali visual retrieval)
  • Phase 8: Evaluation framework (RAGAS, frontier model judging via Claude/GPT/Perplexity, external reviewer UI)
  • Phase 9: Embedding model benchmarking (nomic-embed-text vs mxbai-embed-large vs arctic-embed vs BGE-M3)
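Reciprocal-rank fusion (the "hybrid RRF" step in Phase 3) is simple enough to sketch. This is an illustrative Python version (the stack itself is PHP), using the conventional k=60 constant:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists with reciprocal-rank fusion.

    Each document's score is the sum over lists of 1 / (k + rank),
    where rank is its 1-based position in that list.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" is ranked well by both the keyword list and the vector list,
# so it beats "a", which only one list put first.
fused = rrf_fuse([["a", "b", "c"], ["b", "c", "a"]])  # -> ['b', 'a', 'c']
```

In practice the two input rankings would come from the full-text search and the pgvector search, with the cross-encoder reranker applied to the fused head of the list.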

    Key Architectural Decisions

  • Develop on orcus, design for portability to dedicated container on phoebe
  • PostgreSQL + pgvector + pgvectorscale (orcus now, pg01.lan later)
  • Late chunking as primary embedding strategy (free contextual embeddings)
  • Cross-encoder reranking over LLM-based scoring (faster, more accurate)
  • Frontier models (Claude, GPT, Perplexity) for evaluation only, never production
  • Per-collection privacy toggle excluding cloud models and web search

---

    Phase 3: Shared Infrastructure (Q3-Q4 2026)

    3a. Unified OAuth

  • Single token management service shared by all Zap apps
  • One re-authorization flow covers all scopes (Gmail, Calendar, Drive, Contacts)
  • Token stored in one location, read by all apps
  • Eliminates the port-confusion problem entirely

    3b. Shared Database Layer

  • Move from per-app SQLite to a shared database (now resolved in favour of PostgreSQL 17; see Open Questions)
  • Cross-app queries: "show me emails about this calendar event" or "find the document attached to this email"
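A cross-app query of the "emails about this calendar event" kind might look like the following; the schema here is entirely hypothetical and exists only to illustrate the shape of the join:

```sql
-- Hypothetical shared-schema sketch: mail and calendar linked by event
SELECT m.subject, m.received_at
FROM mail.messages AS m
JOIN cal.event_links AS l ON l.message_id = m.id
WHERE l.event_id = $1
ORDER BY m.received_at;
```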

    3c. Shared UI Components

  • Common navigation, authentication status, notification system
  • Unified search across mail, calendar, and documents
  • 3d. Cross-Project Infrastructure

    See Zap-Projects roadmap at /var/www/zap/apps/zap-projects/ROADMAP.md for cross-project initiatives:

  • Database backup refactoring and standardization
  • Integration with external servers (du1, du2)
  • Shared notification systems
  • Cross-project documentation standards

    See the Orcus server roadmap at /var/www/orcus.lan/ROADMAP.md for infrastructure:

  • Hot swap VM on titan.lan for disaster recovery
  • Delphi.lan LLM redundancy with cloud failover
  • System-wide monitoring and alerting

---

    Phase 4: Project Unification (2027)

    The Unified Zap Platform

    Merge all three projects into a single codebase under /var/www/zap/:

    /var/www/zap/
      apps/
        mail/           (from current zap-mail)
        cal/            (from current zap-cal)
        writer/         (new: document management + AI)
        contacts/       (extracted from zap-mail)
      shared/
        auth/           (unified OAuth)
        database/       (shared DB layer)
        api/            (internal API for cross-module communication)
        ui/             (common frontend components)
      config/
      web/              (single entry point with routing)

    Migration Path

  • Extract shared code (OAuth, DB, config) into shared/ namespace
  • Convert each app to use shared services via dependency injection
  • Build a unified router that mounts each app at a path prefix (/mail, /cal, /writer)
  • Merge databases with migration scripts
  • Retire the separate /var/www/zap-mail/ and /var/www/zap-cal/ directories (keep as git history)

    Benefits

  • One composer.json, one deployment
  • One OAuth flow for everything
  • Cross-module features (email-to-calendar, document-from-email, AI summaries of threads)
  • Single URL with navigation between modules
  • Easier maintenance and fewer things to break

---

    ChatGPT / LLM Integration Details

    API Approach

    // Example: Summarise an email thread for a writing project.
    // Assumes the openai-php/client library (composer require openai-php/client).
    $client = OpenAI::client(getenv('OPENAI_API_KEY'));
    $response = $client->chat()->create([
        'model' => 'gpt-4o',
        'messages' => [
            ['role' => 'system', 'content' => 'Summarise this email thread...'],
            ['role' => 'user', 'content' => $threadContent],
        ],
    ]);
    $summary = $response->choices[0]->message->content;

    Use Cases

  • Thread summarisation: "What does Charlie want me to do about the Mexico chapter?"
  • Version comparison: "What changed between v2.0 and v3.4 of Darley_MexicanOil?"
  • Draft generation: "Write a section on Mexico's petroleum product trade deficit based on these charts and emails"
  • Edit integration: "Apply Charlie's Word doc edits to my Google Doc version"
  • Research compilation: "Gather all data points about Mexico's oil production from these 50 emails"

    Cost Considerations

  • GPT-4o: ~$2.50/1M input tokens, ~$10/1M output tokens
  • Typical email thread summary: ~2K tokens input, ~500 output = ~$0.01
  • Full chapter revision assistance: ~50K tokens = ~$0.60
  • Monthly budget estimate: $5-20 depending on usage intensity
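The per-call arithmetic behind these estimates, as a quick sketch (rates as quoted above; Python used for brevity):

```python
# GPT-4o list prices quoted above, in dollars per million tokens
INPUT_PER_M, OUTPUT_PER_M = 2.50, 10.00

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call at the quoted per-million-token rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# A typical thread summary: 2K tokens in, 500 out -> $0.01
thread_summary = call_cost(2_000, 500)
```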

---

    Documentation Access

    All Zap platform documentation is now accessible via web viewer at:

  • Web Interface: https://zap.orcus.lan/projects/zap-docs
  • Cross-Reference: See orcus.lan server documentation at https://orcus.lan/docs.php
  • File Location: /var/www/zap/docs/ (source files)

    The web viewer provides enhanced navigation, search, and mobile-friendly access to all platform documentation, including CHANGELOG, CODING_HISTORY, ROADMAP, TESTING, INFRASTRUCTURE, and topic guides.

    ---

    Open Questions

  • Database choice: RESOLVED -- migrated to PostgreSQL 17 for concurrent access, full-text search (tsvector), pgvector (forthcoming), and JSONB support.
  • Frontend framework: Current PHP templates with vanilla JS. Move to a modern framework (Alpine.js? htmx? Vue?) for the unified UI?
  • Hosting: Continue on orcus.lan or consider cloud deployment for reliability?
  • Multi-user: Currently single-user (Julian's Gmail). Will this ever need multi-user support?

---

    Quick Reference: Current CLI Tools

    Zap-Mail (/var/www/zap-mail/bin/)

    Script                        Purpose
    ---------------------------   --------------------------------------------------
    backfill-history.php          Import historical emails with custom Gmail queries
    search-drive.php              Search Google Drive, list revisions, export docs
    mexico-chapter-timeline.php   Generate timeline for the Mexico chapter project
    detect-replies.php            Detect which emails have been replied to
    sync-sent.php                 Sync sent messages for reply detection
    apply-categories.php          Apply category rules to existing messages
    llm-categorize.php            AI-powered email categorisation
    merge-labels.php              Merge Gmail label data

    Cron Jobs

  • cron/sync.php -- Main email sync (every minute)
  • cron/monitor-sync.php -- Alert if sync stops working
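As crontab entries, the two jobs might look like this. The every-minute sync cadence is documented above; the monitor cadence and log paths below are assumptions:

```
# Main email sync, every minute (documented cadence)
* * * * * php /var/www/zap-mail/cron/sync.php >> /var/log/zap-mail/sync.log 2>&1

# Alert if sync stops working (every 5 minutes is an assumed cadence)
*/5 * * * * php /var/www/zap-mail/cron/monitor-sync.php
```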