Architecture

System Overview

Component Details

Client Layer

Claude Code: Primary integration via plugin hooks
OpenCode: Alternative editor integration
CLI: Direct command-line access

Integration Layer

Plugin Hooks: Lifecycle event handlers for automatic doc injection
MCP Server: Model Context Protocol for explicit tool calls

Service Layer

Worker Service: HTTP server handling all requests
Orchestrator: Coordinates indexing and retrieval operations

Processing Pipeline

Component	Responsibility
Fetcher	HTTP requests with caching and rate limiting
Extractor	HTML/Markdown parsing, content extraction
Chunker	Splits content into semantic chunks
Crawler	Discovers and queues linked pages
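
To make the Chunker's role concrete, here is a minimal sketch that splits text on blank lines and merges neighbouring paragraphs up to a size limit. It is not the project's actual chunker: the `TextSpan` type, the `chunkByParagraph` name, and the `maxChars` parameter are assumptions for illustration, and real semantic chunking would likely also consider headings and code blocks.

```typescript
// Hypothetical span type: a piece of text plus the offsets it came from.
interface TextSpan {
  content: string;
  startIndex: number; // inclusive offset into the source text
  endIndex: number;   // exclusive offset into the source text
}

// Split on blank lines, then greedily merge paragraphs while the merged
// span stays within maxChars. A single paragraph longer than maxChars is
// kept whole rather than cut mid-sentence.
function chunkByParagraph(text: string, maxChars: number): TextSpan[] {
  const spans: TextSpan[] = [];
  const paragraph = /[^\n]+(?:\n[^\n]+)*/g; // a run of consecutive non-empty lines
  let current: TextSpan | null = null;
  let match: RegExpExecArray | null = paragraph.exec(text);

  while (match !== null) {
    const start = match.index;
    const end = start + match[0].length;
    if (current !== null && end - current.startIndex <= maxChars) {
      current.endIndex = end; // extend the open span over this paragraph
    } else {
      if (current !== null) spans.push(current);
      current = { content: "", startIndex: start, endIndex: end };
    }
    match = paragraph.exec(text);
  }
  if (current !== null) spans.push(current);

  // Fill in content last so merged spans keep their original separators.
  return spans.map((s) => ({ ...s, content: text.slice(s.startIndex, s.endIndex) }));
}
```

Keeping offsets alongside the content means a chunk can always be traced back to its position in the extracted page text.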

Embedding Layer

Provider	Type	Use Case
Local	TF-IDF	No API required, fast
OpenAI	Neural	High quality, general purpose
Voyage	Neural	Optimized for code
Cohere	Neural	Multi-language support
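
As a rough illustration of why the Local provider needs no external API, the sketch below computes TF-IDF vectors entirely in process. It is an assumption-laden example, not the project's implementation: `tokenize`, `buildTfIdfEmbedder`, the tokenization rule, and the smoothed IDF formula are all choices made here for illustration.

```typescript
// Lowercase the text and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Fit a vocabulary and smoothed IDF weights on a corpus, then return a
// function that maps any text to an L2-normalized TF-IDF vector.
function buildTfIdfEmbedder(corpus: string[]): (text: string) => number[] {
  const docFreq = new Map<string, number>();
  corpus.forEach((doc) => {
    // Count each term at most once per document.
    Array.from(new Set(tokenize(doc))).forEach((term) => {
      docFreq.set(term, (docFreq.get(term) ?? 0) + 1);
    });
  });

  // Sort the vocabulary so vector positions are stable between runs.
  const vocab = Array.from(docFreq.keys()).sort();
  const position = new Map<string, number>();
  vocab.forEach((term, i) => position.set(term, i));
  const idf = vocab.map(
    (term) => Math.log((1 + corpus.length) / (1 + (docFreq.get(term) ?? 0))) + 1
  );

  return (text: string): number[] => {
    const counts: number[] = vocab.map(() => 0);
    tokenize(text).forEach((term) => {
      const i = position.get(term);
      if (i !== undefined) counts[i] += 1; // terms outside the vocabulary are ignored
    });
    const weighted = counts.map((tf, i) => tf * idf[i]);
    const norm = Math.sqrt(weighted.reduce((sum, w) => sum + w * w, 0));
    return norm === 0 ? weighted : weighted.map((w) => w / norm);
  };
}
```

One consequence of this approach is that the vector dimension equals the vocabulary size of the fitted corpus, so vectors produced by different corpora (or by a neural provider) are not comparable with one another.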

Storage Layer

Store	Purpose
SQLite	Docset and page metadata (metadata.db)
Vector Store	Embedding vectors (one JSON file per page, grouped by docset)
Content Cache	Raw fetched content ({hash}.html files)

Indexing Flow

Seed-First Strategy

Immediate: Index the seed page synchronously
Background: Crawl and index discovered pages asynchronously
Benefit: Users get immediate results while full indexing continues
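
The seed-first idea can be sketched as an async function that awaits only the seed page and hands back a promise for the rest of the crawl. This is an illustration under assumptions, not the project's orchestrator: `indexSeedFirst`, the `IndexPage` callback, `SeedFirstHandle`, and the `maxPages` cap are hypothetical names introduced here.

```typescript
// Hypothetical callback: indexes one page and resolves with the links found on it.
type IndexPage = (url: string) => Promise<string[]>;

interface SeedFirstHandle {
  // Resolves with the number of non-seed pages indexed in the background.
  background: Promise<number>;
}

async function indexSeedFirst(
  seedUrl: string,
  indexPage: IndexPage,
  maxPages: number = 100 // assumed safety cap, not a documented setting
): Promise<SeedFirstHandle> {
  const seen = new Set<string>([seedUrl]);
  const queue: string[] = [];
  const enqueue = (links: string[]): void => {
    links.forEach((link) => {
      if (!seen.has(link)) {
        seen.add(link); // each URL is queued at most once
        queue.push(link);
      }
    });
  };

  // Immediate: the caller waits only for the seed page.
  enqueue(await indexPage(seedUrl));

  // Background: started here but deliberately not awaited.
  const background = (async (): Promise<number> => {
    let indexed = 0;
    while (queue.length > 0 && indexed < maxPages) {
      const url = queue.shift() as string;
      try {
        enqueue(await indexPage(url));
        indexed += 1;
      } catch {
        // A failed page is skipped so one bad URL does not stop the crawl.
      }
    }
    return indexed;
  })();

  return { background };
}
```

A caller can answer the first query as soon as `indexSeedFirst` resolves, and await `background` only if it needs to know when the full crawl has finished.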

Retrieval Flow

Data Flow

Data Models

Docset

interface Docset {
  id: string;
  name: string;
  baseUrl: string;
  seedSlug: string;
  status: 'indexing' | 'complete' | 'error';
  createdAt: Date;
  updatedAt: Date;
}

Page

interface Page {
  id: string;
  docsetId: string;
  url: string;
  title: string;
  status: 'pending' | 'indexed' | 'failed';
  contentHash: string;
  indexedAt: Date;
}

Chunk

interface Chunk {
  id: string;
  pageId: string;
  content: string;
  startIndex: number;
  endIndex: number;
  embedding: number[];
}
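
To show how the `embedding` field might be used at query time, the sketch below ranks chunks by cosine similarity against a query vector. The similarity metric is an assumption (the source does not name one), and `cosineSimilarity`, `topK`, and `ScoredChunk` are hypothetical helpers rather than part of the documented API.

```typescript
interface ScoredChunk {
  id: string;
  score: number; // cosine similarity in [-1, 1]
}

// Cosine similarity of two equal-length vectors; returns 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query and keep the k best, highest first.
// Only the id and embedding fields of a Chunk are needed here.
function topK(
  query: number[],
  chunks: Array<{ id: string; embedding: number[] }>,
  k: number
): ScoredChunk[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

This brute-force scan is linear in the number of chunks, which is usually acceptable for a single docset but would need an index structure at larger scale.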

File Structure

Directory Structure

~/.mem-oracle/
├── config.json           # User configuration
├── metadata.db           # SQLite database
├── cache/                # Fetched content cache
│   └── {hash}.html      
├── vectors/              # Vector embeddings
│   └── {docsetId}/
│       └── {pageId}.json
├── worker.pid            # Worker process ID
└── worker.log            # Worker logs
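
The layout above maps naturally onto a couple of path-building helpers. The following is a sketch only: `vectorPath`, `cachePath`, and `safeSegment` are hypothetical names, the validation rule is an assumption, and how `{hash}` is derived is not specified in the source.

```typescript
// Reject identifiers that could escape the storage root (assumed validation rule).
function safeSegment(value: string): string {
  if (value.length === 0 || value === "." || value === ".." || /[\\\/]/.test(value)) {
    throw new Error(`invalid path segment: ${value}`);
  }
  return value;
}

// vectors/{docsetId}/{pageId}.json under the given root directory.
function vectorPath(root: string, docsetId: string, pageId: string): string {
  return `${root}/vectors/${safeSegment(docsetId)}/${safeSegment(pageId)}.json`;
}

// cache/{hash}.html under the given root directory.
function cachePath(root: string, hash: string): string {
  return `${root}/cache/${safeSegment(hash)}.html`;
}
```

The `root` argument should already be an absolute path: a literal `~` is expanded by the shell, not by the runtime, so it has to be resolved before these helpers are called.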
