Architecture
Understanding how Mem-Oracle works under the hood.
- Claude Code: Primary integration via plugin hooks
- OpenCode: Alternative editor integration
- CLI: Direct command-line access
- Plugin Hooks: Lifecycle event handlers for automatic doc injection
- MCP Server: Model Context Protocol for explicit tool calls
- Worker Service: HTTP server handling all requests
- Orchestrator: Coordinates indexing and retrieval operations
| Component | Responsibility |
|---|---|
| Fetcher | HTTP requests with caching and rate limiting |
| Extractor | HTML/Markdown parsing, content extraction |
| Chunker | Splits content into semantic chunks |
| Crawler | Discovers and queues linked pages |
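
To illustrate the Chunker row above, here is a minimal sketch of one way to split content into chunks while tracking character offsets. It assumes paragraphs (separated by blank lines) are the semantic unit and packs them into chunks up to a size limit. The names `TextChunk`, `paragraphSpans`, `chunkText`, and `maxChars` are hypothetical and not taken from Mem-Oracle; the actual Chunker may use different boundaries.

```typescript
// Hypothetical sketch: paragraph-based chunking with source offsets.
interface TextChunk {
  content: string;
  startIndex: number; // offset of the first character in the source text
  endIndex: number;   // offset one past the last character
}

// Returns [start, end) offsets of paragraphs separated by two or more newlines.
function paragraphSpans(text: string): Array<[number, number]> {
  const spans: Array<[number, number]> = [];
  const separator = /\n{2,}/g;
  let start = 0;
  let match: RegExpExecArray | null;
  while ((match = separator.exec(text)) !== null) {
    if (match.index > start) spans.push([start, match.index]);
    start = match.index + match[0].length;
  }
  if (start < text.length) spans.push([start, text.length]);
  return spans;
}

// Packs consecutive paragraphs into chunks of at most maxChars characters.
// A single paragraph longer than maxChars is kept whole as its own chunk.
function chunkText(text: string, maxChars = 1000): TextChunk[] {
  const chunks: TextChunk[] = [];
  let chunkStart = -1;
  let chunkEnd = -1;
  const flush = (): void => {
    if (chunkStart >= 0) {
      chunks.push({
        content: text.slice(chunkStart, chunkEnd),
        startIndex: chunkStart,
        endIndex: chunkEnd,
      });
    }
    chunkStart = -1;
  };
  for (const [start, end] of paragraphSpans(text)) {
    // Close the current chunk if adding this paragraph would exceed the limit.
    if (chunkStart >= 0 && end - chunkStart > maxChars) flush();
    if (chunkStart < 0) chunkStart = start;
    chunkEnd = end;
  }
  flush();
  return chunks;
}
```

Keeping offsets alongside the text means a chunk can always be traced back to its position in the extracted page content.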
| Provider | Type | Use Case |
|---|---|---|
| Local | TF-IDF | No API required, fast |
| OpenAI | Neural | High quality, general purpose |
| Voyage | Neural | Optimized for code |
| Cohere | Neural | Multi-language support |
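
The Local provider in the table above relies on TF-IDF, which needs no external API. The sketch below shows the general technique only: it builds a vocabulary from a corpus, weights term counts by a smoothed inverse document frequency, and L2-normalizes the result. The class name `TfIdfEmbedder`, the tokenizer, and the smoothing formula are assumptions for illustration, not Mem-Oracle's actual implementation.

```typescript
// Hypothetical sketch of a TF-IDF embedder; not the project's real code.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

class TfIdfEmbedder {
  private readonly vocab = new Map<string, number>(); // term -> vector dimension
  private readonly idf: number[] = [];

  constructor(corpus: string[]) {
    const docFreq: number[] = [];
    for (const doc of corpus) {
      // Count each term once per document.
      new Set(tokenize(doc)).forEach((term) => {
        let dim = this.vocab.get(term);
        if (dim === undefined) {
          dim = this.vocab.size;
          this.vocab.set(term, dim);
          docFreq.push(0);
        }
        docFreq[dim] += 1;
      });
    }
    // Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
    for (const df of docFreq) {
      this.idf.push(Math.log((1 + corpus.length) / (1 + df)) + 1);
    }
  }

  // Returns an L2-normalized vector; terms outside the vocabulary are ignored.
  embed(text: string): number[] {
    const vec: number[] = new Array<number>(this.vocab.size).fill(0);
    for (const term of tokenize(text)) {
      const dim = this.vocab.get(term);
      if (dim !== undefined) vec[dim] += this.idf[dim];
    }
    const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
    return norm === 0 ? vec : vec.map((x) => x / norm);
  }
}
```

Because the vectors are normalized, the dot product of two embeddings equals their cosine similarity. The trade-off against the neural providers is that TF-IDF only matches shared terms, not paraphrases.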
| Store | Purpose |
|---|---|
| SQLite | Docset and page metadata |
| Vector Store | Embedding vectors (JSON files) |
| Content Cache | Raw fetched content |
- Immediate: Index the seed page synchronously
- Background: Crawl and index discovered pages asynchronously
- Benefit: Users get immediate results while full indexing continues
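
The two-phase strategy above can be sketched as follows. This is a minimal illustration, not the Orchestrator's actual code: `indexDocset`, `IndexPage`, `IndexHandle`, and `maxPages` are hypothetical names, and the page-indexing function is passed in as a parameter so the example stays self-contained.

```typescript
// Hypothetical sketch: index the seed page first, then crawl in the background.
// An IndexPage function indexes one URL and returns the links found on it.
type IndexPage = (url: string) => Promise<string[]>;

interface IndexHandle {
  background: Promise<void>; // settles when the background crawl finishes
}

async function indexDocset(
  seedUrl: string,
  indexPage: IndexPage,
  maxPages = 50,
): Promise<IndexHandle> {
  const seen = new Set<string>([seedUrl]);

  // Immediate: the caller waits only for the seed page.
  const queue = (await indexPage(seedUrl)).slice();

  // Background: breadth-first crawl of discovered links, not awaited here.
  const crawl = async (): Promise<void> => {
    while (queue.length > 0 && seen.size < maxPages) {
      const url = queue.shift() as string;
      if (seen.has(url)) continue;
      seen.add(url);
      try {
        queue.push(...(await indexPage(url)));
      } catch {
        // One failed page should not stop the crawl. A real implementation
        // would presumably record the page with status 'failed'.
      }
    }
  };

  return { background: crawl() };
}
```

The caller can query as soon as `indexDocset` resolves and may optionally await `background` to know when the full crawl is done.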
interface Docset {
  id: string;
  name: string;
  baseUrl: string;
  seedSlug: string;
  status: 'indexing' | 'complete' | 'error';
  createdAt: Date;
  updatedAt: Date;
}

interface Page {
  id: string;
  docsetId: string;
  url: string;
  title: string;
  status: 'pending' | 'indexed' | 'failed';
  contentHash: string;
  indexedAt: Date;
}

interface Chunk {
  id: string;
  pageId: string;
  content: string;
  startIndex: number;
  endIndex: number;
  embedding: number[];
}
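
Since each `Chunk` carries its own `embedding`, retrieval can be illustrated as a ranking over stored chunks. The sketch below assumes a brute-force cosine-similarity scan, which is a common approach for small file-based vector stores. Whether Mem-Oracle ranks chunks this way is an assumption, and `cosineSimilarity` and `topK` are hypothetical helper names.

```typescript
// Chunk shape repeated from the data model so the example is self-contained.
interface Chunk {
  id: string;
  pageId: string;
  content: string;
  startIndex: number;
  endIndex: number;
  embedding: number[];
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical helper: score every chunk against the query and keep the best k.
function topK(
  query: number[],
  chunks: Chunk[],
  k = 3,
): Array<{ chunk: Chunk; score: number }> {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan costs time proportional to the number of chunks per query, which is usually acceptable for a handful of docsets. Larger collections would typically call for an approximate nearest-neighbor index.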
~/.mem-oracle/
├── config.json          # User configuration
├── metadata.db          # SQLite database
├── cache/               # Fetched content cache
│   └── {hash}.html
├── vectors/             # Vector embeddings
│   └── {docsetId}/
│       └── {pageId}.json
├── worker.pid           # Worker process ID
└── worker.log           # Worker logs
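
The layout above maps directly onto a small path helper. The following is an illustrative sketch only: `storagePaths` is a hypothetical function, the root directory is passed in explicitly rather than resolved from `~`, and paths are joined with `/`, which assumes a POSIX-style filesystem.

```typescript
// Hypothetical helper mirroring the directory layout shown above.
function storagePaths(root: string) {
  return {
    config: `${root}/config.json`,
    metadataDb: `${root}/metadata.db`,
    workerPid: `${root}/worker.pid`,
    workerLog: `${root}/worker.log`,
    // Cached raw content, keyed by hash.
    cacheFile: (hash: string): string => `${root}/cache/${hash}.html`,
    // One JSON file of vectors per page, grouped by docset.
    vectorFile: (docsetId: string, pageId: string): string =>
      `${root}/vectors/${docsetId}/${pageId}.json`,
  };
}
```

Grouping vector files by docset means a docset's embeddings can be removed by deleting a single directory.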