Configuration
Configure mem-oracle settings and embedding providers.
mem-oracle is highly configurable. All settings are stored in ~/.mem-oracle/config.json.
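To inspect the effective settings from a script, the file can be read and layered over the documented defaults. The sketch below is illustrative only: `DEFAULTS` is a hand-copied subset of the default configuration, `deep_merge` and `load_config` are hypothetical helper names, and it assumes that a partial config file is merged over the defaults, which this page does not state explicitly.

```python
import json
from pathlib import Path

# Hand-copied subset of the documented defaults (illustration only).
DEFAULTS = {
    "dataDir": "~/.mem-oracle",
    "embedding": {"provider": "local", "model": "all-MiniLM-L6-v2", "batchSize": 32},
    "worker": {"port": 7432, "host": "127.0.0.1"},
}


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path: Path = Path("~/.mem-oracle/config.json")) -> dict:
    """Read the config file if it exists and layer it over DEFAULTS (assumed behavior)."""
    path = path.expanduser()
    user = json.loads(path.read_text()) if path.exists() else {}
    return deep_merge(DEFAULTS, user)
```

Under this assumption, a file containing only `{"embedding": {"provider": "openai"}}` would change the provider and keep `batchSize` at 32.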
Default Configuration
{
  "dataDir": "~/.mem-oracle",
  "embedding": {
    "provider": "local",
    "model": "all-MiniLM-L6-v2",
    "batchSize": 32
  },
  "vectorStore": {
    "provider": "local",
    "collectionPrefix": "mem-oracle"
  },
  "worker": {
    "port": 7432,
    "host": "127.0.0.1"
  },
  "crawler": {
    "concurrency": 3,
    "requestDelay": 500,
    "timeout": 30000,
    "maxPages": 1000,
    "userAgent": "mem-oracle/1.0 (docs indexer)"
  },
  "hybrid": {
    "enabled": true,
    "alpha": 0.65,
    "vectorTopK": 20,
    "keywordTopK": 20
  },
  "retrieval": {
    "maxChunksPerPage": 3,
    "maxTotalChars": 32000,
    "formatSnippets": true,
    "snippetMaxChars": 2000
  }
}
Configuration Options
Data Directory
| Option | Default | Description |
|---|---|---|
| dataDir | ~/.mem-oracle | Directory for storing all data |
The data directory contains:
~/.mem-oracle/
├── cache/ # Fetched page cache
├── vectors/ # Vector embeddings
├── metadata.db # SQLite database
├── worker.pid # Worker process ID
└── worker.log # Worker logs
Embedding Providers
Local (Default)
Uses TF-IDF for embeddings. No API key required.
{
  "embedding": {
    "provider": "local",
    "model": "all-MiniLM-L6-v2",
    "batchSize": 32
  }
}
OpenAI
{
  "embedding": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "apiKey": "sk-..."
  }
}
| Model | Dimensions | Cost |
|---|---|---|
| text-embedding-3-small | 1536 | $0.00002/1K tokens |
| text-embedding-3-large | 3072 | $0.00013/1K tokens |
| text-embedding-ada-002 | 1536 | $0.0001/1K tokens |
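The per-token prices above translate into an indexing cost with simple arithmetic. The sketch below is a rough estimate only: `estimate_embedding_cost` is a hypothetical helper, and the figure of 2,000 tokens per page is an assumption chosen for illustration, not a measured value.

```python
# Prices per 1K tokens, copied from the table above.
PRICE_PER_1K_TOKENS = {
    "text-embedding-3-small": 0.00002,
    "text-embedding-3-large": 0.00013,
    "text-embedding-ada-002": 0.0001,
}


def estimate_embedding_cost(model: str, pages: int, tokens_per_page: int) -> float:
    """Return the approximate cost in USD of embedding a docset once."""
    total_tokens = pages * tokens_per_page
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS[model]


# 1,000 pages at an assumed 2,000 tokens each is 2M tokens.
cost = estimate_embedding_cost("text-embedding-3-small", pages=1000, tokens_per_page=2000)
print(f"${cost:.2f}")  # prints "$0.04"
```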
Voyage AI
{
  "embedding": {
    "provider": "voyage",
    "model": "voyage-2",
    "apiKey": "..."
  }
}
| Model | Dimensions | Best For |
|---|---|---|
| voyage-2 | 1024 | General purpose |
| voyage-code-2 | 1536 | Code documentation |
Cohere
{
  "embedding": {
    "provider": "cohere",
    "model": "embed-english-v3.0",
    "apiKey": "..."
  }
}
Vector Store Settings
{
  "vectorStore": {
    "provider": "local",
    "collectionPrefix": "mem-oracle"
  }
}
| Option | Default | Description |
|---|---|---|
| provider | local | Vector store provider (local, qdrant, pinecone) |
| url | - | Remote vector store URL |
| apiKey | - | API key for remote provider |
| collectionPrefix | mem-oracle | Collection/index name prefix |
Worker Settings
{
  "worker": {
    "port": 7432,
    "host": "127.0.0.1"
  }
}
| Option | Default | Description |
|---|---|---|
| port | 7432 | HTTP port for the worker |
| host | 127.0.0.1 | Host to bind to |
Crawler Settings
{
  "crawler": {
    "concurrency": 3,
    "requestDelay": 500,
    "timeout": 30000,
    "maxPages": 1000,
    "userAgent": "mem-oracle/1.0 (docs indexer)"
  }
}
| Option | Default | Description |
|---|---|---|
| concurrency | 3 | Number of concurrent requests |
| requestDelay | 500 | Delay between requests (ms) |
| timeout | 30000 | Request timeout (ms) |
| maxPages | 1000 | Maximum pages per docset |
| userAgent | mem-oracle/1.0 (docs indexer) | User agent string for crawler requests |
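To make the interaction between these options concrete, here is a minimal sketch of a rate-limited crawl loop. It is not mem-oracle's crawler: `crawl` and the caller-supplied `fetch` coroutine are hypothetical, and it assumes the delay is applied per concurrent slot after each request, which may differ from the actual behavior.

```python
import asyncio


async def crawl(urls, fetch, concurrency=3, request_delay_ms=500,
                timeout_ms=30000, max_pages=1000):
    """Fetch up to max_pages URLs with bounded concurrency and a per-slot delay."""
    semaphore = asyncio.Semaphore(concurrency)  # caps in-flight requests

    async def fetch_one(url):
        async with semaphore:
            try:
                page = await asyncio.wait_for(fetch(url), timeout_ms / 1000)
            except asyncio.TimeoutError:
                page = None  # treat a timed-out page as missing
            # Hold the slot for the delay so each slot paces its own requests.
            await asyncio.sleep(request_delay_ms / 1000)
            return url, page

    # gather preserves input order; the slice enforces the page cap.
    return await asyncio.gather(*(fetch_one(u) for u in urls[:max_pages]))
```

With this shape, raising `concurrency` increases throughput, while raising `request_delay_ms` lowers the request rate each slot can sustain.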
Hybrid Search Settings
Configure how vector and keyword search are combined for better results.
{
  "hybrid": {
    "enabled": true,
    "alpha": 0.65,
    "vectorTopK": 20,
    "keywordTopK": 20,
    "minKeywordScore": 0
  }
}
| Option | Default | Description |
|---|---|---|
| enabled | true | Enable hybrid search (vector + keyword) |
| alpha | 0.65 | Weight for vector score (0-1). Higher = more vector weight |
| vectorTopK | 20 | Number of vector results to fetch before merging |
| keywordTopK | 20 | Number of keyword results to fetch before merging |
| minKeywordScore | 0 | Minimum keyword score threshold |
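One common way to combine two ranked lists with a single weight is a linear blend, `alpha * vector + (1 - alpha) * keyword`. The sketch below illustrates that idea and is not necessarily the formula mem-oracle uses: `hybrid_merge` is a hypothetical function, and it assumes both score sets are already normalized to the 0-1 range and truncated to their respective top-K.

```python
def hybrid_merge(vector_hits, keyword_hits, alpha=0.65, min_keyword_score=0.0):
    """Blend two {chunk_id: score} maps into one list ranked by combined score."""
    combined = {}
    for chunk_id, score in vector_hits.items():
        combined[chunk_id] = alpha * score
    for chunk_id, score in keyword_hits.items():
        if score < min_keyword_score:
            continue  # drop weak keyword matches before merging
        combined[chunk_id] = combined.get(chunk_id, 0.0) + (1 - alpha) * score
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


ranked = hybrid_merge({"a": 0.9, "b": 0.4}, {"b": 1.0, "c": 0.5})
print([chunk_id for chunk_id, _ in ranked])  # prints ['b', 'a', 'c']
```

In this example chunk `b` outranks `a` because it appears in both lists. With `alpha` set to 1.0 the keyword scores contribute nothing and `a` ranks first.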
Retrieval Settings
Control how search results are formatted and returned for context injection.
{
  "retrieval": {
    "maxChunksPerPage": 3,
    "maxTotalChars": 32000,
    "formatSnippets": true,
    "snippetMaxChars": 2000
  }
}
| Option | Default | Description |
|---|---|---|
| maxChunksPerPage | 3 | Max chunks from same page (diversity) |
| maxTotalChars | 32000 | Total character budget for all results |
| formatSnippets | true | Include formatted snippets with metadata |
| snippetMaxChars | 2000 | Max characters per individual snippet |
Diversity Filtering
The maxChunksPerPage setting caps how many chunks a single page can contribute, so results draw on multiple sources instead of being dominated by one page.
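A per-page cap of this kind can be sketched as a single pass over the ranked results. The function name `limit_chunks_per_page` and the `url` field on each result are assumptions made for illustration.

```python
from collections import Counter


def limit_chunks_per_page(results, max_chunks_per_page=3):
    """Keep results in rank order, skipping chunks once their page hits the cap."""
    seen = Counter()
    kept = []
    for result in results:
        url = result["url"]  # hypothetical field identifying the source page
        if seen[url] < max_chunks_per_page:
            seen[url] += 1
            kept.append(result)
    return kept
```

Because the pass preserves rank order, the highest-scoring chunks from each page are the ones that survive.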
Character Budget
The maxTotalChars setting ensures predictable context injection size for Claude. Results are truncated intelligently at sentence or paragraph boundaries when the budget is exceeded.
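The budget logic described above can be sketched as follows. This is an illustration, not mem-oracle's implementation: both function names are hypothetical, and the boundary detection (a blank line for paragraphs, punctuation followed by a space for sentences) is a deliberately simple assumption.

```python
def truncate_at_boundary(text: str, budget: int) -> str:
    """Cut text to at most budget chars, preferring a paragraph or sentence end."""
    if len(text) <= budget:
        return text
    window = text[:budget]
    paragraph_end = window.rfind("\n\n")
    if paragraph_end > 0:
        return window[:paragraph_end]
    sentence_end = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
    if sentence_end > 0:
        return window[: sentence_end + 1]  # keep the closing punctuation
    return window  # no boundary found: fall back to a hard cut


def apply_char_budget(snippets, max_total_chars=32000):
    """Take snippets in rank order until the total character budget is spent."""
    kept, remaining = [], max_total_chars
    for snippet in snippets:
        if remaining <= 0:
            break
        piece = truncate_at_boundary(snippet, remaining)
        kept.append(piece)
        remaining -= len(piece)
    return kept


print(truncate_at_boundary("One. Two. Three.", 12))  # prints "One. Two."
```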
Snippet Formatting
When formatSnippets is enabled, each result includes a pre-formatted snippet ready for context injection:
## Page Title
Source: https://example.com/docs/page
Section: Api > Authentication > OAuth Setup
The actual content goes here...
Environment Variables
You can also configure mem-oracle through environment variables:
| Variable | Description |
|---|---|
| MEM_ORACLE_PORT | Worker service port |
| MEM_ORACLE_DATA_DIR | Data storage directory |
| MEM_ORACLE_WORKER_URL | Override worker URL for plugin hooks |
| MEM_ORACLE_IDLE_CHECK_MS | Idle shutdown check interval (ms) |
| MEM_ORACLE_REPO_ROOT | Override repo path for plugin scripts |
| MEM_ORACLE_TOP_K | Default number of results (OpenCode plugin) |
| MEM_ORACLE_AUTO_INDEX | Auto-index detected URLs (OpenCode plugin) |
| OPENAI_API_KEY | OpenAI API key |
| VOYAGE_API_KEY | Voyage AI API key |
| COHERE_API_KEY | Cohere API key |
Plugin Configuration
Claude Code Plugin
The Claude Code hooks read the worker URL from MEM_ORACLE_WORKER_URL (defaults to http://127.0.0.1:7432).
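A script that talks to the worker can follow the same lookup rule. In this minimal sketch, `resolve_worker_url` is a hypothetical helper. Treating an empty variable as unset and stripping a trailing slash are conveniences added here, not documented behavior.

```python
import os

DEFAULT_WORKER_URL = "http://127.0.0.1:7432"


def resolve_worker_url(env=os.environ) -> str:
    """Return MEM_ORACLE_WORKER_URL if set and non-empty, else the default."""
    url = env.get("MEM_ORACLE_WORKER_URL") or DEFAULT_WORKER_URL
    return url.rstrip("/")
```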
OpenCode Plugin Environment
| Variable | Default | Description |
|---|---|---|
| MEM_ORACLE_PORT | 7432 | Worker service port |
| MEM_ORACLE_DATA_DIR | ~/.mem-oracle | Data storage directory |
| MEM_ORACLE_TOP_K | 5 | Number of snippets |
| MEM_ORACLE_AUTO_INDEX | true | Auto-index URLs |
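The variables and defaults in the table above could be read into typed settings along these lines. `PluginSettings`, `parse_bool`, and `settings_from_env` are hypothetical names, and the accepted boolean spellings are an assumption, since this page only shows `true`.

```python
import os
from dataclasses import dataclass


@dataclass
class PluginSettings:
    port: int = 7432
    data_dir: str = "~/.mem-oracle"
    top_k: int = 5
    auto_index: bool = True


def parse_bool(raw: str) -> bool:
    """Interpret common truthy spellings (assumed, not documented)."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def settings_from_env(env=os.environ) -> PluginSettings:
    """Build settings from environment variables, falling back to the table defaults."""
    defaults = PluginSettings()
    raw_auto = env.get("MEM_ORACLE_AUTO_INDEX")
    return PluginSettings(
        port=int(env.get("MEM_ORACLE_PORT", defaults.port)),
        data_dir=env.get("MEM_ORACLE_DATA_DIR", defaults.data_dir),
        top_k=int(env.get("MEM_ORACLE_TOP_K", defaults.top_k)),
        auto_index=defaults.auto_index if raw_auto is None else parse_bool(raw_auto),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests.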
Recommendations
For Local Development
{
  "embedding": {
    "provider": "local"
  },
  "crawler": {
    "concurrency": 5,
    "maxPages": 500
  }
}
For Production Quality
{
  "embedding": {
    "provider": "openai",
    "model": "text-embedding-3-small",
    "apiKey": "sk-..."
  },
  "crawler": {
    "concurrency": 3,
    "requestDelay": 1000,
    "maxPages": 2000
  }
}
For Code Documentation
{
  "embedding": {
    "provider": "voyage",
    "model": "voyage-code-2",
    "apiKey": "..."
  }
}