Comprehensive tutorial series for OpenClaw AI agent gateway
Fine-tune search results with MMR diversity, temporal decay, and result filtering.
After hybrid search retrieves candidates, OpenClaw applies post-processing:
```
Hybrid Search Results (top 50)
          ↓
  Score Threshold (filter low scores)
          ↓
  Temporal Decay (boost recent content)
          ↓
  MMR Reranking (diversify results)
          ↓
  Final Results (top K)
```
MMR balances relevance with diversity — avoiding redundant results.
Without MMR:

```
Query: "deployment process"
Results:
1. "Deployment steps for staging"    (score: 0.95)
2. "Deployment steps for production" (score: 0.94) ← Similar to #1
3. "Deployment checklist v2"         (score: 0.93) ← Similar to #1 and #2
4. "Deployment troubleshooting"      (score: 0.91) ← Finally different!
```

With MMR:

```
Query: "deployment process"
Results:
1. "Deployment steps for staging" (score: 0.95)
2. "Deployment troubleshooting"   (score: 0.91) ← Diverse!
3. "CI/CD pipeline overview"      (score: 0.88) ← Different angle
4. "Deployment checklist v2"      (score: 0.93) ← Included, but ranked lower
```
```json5
// ~/.openclaw/openclaw.json
{
  agents: {
    defaults: {
      memorySearch: {
        enabled: true,
        provider: "gemini",
        postProcess: {
          mmr: {
            enabled: true,
            lambda: 0.7, // Balance: relevance vs. diversity
          },
        },
      },
    },
  },
}
```
Lambda (λ) controls the relevance-diversity tradeoff:
| Lambda | Behavior | Use Case |
|---|---|---|
| 1.0 | Pure relevance (no diversity) | When you only want the closest matches |
| 0.7 | Mostly relevant, some diversity | General use (default) |
| 0.5 | Balanced | Exploratory searches |
| 0.3 | Mostly diverse | Brainstorming, research |
| 0.0 | Maximum diversity | Survey all topics |
MMR formula:

```
score = λ * relevance(doc, query) - (1 - λ) * max_similarity(doc, selected_docs)
```
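As a minimal sketch of this greedy selection (illustrative Python, not OpenClaw's actual implementation; the relevance scores and similarity table are invented to mirror the deployment example):

```python
# Greedy MMR reranking sketch. At each step, pick the doc that maximizes
# lam * relevance - (1 - lam) * (max similarity to docs already picked).

def mmr_rerank(candidates, similarity, lam=0.7, top_k=3):
    pool = dict(candidates)  # doc_id -> relevance score
    selected = []
    while pool and len(selected) < top_k:
        def mmr_score(doc):
            penalty = max((similarity(doc, s) for s in selected), default=0.0)
            return lam * pool[doc] - (1 - lam) * penalty
        best = max(pool, key=mmr_score)
        selected.append(best)
        del pool[best]
    return selected

# Toy data: three near-duplicate deployment docs plus one distinct doc.
_sims = {
    frozenset({"staging", "production"}): 0.90,
    frozenset({"staging", "checklist"}): 0.85,
    frozenset({"production", "checklist"}): 0.90,
}

def similarity(a, b):
    return _sims.get(frozenset({a, b}), 0.2)

candidates = [
    ("staging", 0.95), ("production", 0.94),
    ("checklist", 0.93), ("troubleshooting", 0.91),
]
```

With `lam=0.7` the near-duplicate `production` doc is penalized and the distinct `troubleshooting` doc moves up to second place; with `lam=1.0` the penalty term vanishes and the ranking is pure relevance order.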
Research/exploration (more diversity):
```json5
{
  postProcess: {
    mmr: {
      enabled: true,
      lambda: 0.5,
    },
  },
}
```
Precise lookup (less diversity):
```json5
{
  postProcess: {
    mmr: {
      enabled: true,
      lambda: 0.85,
    },
  },
}
```
Boost recent content — because newer information is often more relevant.
```
Today's notes      → 1.0x multiplier (no decay)
Yesterday's notes  → 0.97x multiplier
Last week's notes  → 0.85x multiplier
Last month's notes → 0.5x multiplier
Last year's notes  → 0.1x multiplier (floor)
```
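The curve implied by these multipliers is exponential half-life decay with a floor. A sketch (illustrative, not OpenClaw's actual code):

```python
# Temporal decay sketch: the score multiplier halves every `half_life`
# days and is clamped at `floor`, so old content never disappears entirely.

def temporal_multiplier(age_days, half_life=30.0, floor=0.1):
    return max(floor, 0.5 ** (age_days / half_life))
```

With the defaults, a week-old note keeps about 0.85x of its score, a month-old note 0.5x, and anything older than roughly three months bottoms out at the 0.1x floor.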
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        postProcess: {
          temporal: {
            enabled: true,
            // Half-life in days (default: 30)
            // After this many days, score is halved
            halfLife: 30,
            // Minimum multiplier (default: 0.1)
            // Old content never drops below 10% of original score
            floor: 0.1,
          },
        },
      },
    },
  },
}
```
| Half-Life | 1 week ago | 1 month ago | 1 year ago | Use Case |
|---|---|---|---|---|
| 7 days | 0.5x | 0.05x | ~0x | Fast-moving projects |
| 30 days | 0.85x | 0.5x | ~0x | Normal work (default) |
| 90 days | 0.95x | 0.79x | 0.06x | Long-term reference |
| 365 days | 0.99x | 0.94x | 0.5x | Archival content |

Values shown are the raw decay multipliers; with the default `floor: 0.1`, anything below 0.1x is clamped up to 0.1x.
Sometimes you want historical content without decay:
```bash
# CLI override
openclaw memory search "original project requirements" --no-temporal
```
Or in agent configuration for a “historian” agent:
```json5
{
  agents: {
    historian: {
      memorySearch: {
        postProcess: {
          temporal: { enabled: false },
        },
      },
    },
  },
}
```
Filter out low-relevance results before they waste context:
```json5
{
  memorySearch: {
    postProcess: {
      threshold: {
        // Minimum score to include (0-1 scale)
        minScore: 0.3,
        // Or minimum relative to top result
        minRelative: 0.5, // Must be at least 50% of top score
      },
    },
  },
}
```
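The two conditions can be sketched as follows (illustrative Python; the exact semantics of `minScore` and `minRelative` are assumed from the comments above, this is not OpenClaw's actual code):

```python
# Score-threshold sketch: drop results below an absolute floor
# (min_score) or below a fraction of the top result's score (min_relative).

def apply_threshold(results, min_score=0.3, min_relative=0.5):
    """`results` is a list of (doc_id, score), sorted by score descending."""
    if not results:
        return []
    top_score = results[0][1]
    cutoff = max(min_score, min_relative * top_score)
    return [(doc, score) for doc, score in results if score >= cutoff]
```

For example, with a top score of 0.9 the relative cutoff is 0.45, so a 0.4-scoring result is dropped even though it clears the absolute 0.3 floor.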
Why threshold? Low-scoring matches still consume context-window tokens; filtering them out leaves more room for results the agent can actually use.
Control how many results to return:
```json5
{
  memorySearch: {
    // Maximum results returned to agent
    topK: 10,
    // Candidates to consider before post-processing
    // (should be larger than topK for MMR to work well)
    candidatePool: 50,
  },
}
```
Guidelines:

- `topK`: 5-15 for most use cases
- `candidatePool`: 3-5x topK for good MMR diversity

How documents are split affects search quality:
```json5
{
  memorySearch: {
    chunking: {
      // Target chunk size in tokens
      chunkSize: 512,
      // Overlap between chunks (helps context continuity)
      chunkOverlap: 50,
      // Respect document structure
      strategy: "semantic", // or "fixed", "sentence"
    },
  },
}
```
| Strategy | Behavior | Best For |
|---|---|---|
| `fixed` | Split at exact token count | Uniform content |
| `sentence` | Split at sentence boundaries | Prose, notes |
| `semantic` | Split at paragraph/section breaks | Structured docs |
| Content Type | Chunk Size (tokens) | Overlap (tokens) | Why |
|---|---|---|---|
| Quick notes | 256 | 25 | Small, self-contained |
| Daily logs | 512 | 50 | Medium entries (default) |
| Documentation | 1024 | 100 | Longer explanations |
| Code files | 256 | 50 | Functions/blocks |
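For the `fixed` strategy, the splitting logic can be sketched as follows (illustrative; whitespace-separated words stand in for tokens here, and real tokenizers differ):

```python
# Fixed-size chunking sketch: consecutive windows of `chunk_size` tokens,
# each new chunk re-including the last `chunk_overlap` tokens of the
# previous one, so text cut at a boundary keeps some surrounding context.

def chunk_fixed(tokens, chunk_size=512, chunk_overlap=50):
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

Each chunk's first `chunk_overlap` tokens repeat the tail of the previous chunk, which is the "context continuity" the `chunkOverlap` setting buys.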
Control which files are indexed:
```json5
{
  memorySearch: {
    // Include patterns
    include: ["*.md", "*.txt", "notes/**/*"],
    // Exclude patterns
    exclude: ["**/node_modules/**", "**/.git/**", "**/drafts/**"],
    // Additional paths outside workspace
    extraPaths: ["~/shared-notes", "/team/knowledge-base"],
  },
}
```
Different agents may need different search settings:
```json5
{
  agents: {
    // Default for all agents
    defaults: {
      memorySearch: {
        enabled: true,
        provider: "gemini",
        topK: 10,
        postProcess: {
          mmr: { enabled: true, lambda: 0.7 },
          temporal: { enabled: true, halfLife: 30 },
        },
      },
    },
    // Research agent: more diversity, less recency bias
    researcher: {
      memorySearch: {
        topK: 20,
        postProcess: {
          mmr: { enabled: true, lambda: 0.5 },
          temporal: { enabled: false },
        },
      },
    },
    // Quick assistant: fewer results, strong recency
    assistant: {
      memorySearch: {
        topK: 5,
        postProcess: {
          mmr: { enabled: true, lambda: 0.8 },
          temporal: { enabled: true, halfLife: 7 },
        },
      },
    },
  },
}
```
```bash
# Run test queries and examine results
openclaw memory search "project deadline" --verbose --explain
# Output shows:
# - Raw scores before post-processing
# - Temporal decay multipliers applied
# - MMR diversity penalties
# - Final reranked scores

# Compare MMR configurations
openclaw memory search "deployment" --mmr-lambda 0.5
openclaw memory search "deployment" --mmr-lambda 0.8

# Compare with/without temporal decay
openclaw memory search "requirements" --temporal
openclaw memory search "requirements" --no-temporal

# Check search statistics
openclaw memory stats
# Shows:
# - Average query latency
# - Results per query (mean, median)
# - Score distribution
# - Cache hit rate
```
```json5
// Full advanced search setup
{
  agents: {
    defaults: {
      memorySearch: {
        enabled: true,
        provider: "gemini",
        model: "gemini-embedding-001",

        // Hybrid search
        hybrid: {
          enabled: true,
          vectorWeight: 0.6,
          keywordWeight: 0.4,
        },

        // Result limits
        topK: 10,
        candidatePool: 50,

        // Chunking
        chunking: {
          chunkSize: 512,
          chunkOverlap: 50,
          strategy: "semantic",
        },

        // Post-processing pipeline
        postProcess: {
          // Score filtering
          threshold: {
            minScore: 0.3,
          },
          // Recency boost
          temporal: {
            enabled: true,
            halfLife: 30,
            floor: 0.1,
          },
          // Diversity
          mmr: {
            enabled: true,
            lambda: 0.7,
          },
        },

        // File filtering
        include: ["*.md", "*.txt"],
        exclude: ["**/drafts/**"],
      },
    },
  },
}
```
Enable or tune MMR:
```json5
{
  postProcess: {
    mmr: { enabled: true, lambda: 0.5 }, // Lower lambda = more diversity
  },
}
```
Temporal decay might be too aggressive:
```json5
{
  postProcess: {
    temporal: {
      halfLife: 90, // Longer half-life
      floor: 0.3,   // Higher minimum score
    },
  },
}
```
Raise the threshold:
```json5
{
  postProcess: {
    threshold: { minScore: 0.5 },
  },
}
```
Increase chunk overlap:
```json5
{
  chunking: {
    chunkOverlap: 100, // More overlap
  },
}
```
For even more powerful search capabilities, including reranking and advanced BM25, see the QMD Backend →