A codebase intelligence engine for AI coding agents. Structural search, call graphs, impact analysis -- 87% fewer tokens.
AI coding agents -- Claude Code, Cursor, Copilot, aider -- depend on text search as their primary codebase navigation tool. This creates a compounding failure: grep returns raw lines, the agent reads files to understand context, repeats 50+ times per session. Measured waste ratios reach 60-80% of all tokens consumed (Nesler, 2026).
Hypergrep is a codebase intelligence engine that unifies indexed text search, a live code graph, semantic result compression, and predictive query prefetch into a single daemon. It answers the questions agents actually ask -- "who calls this function?", "what breaks if I change this?", "does this project use Redis?" -- in microseconds to milliseconds, returning structured results that fit within token budgets.
On ripgrep's own source code (208 files, 52K lines), Hypergrep achieves 4.4ms median warm search latency (7x faster than ripgrep for repeated queries), 87% token reduction in a realistic agent investigation task (20,580 tokens reduced to 2,814), and enables query types -- call graph traversal in 2.5 microseconds, bloom filter existence checks in 291 nanoseconds -- that no text search tool can answer at any speed.
This is not a faster grep. It is a different tool for a different interaction model.
AI agents waste most of their tokens on navigation, not on solving the actual problem.
The search-read-search loop: an agent greps for a pattern, gets 15 matching lines across 8 files (~500 tokens), reads each file for context (~8,000 tokens), reasons about relevance (~2,000 tokens), then acts on ~800 useful tokens. Total consumed: ~11,300 tokens. Useful: ~800. Waste ratio: 93%.
Nesler (2026) measured that 60-80% of tokens consumed by AI coding agents go toward figuring out where things are, not answering the actual question. A single question consumed ~12,000 tokens when the answer required ~800. The agent read 25 files to locate 3 functions.
The fundamental mismatch: agents think in tasks ("fix the auth bug") but grep answers "what lines contain this string?" Text search is a bad proxy for codebase understanding. Making grep faster does not fix this. A different tool is needed.
Six components in a unified index, queryable through one interface.
```mermaid
flowchart LR
    Q["Query"] --> TF["Trigram Filter"]
    TF --> C["Candidates"]
    C --> RV["Regex Verify"]
    RV --> M["Matches"]
    M --> TS["Tree-sitter Expand"]
    TS --> SR["Structural Results"]
    M --> SC["Semantic Compress"]
    SC --> LO["Layered Output"]
    GQ["Graph Query"] --> BFS["BFS Call Graph"]
    BFS --> IR["Impact Results"]
    BL["Bloom Query"] --> BF["Bloom Filter"]
    BF --> EX["O(1) Existence"]
```
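The text path in the diagram is the classic filter-then-verify trigram design (Cox, 2012): decompose the query into 3-character sequences, intersect posting lists to get candidate files, then verify candidates with a real match. A minimal Python sketch of the idea (illustrative only; Hypergrep itself is Rust, and the class and method names here are invented):

```python
from collections import defaultdict

def trigrams(s):
    """All overlapping 3-character substrings of s."""
    return {s[i:i+3] for i in range(len(s) - 2)}

class TrigramIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> set of file ids
        self.files = {}                   # file id -> content

    def add(self, file_id, content):
        self.files[file_id] = content
        for t in trigrams(content):
            self.postings[t].add(file_id)

    def candidates(self, literal):
        """Files containing every trigram of the query: a superset of
        the true matches, so a verify pass must filter false positives."""
        req = trigrams(literal)
        if not req:
            return set(self.files)        # query too short to filter
        return set.intersection(*(self.postings[t] for t in req))

    def search(self, literal):
        # verify step: confirm the literal actually occurs in the file
        return {f for f in self.candidates(literal) if literal in self.files[f]}
```

The intersection step is why common trigrams hurt naive designs (and why Blackbird weights sparse n-grams): a trigram present in every file filters nothing.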
Existing tools fall into isolated categories. No system unifies all six capabilities.
| System | Text search | Code graph | Structural | Predictive | Semantic compression | Agent-optimized |
|---|---|---|---|---|---|---|
| Google Code Search | Yes (trigram) | No | No | No | No | No |
| livegrep | Yes (suffix array) | No | No | No | No | No |
| Zoekt | Yes (positional trigram) | No | No | No | No | No |
| GitHub Blackbird | Yes (sparse n-gram) | No | No | No | No | No |
| Cursor | Yes (client n-gram) | No | No | No | No | Yes |
| ast-grep | Partial (no index) | No | Yes | No | No | No |
| Axon | No | Yes | Yes | No | No | Partial |
| codebase-memory-mcp | No | Yes | Yes | No | No | Partial |
| Hypergrep | Yes | Yes | Yes | Yes | Yes | Yes |
Six capabilities that no other shipping tool combines.
One daemon maintains both a trigram text index and a live call/type/import graph. One index build, one filesystem watcher, one staleness model. Cross-cutting queries combine text search and graph traversal.
AST parsing runs only on files that match the text query, not the entire codebase. For a query matching 5 of 208 files, this skips 97% of parsing work. Structural search adds ~1ms overhead, not seconds.
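The lazy-expansion idea in sketch form: run the cheap text filter over every file, and pay the AST cost only for the hits. Here `text_match` and `parse_ast` are hypothetical stand-ins for the trigram search and the tree-sitter pass:

```python
def lazy_structural_search(all_files, text_match, parse_ast):
    """Parse ASTs only for files that pass the cheap text filter.

    all_files:  dict of path -> content
    text_match: cheap predicate (stand-in for trigram search + verify)
    parse_ast:  expensive parse (stand-in for the tree-sitter pass)
    Returns (parsed results, number of files whose parse was skipped).
    """
    results = {}
    for path, content in all_files.items():
        if text_match(content):           # cheap per-file check
            results[path] = parse_ast(content)  # expensive, hits only
    skipped = len(all_files) - len(results)
    return results, skipped
```

For a query matching 5 of 208 files, 203 parses are skipped, which is where the "97% of parsing work avoided" figure comes from.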
Three layers of detail (L0: 15 tokens, L1: 80-120 tokens, L2: 200-800 tokens). Budget fitting selects top results within a token limit. Agents get maximum information density.
A compressed structural summary (~699 tokens) of the entire codebase: directory layout, key abstractions, entry points, hot spots. Loaded once at session start, eliminates 80% of exploratory searches.
"Does this codebase use Redis?" answered in 291 nanoseconds via bloom filter over concepts extracted from manifests and source. Zero false negatives guaranteed.
While the LLM generates its response (500ms-5s), the daemon speculatively executes the 3-5 most likely next queries. Rule-based predictor: function search predicts callers (~70% accuracy).
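The prediction step is essentially a lookup from query shape to likely follow-up queries. The rules below are illustrative guesses to show the shape of such a predictor, not Hypergrep's actual rule set:

```python
def predict_next_queries(query):
    """Rule-based prefetch: map the query an agent just ran to the
    follow-ups it is most likely to ask next (hypothetical rules)."""
    rules = []
    if query.startswith(("fn ", "def ")):
        # Function search -> agent usually asks about callers next
        name = query.split()[1]
        rules += [f"--callers {name}", f"--callees {name}", f"--impact {name}"]
    elif query.startswith(("struct ", "class ")):
        # Type search -> agent usually looks for implementations
        name = query.split()[1]
        rules += [f"impl {name}", f"--layer 1 {name}"]
    else:
        rules.append(f"--layer 1 {query}")
    return rules[:5]  # speculatively execute at most 5 queries
```

Because the speculative queries run during the LLM's own generation latency, wrong predictions cost daemon CPU but never add wall-clock time for the agent.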
All numbers from real runs on ripgrep's source code (208 files, 52,266 lines). Nothing projected.
| Query | Matches | Cold (CLI) | Warm (daemon) |
|---|---|---|---|
| `fn search` | 22 | ~100ms | 4.5ms |
| `impl.*Matcher` | 43 | ~100ms | 4.5ms |
| `struct Config` | 9 | ~100ms | 3.0ms |
| `use std` | 141 | ~100ms | 6.9ms |
| `TODO` | 6 | ~100ms | 0.5ms |
| `Searcher` | 345 | ~100ms | 3.7ms |
| `fn.*new` | 106 | ~100ms | 7.5ms |
| `print` | 1,044 | ~100ms | 6.1ms |
| `unsafe` | 7 | ~100ms | 0.4ms |
| `Result<` | 542 | ~100ms | 4.9ms |
Task: agent needs to understand ripgrep's Matcher architecture.
rg "Matcher" -- 376 linesrg "impl.*Matcher" -- refine--model -- codebase map--layer 1 --budget 1000 "Matcher"--impact Matcher -- blast radiusProgressive disclosure -- agents start at L0 or L1 and drill down only as needed.
| Layer | Content | Tokens/result | Use case |
|---|---|---|---|
| `--layer 0` | File path + symbol name + kind | ~15 | "Which files are relevant?" |
| `--layer 1` | Signature + calls + called_by | ~80-120 | "What does this do?" |
| `--layer 2` | Full source code of enclosing function | ~200-800 | "I need to modify this" |
Example: `hypergrep --layer 1 --budget 600 --json "fn search"`

```json
[
  {
    "file": "crates/searcher/src/searcher/mod.rs",
    "name": "Searcher",
    "kind": "impl",
    "line_range": [627, 828],
    "signature": "impl Searcher {",
    "tokens": 32
  },
  {
    "file": "crates/core/main.rs",
    "name": "search",
    "kind": "function",
    "line_range": [107, 151],
    "signature": "fn search(args: &HiArgs, mode: SearchMode) -> anyhow::Result<bool>",
    "calls": ["search_path", "searcher", "printer", "walk_builder", "matcher"],
    "called_by": ["search_parallel", "run", "try_main"],
    "tokens": 189
  }
]
```
~350 tokens. Without Hypergrep, getting this understanding requires reading main.rs (~2,000 tokens) and search.rs (~3,000 tokens). Budget fitting selects the top-ranked results that fit within the token limit using greedy selection.
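The greedy selection can be sketched in a few lines. The `(tokens, payload)` pair shape here is an assumption mirroring the per-result `tokens` field in the JSON output:

```python
def fit_budget(ranked_results, budget):
    """Greedy budget fitting: walk results in rank order, keeping each
    one whose token cost still fits under the budget.

    ranked_results: list of (tokens, payload), best-ranked first
    Returns (chosen payloads, tokens spent).
    """
    chosen, spent = [], 0
    for tokens, payload in ranked_results:
        if spent + tokens <= budget:
            chosen.append(payload)
            spent += tokens
    return chosen, spent
```

Greedy selection is not optimal packing (a knapsack solver could fit slightly more), but it preserves rank order, which matters more to an agent than squeezing out the last few tokens.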
BFS upstream through the call graph with severity classification.
```console
$ hypergrep --impact "hash_password" src/
Impact analysis for 'hash_password' (depth 3):
  [depth 1] WILL BREAK  src/auth.rs:authenticate
  [depth 2] MAY BREAK   src/api.rs:login_handler
  [depth 3] REVIEW      src/main.rs:setup_routes
```
```mermaid
flowchart LR
    HP["hash_password"] --> A["authenticate"]
    A --> LH["login_handler"]
    LH --> R["router"]
    style HP fill:#2d2a24,color:#f5f0eb
    style A fill:#b91c1c,color:#fff
    style LH fill:#e8772e,color:#fff
    style R fill:#a09a90,color:#fff
```
| Severity | Depth | Meaning |
|---|---|---|
| WILL BREAK | 1 | Direct callers -- signature or behavior change breaks these |
| MAY BREAK | 2 | Callers of callers -- may need adaptation |
| REVIEW | 3+ | Transitive dependents -- review for side effects |
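The traversal behind `--impact` is a plain BFS over the reverse call graph (callee to callers), with severity assigned by depth. An illustrative Python sketch; the graph representation is assumed, and Hypergrep's internals may differ:

```python
from collections import deque

SEVERITY = {1: "WILL BREAK", 2: "MAY BREAK"}  # depth 3+ -> "REVIEW"

def impact(reverse_calls, target, max_depth=3):
    """BFS upstream from `target` through a reverse call graph.

    reverse_calls: dict mapping a function to the list of its callers
    Returns [(depth, severity, function), ...] in visit order.
    """
    seen = {target}
    queue = deque([(target, 0)])
    out = []
    while queue:
        fn, depth = queue.popleft()
        if depth == max_depth:
            continue  # stop expanding beyond the depth limit
        for caller in reverse_calls.get(fn, []):
            if caller not in seen:
                seen.add(caller)
                out.append((depth + 1, SEVERITY.get(depth + 1, "REVIEW"), caller))
                queue.append((caller, depth + 1))
    return out
```

The `seen` set keeps recursive or mutually recursive call cycles from looping forever, and BFS order guarantees each caller is reported at its minimum distance from the change.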
Structural search and call graph extraction for eight supported programming languages; tree-sitter parsing for config/markup formats.
Unsupported languages fall back to line-level text search (same as ripgrep).
For agent sessions with 50+ queries. Keeps the index in memory for single-digit-millisecond warm searches.
```sh
# Start in background (auto-stops after 30 min idle)
hypergrep-daemon --background /path/to/project

# Check status
hypergrep-daemon --status /path/to/project
# Running
# PID: 18067
# Socket: /tmp/hypergrep-f983e88f.sock
# Memory: 8.5 MB

# Stop manually
hypergrep-daemon --stop /path/to/project
```
| Property | Value |
|---|---|
| CPU usage (idle) | 0% |
| Memory (208 files) | 8.5 MB |
| Auto-stop | 30 min idle (configurable: --idle-timeout) |
| Memory limit | 500 MB hard cap |
| Socket permissions | Owner-only (0600) |
| PID file | Prevents duplicate daemons per project |
| Scenario | Use |
|---|---|
| Quick one-off search | `hypergrep "pattern" src/` (CLI) |
| AI agent session (50+ queries) | `hypergrep-daemon --background src/` |
| CI/CD pipeline | `hypergrep "pattern" src/` (CLI, no daemon) |
| Long coding session | `hypergrep-daemon --background --idle-timeout 3600 src/` |
```sh
curl -sSfL https://github.com/marjoballabani/hypergrep/releases/latest/download/hypergrep-installer.sh | sh
```

Or from source:

```sh
git clone https://github.com/marjoballabani/hypergrep.git
cd hypergrep && ./install.sh
```

Requires Rust 1.75+ and a C compiler (for tree-sitter grammars). To build manually:

```sh
cargo build --release
cp target/release/hypergrep ~/.cargo/bin/
```
| Command | Description | Example |
|---|---|---|
hypergrep "pattern" dir | Text search (ripgrep-compatible) | hypergrep "authenticate" src/ |
-s | Structural search (full function bodies) | hypergrep -s "authenticate" src/ |
-c | Count matches only | hypergrep -c "TODO" src/ |
-l | File names only | hypergrep -l "redis" src/ |
--layer N | Semantic compression (0, 1, or 2) | hypergrep --layer 1 "search" src/ |
--budget N | Token budget (best results in N tokens) | hypergrep --layer 1 --budget 500 "auth" src/ |
--json | JSON output for agents | hypergrep --layer 1 --json "search" src/ |
--callers | Reverse call graph | hypergrep --callers "authenticate" src/ |
--callees | Forward call graph | hypergrep --callees "authenticate" src/ |
--impact | Blast radius (what breaks?) | hypergrep --impact "hash_password" src/ |
--exists | Bloom filter existence check | hypergrep --exists "redis" src/ |
--model | Codebase mental model (~699 tokens) | hypergrep --model "" src/ |
--stats | Index statistics | hypergrep --stats "" src/ |
| Issue | Detail |
|---|---|
| Cold start slower than ripgrep | Text-only: 100ms vs ripgrep's 31ms. Structural: 1,250ms. The index pays for itself after ~40 queries. Use daemon mode for agent workloads. |
| Call graph is static analysis only | Dynamic dispatch, reflection, callbacks, and macros are not resolved. Impact results may be incomplete. |
| Bloom filter ~2% false positives | "YES" means "probably" -- confirm with a real search. "NO" is always correct (zero false negatives). |
| Large codebases (>10K files) | Need daemon mode. CLI cold start is too slow. |
| Memory usage | ~17 MB for text index, ~54 MB with full structural pass (208 files). Scales linearly. |
| 8 languages with full call graph | Other languages fall back to text search. No structural queries for unsupported grammars. |
Full theoretical foundations, prior art analysis, and quantitative projections: RESEARCH.md
| Reference | Contribution |
|---|---|
| Cox, R. (2012) | Trigram indexing for regex search. Decompose regex into required 3-char sequences, intersect posting lists. 196x speedup on the Linux kernel. |
| GitHub Blackbird (2023) | Sparse n-grams with inverse-frequency weighting. Eliminates the common-trigram problem at scale (45M repos, 115 TB). |
| Elhage, N. (2015) | Suffix arrays for regex search (livegrep). Substring matching via binary search over sorted suffixes. |
| Cursor (2025) | Client-side agent indexing. First system to frame code search indexing as an agent optimization problem. |
| Nesler, J. (2026) | Measured 60-80% of AI coding agent tokens wasted on navigation. 12,000 tokens consumed for an 800-token answer. |