Clarify the two git-history depth knobs and how to keep indexing fast on repos with thousands of commits: - vex history --depth N — query-time walk, per file (caps latency) - vex index --history-depth N — one-time build, global newest-N (like git log -nN) Added to the setup example, the .vex.toml history comment, and the command reference in vex.md.
16 KiB
vex — Hybrid Structural + Semantic Code Search
Current release: v1.16.0 · Companion doc: code-search-vex-vs-ast-index.md (vex-vs-ast-index benchmark notes).
Split out of claude-code-tools.md — that file keeps a short blurb and links here.
Fast hybrid code search — combines tree-sitter AST parsing, FST symbol lookup, BM25 body search, a persistent call graph, optional ONNX semantic embeddings, and (v1.16) a query-time git-history walker. Ships as a CLI (vex search, vex usages, vex callers, vex bundle, vex history, vex pattern, etc.) plus an optional MCP server (vex-mcp) that exposes the same operations to Claude Code as 20+ structured tools. Useful as a peer or replacement for AST Index when you want semantic / concept-level search, call-graph navigation, and symbol-level git archaeology in addition to exact symbol lookup.
- Repository: https://github.com/tenatarika/vex
- Current release: v1.16.0 — verify with
vex self-update --check, upgrade in place withvex self-update(works on Windows/macOS/Linux). - Languages: 18+ via tree-sitter — Python, TypeScript/JavaScript, Go, Rust, Java, Kotlin, C#, Ruby, Swift, C++, PHP, Bash, Lua, CSS, HTML, YAML, TOML, SQL, Markdown.
Install
-
Linux / macOS:
brew tap tenatarika/tap && brew install vexThe release assets also ship prebuilt tarballs — as of v1.16.0, both
vexandvex-mcpforx86_64-unknown-linux-gnu,aarch64-apple-darwin, andx86_64-pc-windows-msvc. -
Windows (prebuilt, no Rust needed):
Recent releases (confirmed through v1.16.0) ship prebuilt Windows binaries —
vex-x86_64-pc-windows-msvc.tar.gzandvex-mcp-x86_64-pc-windows-msvc.tar.gz. Download both, dropvex.exe+vex-mcp.exeonto your PATH (e.g.%USERPROFILE%\.cargo\binor any PATH dir), then keep them current in place:vex self-update # --check previews the latest version; -y skips the promptvex self-updatereplaces the running binary in place on Windows, macOS, and Linux — no package manager required. -
Build from source (fallback — unreleased commits, or a CUDA-enabled GPU build; needs Rust ≥1.80):
git clone https://github.com/tenatarika/vex.git "$HOME/Documents/vex" cd "$HOME/Documents/vex" cargo build --release --workspace # add --features gpu-cuda for CUDA-accelerated embedding cp target/release/vex.exe target/release/vex-mcp.exe "$HOME/.cargo/bin/"~/.cargo/bin/(i.e.%USERPROFILE%\.cargo\bin) is already on PATH on any system withcargoinstalled, so both binaries become globally callable immediately.
GPU / CUDA acceleration
Semantic indexing (--semantic) is the only GPU-accelerated path — the GPU absorbs the heavier embedding cost on cold/large builds. Search queries themselves are CPU-bound and already fast. vex picks an ONNX Runtime execution provider at embed time:
| Provider | Where it comes from | Notes |
|---|---|---|
| CPU | always available | fine for minilm-l6-v2; impractically slow for jina-code |
| CUDA | source build only — cargo build --release --features gpu-cuda + NVIDIA CUDA toolkit |
the only GPU path that works with jina-code |
| DirectML | Windows prebuilt | incompatible with jina-code (see warning below) |
| CoreML | macOS prebuilt | Apple Silicon |
Setup (NVIDIA + CUDA, Windows or Linux):
-
Build vex with the CUDA feature — the prebuilt binaries do not include CUDA:
git clone https://github.com/tenatarika/vex.git && cd vex cargo build --release --features gpu-cuda cp target/release/vex.exe target/release/vex-mcp.exe "$HOME/.cargo/bin/" # drop .exe on LinuxRequires a matching CUDA toolkit and the ONNX Runtime CUDA execution provider available to the binary.
-
Verify the provider actually engages on this machine:
vex gpu # actively probes every compiled EP; --enable pins VEX_DEVICE for all projects vex status # should print: GPU: yes (CUDA) -
Index on the GPU:
vex index --semantic --embedder jina-code --device cuda
⚠️ jina-code is incompatible with DirectML (Windows). The code-specialized jina-code embedder has ops the DirectML EP cannot place — forcing --gpu / --device directml fails with DML EP can only be used with CPU EPs. So on Windows, jina-code runs on CUDA (source build) or CPU only — never on the prebuilt DirectML path. The default device = "auto" handles this gracefully (falls back to CPU) and is safe for auto_update incremental rebuilds.
Recommendation — jina-code + CUDA is strongly preferred, and CUDA is essential for it:
- Strongly recommended whenever you have (or can build) CUDA on an NVIDIA GPU:
embedder = "jina-code"+device = "cuda". The 768-dim, code-specialized model gives noticeably better code-search results than the default — this is the setup to aim for, and the GPU absorbs its heavier embed cost. - CUDA is essential for
jina-code, not a nice-to-have. On CPU the model is too slow to be practical — the cold embed of a real repo drags badly — and on Windows it can't use DirectML at all. So if a CUDA build is genuinely out of reach, do not forcejina-codeonto CPU; fall back to the defaultembedder = "minilm-l6-v2", which is CPU-fast and good enough for symbol + semantic search. - Rule of thumb: CUDA →
jina-code; no CUDA →minilm-l6-v2. Never pairjina-codewith CPU or DirectML.
Per-project setup
cd <your-project>
vex init # creates .vex.toml (add --agents-md to also emit an AGENTS.md for Cursor/Codex/Aider/Cline)
vex index --path . # plain index, sub-second, no downloads
# or, for semantic / concept-level search:
vex index --path . --semantic # downloads ~86 MB ONNX model on first run
# optional: bake a git-history sidecar so `vex history` is ~ms instead of shelling to git log:
vex index --path . --semantic --history
# thousands-of-commits repo? cap the one-time walk so the build stays fast (--history-depth implies --history):
vex index --path . --semantic --history-depth 500
Recommended .vex.toml for serious use:
semantic = true # enable meaning-based search (vex similar, vex duplicates, --semantic queries)
auto_update = true # incrementally refresh the index before each search when stale
# Embedder — CUDA is the deciding factor (see "GPU / CUDA acceleration" above):
# CUDA available → jina-code (STRONGLY recommended: code-specialized 768-dim, best results)
# no CUDA (CPU/DML) → minilm-l6-v2 (default, CPU-fast; jina-code is too slow on CPU and can't use DirectML)
embedder = "jina-code" # alternatives: minilm-l6-v2 (default), bge-base-en-v1.5, bge-large-en-v1.5, mxbai-large
device = "cuda" # essential for jina-code; "auto" (default) falls back to CPU safely. Run `vex gpu` to probe.
# Git history (v1.16): `vex history <Symbol>` walks git log at QUERY TIME with no
# index and no config needed — it works even on a repo vex hasn't indexed yet, so
# there is intentionally no .vex.toml key to enable it. To make it FST-fast (~ms)
# on long-lived repos, build the opt-in indexed sidecar with `vex index --history`
# (adds ~5-30s to a cold build and ~10% to index size); `vex status` then shows a
# "History:" line. On a repo with thousands of commits, cap the build with
# `vex index --history-depth 500` (global newest-N, like `git log -n500`) so a
# full-history walk doesn't blow up index time. Re-run after a `vex self-update`.
Register the MCP server
-
v1.15.0+ — one idempotent command:
vex mcp install --agent claude-code # also: cursor, or `--agent all` to fan out across every supported agent vex mcp list # show current entries across all supported agents vex mcp uninstall --agent claude-code # idempotent removalvex mcp installdefaults--binary-pathto thevex-mcpsibling of the runningvex(PATH fallback otherwise) and--project-root(→VEX_ROOT) to the cwd. Use--server-nameto register distinct entries per repo (vex-api/vex-client);--dry-runpreviews the write;--forceoverwrites a differing entry. Supported agents today:claude-code,cursor(withcodex-cli,windsurf,cline,continue,zedlanding). -
Manual equivalent (still works — e.g. for a custom binary path or non-default scope):
# Windows claude mcp add vex -s user -- "$HOME/.cargo/bin/vex-mcp.exe" # macOS / Linux claude mcp add vex -s user -- "$HOME/.cargo/bin/vex-mcp"vex-mcpdefaultsVEX_ROOTto the current working directory, so a single global registration works for every project — individual tool calls can also override per-call via aproject_rootargument.
Command reference
-
Key commands / MCP tools:
search(symbol + semantic, with scope/metadata filters — see below),show(full body),usages(--strictfor binder-resolved refs),callers,callees,implementations,outline,pattern(ast-grep–style structural match),similar(semantic, requires--semanticindex),duplicates,check(fast multi-symbol existence),grep(regex fallback, no index needed),update(incremental reindex),status,capabilities(machine-readable feature matrix). -
Call-graph & impact (call-graph index, ~ms queries):
callers/callees(direct),paths --from A --to B(all caller chains between two symbols),reachable --target T(everything that transitively reachesT),bundle --mode {symbol,pr-impact,project}(one call replaces theshow → callers → callees → similarround trip;pr-impactbundles changed symbols + transitive callers + reachable tests for review),diff --base <rev>(symbol-level git-rev diff). -
Git history (v1.16):
history <Symbol>walks git log and returns every historical version of a symbol — query-time, no index needed, works even on un-indexed repos. Flags:--diff(unified diffs between consecutive versions),--since/--until(date),--author,--kind,--branch,--depth <N>(max commits per file — bump down to cap query latency),--limit <N>(cap total results). Opt into an indexed sidecar withvex index --historyfor ~ms lookups.Taming history cost on large repos: the two depth knobs are different.
vex history --depth <N>limits the query-time walk per file.vex index --history-depth <N>caps the one-time index build at the global newest-N commits (mirrorsgit log -nN, implies--history); both are unbounded by default. On a repo with thousands of commits, build with e.g.vex index --history-depth 500so a full-history walk doesn't blow up index time and size. -
Search scope & metadata filters:
vex searchandvex usagesaccept--kind fn,method,--include/--exclude '<glob>'(repeatable),--since <rev>/--since-branched/--changed-only(restrict to git-changed files), plus onsearch:--visibility pub|priv,--async-only/--no-async,--static-only,--sealed-only,--context-path <file>(proximity boost), and--why(JSON trace of per-channel FST / BM25 / semantic hit counts — use when results look wrong). -
Maintenance / diagnostics:
self-update(in-place upgrade),gpu [device](probe which execution provider actually engages;--enablepinsVEX_DEVICE),watch(continuous incremental reindex),completions <shell>,init --agents-md(emit anAGENTS.md),eval(ranking-regression harness for CI).
Wire it into CLAUDE.md
Installing the binaries and building an index isn't enough — Claude won't reliably prefer vex over Grep/Read unless a CLAUDE.md tells it to.
Start from upstream, don't duplicate. The vex README maintains a canonical CLAUDE.md snippet — see README.md → "CLAUDE.md Integration". Paste it into your global ~/.claude/CLAUDE.md (applies to every project) or a project-local ./CLAUDE.md (overrides global per repo).
Then graft on these local additions — they go beyond what upstream covers and are worth keeping wherever you install vex:
-
MCP server line — add this near the top of the snippet so Claude knows the MCP route exists:
Equivalent MCP tools are also available via the
vexMCP server when registered (vex mcp install --agent claude-code, or manuallyclaude mcp add vex -s user) — prefer them inside a Claude Code session for cheaper token output; use the Bash CLI from subagents, hooks, and scripts (and for CLI-only features likepattern,paths,reachable,diff,history, and the--strict/--whyflags). -
Skip-vex rule — add to the Rules block:
Skip vex for non-code text — config files (YAML/JSON/TOML), docs, log strings, free-form prose — and for files in languages vex doesn't parse. Fall back to the next tool in the priority chain that's actually capable for the case (typically Grep/Glob for free-form text).
-
Subagent reminder — add to the Rules block:
Subagents (Plan/Explore/general-purpose) don't inherit project CLAUDE.md or the parent's MCP toolset — explicitly tell them in the prompt to use
vexvia Bash for code search. -
auto_updateaccuracy fix — upstream says "Runvex updateafter modifying source files ifauto_updateis not enabled." Withauto_update = trueit runs before each search, so during a Claude session the manual update is rarely needed. Reword as:Run
vex updatemanually after editing only ifauto_update = false; withauto_update = true, vex auto-refreshes before each search. -
--semanticcost note — append to thevex index --semanticline in the Indexing block:downloads ~86 MB ONNX model on first run, enables
vex search --semantic,vex similar,vex duplicates.
When to put it in global vs project CLAUDE.md:
- Global (
~/.claude/CLAUDE.md) — preferred default oncevex-mcpis registered at user scope. Vex becomes the recommendation for every project as soon as the index is built. - Project-local (
./CLAUDE.md) — for overrides (different fallback chain, monorepo-specific--filterpaths, language-specific--kinddefaults, or excluding vex on repos where it isn't set up).
Caveats (as of v1.16.0)
vex is young but maturing fast — several earlier quirks are now fixed: implementations detects generic-parameterized subclasses (class Foo(Base[T])) since v1.7.0; usages is AST-precise by default with binder-resolved --strict refs (Rust/TS/Python/C#/C++) since v1.8.0 — no longer the text-flavored matcher it once was; prebuilt Windows binaries removed the build-from-source requirement.
Still invisible to usages in both modes: decorator-based dispatch (@router.get), string-resolved targets (Celery task names, Uvicorn factory strings), reflection / getattr, and dynamic imports — before any rename or delete, backstop with vex grep '\bName\b'. Verify load-bearing results against ast-index or Grep.
vex vs AST Index — when each wins
Both tools cover similar ground, but they're not interchangeable. For a point-in-time head-to-head on a real mixed-language repo (Python / Kotlin / TypeScript / JavaScript) with measured latency, quality differences, and version-pinned findings, see code-search-vex-vs-ast-index.md. Headline: keep vex as primary, fall back to ast-index changed --base <branch> for code-review diffs (no vex equivalent) and to ast-index symbol/usages when vex's textual matches are too noisy.