From 4c3b0188d896509df0557dbc0756daa7d606bdac Mon Sep 17 00:00:00 2001 From: "alexei.dolgolyov" Date: Thu, 11 Jun 2026 01:11:26 +0300 Subject: [PATCH] docs(vex): split vex into its own reference, refresh for v1.16.0 - Add vex.md: install (prebuilt binaries + self-update), GPU/CUDA setup, jina-code+CUDA recommendation (CUDA essential, too slow on CPU), vex mcp install, full command set (bundle/paths/reachable/diff/history, search scope+metadata filters), CLAUDE.md integration, caveats - Shrink claude-code-tools.md section vex to a blurb + links - Note v1.16.0 capabilities in the vex-vs-ast-index benchmark (not re-benchmarked) - README: bump date, index vex.md, refresh vex descriptions --- README.md | 10 +- claude-code-tools.md | 96 +--------------- code-search-vex-vs-ast-index.md | 19 +++ vex.md | 197 ++++++++++++++++++++++++++++++++ 4 files changed, 227 insertions(+), 95 deletions(-) create mode 100644 vex.md diff --git a/README.md b/README.md index 73d3077..a38427d 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # Claude Code Facts -> Last updated: 2026-06-01 +> Last updated: 2026-06-11 A collection of interesting and useful facts, tips, tools, and guides for working with Claude and Claude Code. @@ -8,7 +8,7 @@ A collection of interesting and useful facts, tips, tools, and guides for workin ### [Claude Code Tools & Extensions](claude-code-tools.md) -MCP servers, skills, and plugins that extend Claude Code — including Context7 (library docs), AST Index, and vex (fast code search with semantic embeddings). +MCP servers, skills, and plugins that extend Claude Code — including Context7 (library docs), AST Index, and vex (hybrid code search; summarized here with a link to its full reference). ### [Gitea Release Workflow (Minimal)](gitea-release-workflow.md) @@ -26,6 +26,10 @@ Review of code signing options for Windows executables — Azure Trusted Signing How to configure parallel job capacity for Gitea Act Runner on TrueNAS. Covers the `CONFIG_FILE` env var requirement, custom `config.yaml` mount, and capacity tuning. +### [vex — Hybrid Code Search Reference](vex.md) + +Full reference for the `vex` code-search tool (current release **v1.16.0**) — install (prebuilt Windows/macOS/Linux binaries + `self-update`, or build-from-source), **GPU / CUDA acceleration** with the `jina-code` embedder recommendation, MCP-server registration (`vex mcp install`), the recommended `.vex.toml`, the complete command set (call graph, git history, scope/metadata search filters), and CLAUDE.md integration. + ### [Code Search: vex vs ast-index — Benchmark Notes](code-search-vex-vs-ast-index.md) -Point-in-time benchmark of `vex` and `ast-index` on a real mixed-language repo — indexing footprint, query latency, precision findings, and when to fall back from one to the other. Refreshed for `vex 1.9.1` / `ast-index 3.41.0`, with a summary of what flipped since the previous snapshot. +Point-in-time benchmark of `vex` and `ast-index` on a real mixed-language repo — indexing footprint, query latency, precision findings, and when to fall back from one to the other. Latency/quality tables pinned to `vex 1.9.1` / `ast-index 3.41.0`, with a summary of what flipped since the previous snapshot and a vex 1.16.0 capability-update note (see [vex.md](vex.md) for the current command set). diff --git a/claude-code-tools.md b/claude-code-tools.md index 8a9ca16..1fbebef 100644 --- a/claude-code-tools.md +++ b/claude-code-tools.md @@ -70,99 +70,11 @@ Fast code search skill for Claude Code — find classes, symbols, usages, implem ### vex -Fast hybrid code search — combines tree-sitter AST parsing, FST symbol lookup, BM25 body search, and optional ONNX semantic embeddings. Ships as a CLI (`vex search`, `vex usages`, `vex callers`, `vex similar`, `vex duplicates`, `vex pattern`, etc.) plus an optional MCP server (`vex-mcp`) that exposes the same operations to Claude Code as ~17 structured tools. Useful as a peer or replacement for AST Index when you want semantic / concept-level search in addition to exact symbol lookup. +Fast hybrid code search — tree-sitter AST parsing, FST symbol lookup, BM25 body search, a persistent call graph, optional ONNX semantic embeddings, and (v1.16) a query-time git-history walker. Ships as a CLI plus an optional `vex-mcp` MCP server exposing 20+ structured tools to Claude Code. A strong peer or replacement for AST Index when you want semantic / concept-level search, call-graph navigation, and symbol-level git archaeology alongside exact symbol lookup. -- **Repository:** -- **Languages:** 18+ via tree-sitter — Python, TypeScript/JavaScript, Go, Rust, Java, Kotlin, C#, Ruby, Swift, C++, PHP, Bash, Lua, CSS, HTML, YAML, TOML, SQL, Markdown. - -- **Install (Linux/macOS):** - - ```bash - brew tap tenatarika/tap && brew install vex - ``` - - Latest release also ships prebuilt tarballs for `linux-x86_64` and `darwin-aarch64`. - -- **Install (Windows — build from source, ~3 min):** - - As of v1.5.0 there is **no prebuilt Windows binary** in the release assets, so a Rust toolchain (≥1.80) is required: - - ```bash - git clone https://github.com/tenatarika/vex.git "$HOME/Documents/vex" - cd "$HOME/Documents/vex" - cargo build --release --workspace - cp target/release/vex.exe target/release/vex-mcp.exe "$HOME/.cargo/bin/" - ``` - - `~/.cargo/bin/` (i.e. `%USERPROFILE%\.cargo\bin`) is already on PATH on any system with `cargo` installed, so both binaries become globally callable immediately. - -- **Per-project bootstrap:** - - ```bash - cd - vex init # creates .vex.toml - vex index --path . # plain index, sub-second, no downloads - # or, for semantic / concept-level search: - vex index --path . --semantic # downloads ~86 MB ONNX model on first run - ``` - - Recommended `.vex.toml` for serious use: - - ```toml - semantic = true # enable meaning-based search (vex similar, vex duplicates, --semantic queries) - auto_update = true # incrementally refresh the index before each search - ``` - -- **Register `vex-mcp` with Claude Code (user scope — available across all projects):** - - ```bash - # Windows - claude mcp add vex -s user -- "$HOME/.cargo/bin/vex-mcp.exe" - - # macOS / Linux - claude mcp add vex -s user -- "$HOME/.cargo/bin/vex-mcp" - ``` - - `vex-mcp` defaults `VEX_ROOT` to the current working directory, so a single global registration works for every project — individual tool calls can also override per-call via a `project_root` argument. - -- **Key commands / MCP tools:** `search` (symbol + semantic), `show` (full body), `usages`, `callers`, `callees`, `implementations`, `outline`, `pattern` (ast-grep–style structural match), `similar` (semantic, requires `--semantic` index), `duplicates`, `status`, `grep` (regex fallback, no index needed), `update` (incremental reindex). - -- **Wire it into `CLAUDE.md` (recommended):** - - Installing the binaries and building an index isn't enough — Claude won't reliably prefer `vex` over `Grep`/`Read` unless a `CLAUDE.md` tells it to. - - **Start from upstream**, don't duplicate. The vex README maintains a canonical `CLAUDE.md` snippet — see [`README.md` → "CLAUDE.md Integration"](https://github.com/tenatarika/vex#claudemd-integration). Paste it into your **global** `~/.claude/CLAUDE.md` (applies to every project) or a **project-local** `./CLAUDE.md` (overrides global per repo). - - **Then graft on these local additions** — they go beyond what upstream covers and are worth keeping wherever you install vex: - - 1. **MCP server line** — add this near the top of the snippet so Claude knows the MCP route exists: - - > Equivalent **MCP tools** are also available via the `vex` MCP server when registered (`claude mcp add vex -s user`) — prefer them inside a Claude Code session for cheaper token output; use the Bash CLI from subagents, hooks, and scripts. - - 2. **Skip-vex rule** — add to the Rules block: - - > **Skip vex** for non-code text — config files (YAML/JSON/TOML), docs, log strings, free-form prose — and for files in languages vex doesn't parse. Fall back to the next tool in the priority chain that's actually capable for the case (typically Grep/Glob for free-form text). - - 3. **Subagent reminder** — add to the Rules block: - - > **Subagents** (Plan/Explore/general-purpose) don't inherit project CLAUDE.md or the parent's MCP toolset — explicitly tell them in the prompt to use `vex` via Bash for code search. - - 4. **`auto_update` accuracy fix** — upstream says "Run `vex update` after modifying source files if `auto_update` is not enabled." With `auto_update = true` it runs before *each search*, so during a Claude session the manual update is rarely needed. Reword as: - - > **Run `vex update` manually after editing** only if `auto_update = false`; with `auto_update = true`, vex auto-refreshes before each search. - - 5. **`--semantic` cost note** — append to the `vex index --semantic` line in the Indexing block: - - > downloads ~86 MB ONNX model on first run, enables `vex search --semantic`, `vex similar`, `vex duplicates`. - - **When to put it in global vs project CLAUDE.md:** - - - **Global (`~/.claude/CLAUDE.md`)** — preferred default once `vex-mcp` is registered at user scope. Vex becomes the recommendation for every project as soon as the index is built. - - **Project-local (`./CLAUDE.md`)** — for overrides (different fallback chain, monorepo-specific `--filter` paths, language-specific `--kind` defaults, or excluding vex on repos where it isn't set up). - -- **Caveats (as of v1.5.0):** vex is a young tool under active development. Known quirks: `implementations` doesn't detect generic-parameterized subclasses (`class Foo(Base[T])`); `usages` is text-flavored and matches inside comments/docstrings; first-time Windows install requires building from source. Verify load-bearing results against `ast-index` or `Grep`. The benchmark notes below capture more specific findings on a real repo. - -- **vex vs AST Index — when each wins:** Both tools cover similar ground, but they're not interchangeable. For a point-in-time head-to-head on a real mixed-language repo (Python / Kotlin / TypeScript / JavaScript) with measured latency, quality differences, and version-pinned findings, see [`code-search-vex-vs-ast-index.md`](code-search-vex-vs-ast-index.md). Headline: keep `vex` as primary, fall back to `ast-index changed --base ` for code-review diffs (no vex equivalent) and to `ast-index symbol`/`usages` when vex's textual matches are too noisy. +- **Repository:** · **Current release:** v1.16.0 (`vex self-update` to upgrade in place). +- **Full reference → [vex.md](vex.md)** — install (prebuilt Windows/macOS/Linux + `self-update`, or build-from-source), **GPU / CUDA acceleration** (strongly prefer `jina-code` + CUDA; it's too slow on CPU), `vex mcp install`, the recommended `.vex.toml`, the full command set (`bundle`, `paths`, `reachable`, `diff`, `history`, search scope/metadata filters), CLAUDE.md integration, and caveats. +- **Benchmark vs AST Index → [code-search-vex-vs-ast-index.md](code-search-vex-vs-ast-index.md)** — point-in-time head-to-head. Headline: keep `vex` primary; fall back to `ast-index changed --base ` for code-review diffs and to `ast-index symbol` / `usages` when you want textual matches. ### Packaged Skills diff --git a/code-search-vex-vs-ast-index.md b/code-search-vex-vs-ast-index.md index 94e1141..e6c0f53 100644 --- a/code-search-vex-vs-ast-index.md +++ b/code-search-vex-vs-ast-index.md @@ -11,6 +11,24 @@ > this revision. See ["Changes since the 2026-05-18 snapshot"](#changes-since-the-2026-05-18-snapshot) > at the bottom for a summary. +> **Capability update — vex 1.16.0 (capabilities only, NOT re-benchmarked):** the +> latency and quality tables below remain pinned to **vex 1.9.1**, but vex's feature +> surface has moved on. New since 1.9.1 (no fresh measurements taken here): +> +> - **`vex history `** (v1.16) — query-time git-log walker returning every +> historical version of a symbol (`--diff`, `--since`/`--until`, `--author`, `--kind`); +> opt-in indexed sidecar via `vex index --history`. No ast-index equivalent. +> - **`vex mcp install` / `uninstall` / `list`** (v1.15.0) — idempotent MCP-server +> registration for Claude Code / Cursor (replaces hand-editing the agent config). +> - **Scope & metadata search filters** on `vex search` / `vex usages` — `--kind`, +> `--include`/`--exclude`, `--since`/`--since-branched`/`--changed-only`, `--visibility`, +> `--async-only`, `--static-only`, `--sealed-only`, `--why`. +> - **`vex gpu`** execution-provider probe; **`vex watch`** continuous reindex; +> **`vex init --agents-md`**; prebuilt Windows `vex` + `vex-mcp` binaries. +> +> See **[vex.md](vex.md)** for the full current reference (install, GPU/CUDA, config, +> command set). Re-run the tables below on 1.16.0 before citing the numbers. + ## Test environment | Aspect | Value | @@ -94,6 +112,7 @@ Three real queries from the test repo (re-run on 2026-05-26): - `vex bundle --mode {symbol,pr-impact,project}` — one call replaces the `show → callers → callees → similar` round trip; pr-impact mode bundles changed symbols + transitive callers + reachable tests for code review. - `vex eval` — built-in ranking eval harness for CI regression. - `vex capabilities` — machine-readable feature matrix. + - **Since v1.10–v1.16 (see the capability-update note at the top + [vex.md](vex.md)):** `vex history` (symbol-level git archaeology), `vex mcp install`/`uninstall`/`list` (idempotent MCP registration), `vex gpu` (EP probe), `vex watch` (continuous reindex), and scope/metadata search filters (`--kind`, `--include`/`--exclude`, `--since`/`--changed-only`, `--visibility`, `--async-only`, `--why`). ## Practical recommendation diff --git a/vex.md b/vex.md new file mode 100644 index 0000000..8468d3c --- /dev/null +++ b/vex.md @@ -0,0 +1,197 @@ +# vex — Hybrid Structural + Semantic Code Search + +> **Current release:** v1.16.0 · Companion doc: [code-search-vex-vs-ast-index.md](code-search-vex-vs-ast-index.md) (vex-vs-ast-index benchmark notes). +> +> Split out of [claude-code-tools.md](claude-code-tools.md) — that file keeps a short blurb and links here. + +Fast hybrid code search — combines tree-sitter AST parsing, FST symbol lookup, BM25 body search, a persistent call graph, optional ONNX semantic embeddings, and (v1.16) a query-time git-history walker. Ships as a CLI (`vex search`, `vex usages`, `vex callers`, `vex bundle`, `vex history`, `vex pattern`, etc.) plus an optional MCP server (`vex-mcp`) that exposes the same operations to Claude Code as 20+ structured tools. Useful as a peer or replacement for AST Index when you want semantic / concept-level search, call-graph navigation, and symbol-level git archaeology in addition to exact symbol lookup. + +- **Repository:** +- **Current release:** v1.16.0 — verify with `vex self-update --check`, upgrade in place with `vex self-update` (works on Windows/macOS/Linux). +- **Languages:** 18+ via tree-sitter — Python, TypeScript/JavaScript, Go, Rust, Java, Kotlin, C#, Ruby, Swift, C++, PHP, Bash, Lua, CSS, HTML, YAML, TOML, SQL, Markdown. + +## Install + +- **Linux / macOS:** + + ```bash + brew tap tenatarika/tap && brew install vex + ``` + + The [release assets](https://github.com/tenatarika/vex/releases/latest) also ship prebuilt tarballs — as of v1.16.0, both `vex` and `vex-mcp` for `x86_64-unknown-linux-gnu`, `aarch64-apple-darwin`, and `x86_64-pc-windows-msvc`. + +- **Windows (prebuilt, no Rust needed):** + + Recent releases (confirmed through v1.16.0) ship **prebuilt Windows binaries** — `vex-x86_64-pc-windows-msvc.tar.gz` and `vex-mcp-x86_64-pc-windows-msvc.tar.gz`. Download both, drop `vex.exe` + `vex-mcp.exe` onto your PATH (e.g. `%USERPROFILE%\.cargo\bin` or any PATH dir), then keep them current in place: + + ```bash + vex self-update # --check previews the latest version; -y skips the prompt + ``` + + `vex self-update` replaces the running binary in place on Windows, macOS, and Linux — no package manager required. + +- **Build from source** (fallback — unreleased commits, or a **CUDA-enabled GPU build**; needs Rust ≥1.80): + + ```bash + git clone https://github.com/tenatarika/vex.git "$HOME/Documents/vex" + cd "$HOME/Documents/vex" + cargo build --release --workspace # add --features gpu-cuda for CUDA-accelerated embedding + cp target/release/vex.exe target/release/vex-mcp.exe "$HOME/.cargo/bin/" + ``` + + `~/.cargo/bin/` (i.e. `%USERPROFILE%\.cargo\bin`) is already on PATH on any system with `cargo` installed, so both binaries become globally callable immediately. + +## GPU / CUDA acceleration + +Semantic indexing (`--semantic`) is the only GPU-accelerated path — the GPU absorbs the heavier embedding cost on cold/large builds. Search queries themselves are CPU-bound and already fast. vex picks an ONNX Runtime **execution provider** at embed time: + +| Provider | Where it comes from | Notes | +|---|---|---| +| **CPU** | always available | fine for `minilm-l6-v2`; **impractically slow for `jina-code`** | +| **CUDA** | **source build only** — `cargo build --release --features gpu-cuda` + NVIDIA CUDA toolkit | the only GPU path that works with `jina-code` | +| **DirectML** | Windows prebuilt | **incompatible with `jina-code`** (see warning below) | +| **CoreML** | macOS prebuilt | Apple Silicon | + +**Setup (NVIDIA + CUDA, Windows or Linux):** + +1. Build vex with the CUDA feature — **the prebuilt binaries do not include CUDA**: + + ```bash + git clone https://github.com/tenatarika/vex.git && cd vex + cargo build --release --features gpu-cuda + cp target/release/vex.exe target/release/vex-mcp.exe "$HOME/.cargo/bin/" # drop .exe on Linux + ``` + + Requires a matching CUDA toolkit and the ONNX Runtime CUDA execution provider available to the binary. + +2. Verify the provider actually engages on this machine: + + ```bash + vex gpu # actively probes every compiled EP; --enable pins VEX_DEVICE for all projects + vex status # should print: GPU: yes (CUDA) + ``` + +3. Index on the GPU: + + ```bash + vex index --semantic --embedder jina-code --device cuda + ``` + +**⚠️ `jina-code` is incompatible with DirectML (Windows).** The code-specialized `jina-code` embedder has ops the DirectML EP cannot place — forcing `--gpu` / `--device directml` fails with `DML EP can only be used with CPU EPs`. So on Windows, `jina-code` runs on **CUDA (source build)** or **CPU only** — never on the prebuilt DirectML path. The default `device = "auto"` handles this gracefully (falls back to CPU) and is safe for `auto_update` incremental rebuilds. + +**Recommendation — `jina-code` + CUDA is strongly preferred, and CUDA is essential for it:** + +- **Strongly recommended whenever you have (or can build) CUDA on an NVIDIA GPU: `embedder = "jina-code"` + `device = "cuda"`.** The 768-dim, code-specialized model gives **noticeably better code-search results** than the default — this is the setup to aim for, and the GPU absorbs its heavier embed cost. +- **CUDA is essential for `jina-code`, not a nice-to-have.** On CPU the model is **too slow to be practical** — the cold embed of a real repo drags badly — and on Windows it can't use DirectML at all. So if a CUDA build is genuinely out of reach, **do not force `jina-code` onto CPU**; fall back to the default `embedder = "minilm-l6-v2"`, which is CPU-fast and good enough for symbol + semantic search. +- **Rule of thumb: CUDA → `jina-code`; no CUDA → `minilm-l6-v2`.** Never pair `jina-code` with CPU or DirectML. + +## Per-project setup + +```bash +cd +vex init # creates .vex.toml (add --agents-md to also emit an AGENTS.md for Cursor/Codex/Aider/Cline) +vex index --path . # plain index, sub-second, no downloads +# or, for semantic / concept-level search: +vex index --path . --semantic # downloads ~86 MB ONNX model on first run +# optional: bake a git-history sidecar so `vex history` is ~ms instead of shelling to git log: +vex index --path . --semantic --history +``` + +Recommended `.vex.toml` for serious use: + +```toml +semantic = true # enable meaning-based search (vex similar, vex duplicates, --semantic queries) +auto_update = true # incrementally refresh the index before each search when stale + +# Embedder — CUDA is the deciding factor (see "GPU / CUDA acceleration" above): +# CUDA available → jina-code (STRONGLY recommended: code-specialized 768-dim, best results) +# no CUDA (CPU/DML) → minilm-l6-v2 (default, CPU-fast; jina-code is too slow on CPU and can't use DirectML) +embedder = "jina-code" # alternatives: minilm-l6-v2 (default), bge-base-en-v1.5, bge-large-en-v1.5, mxbai-large +device = "cuda" # essential for jina-code; "auto" (default) falls back to CPU safely. Run `vex gpu` to probe. + +# Git history (v1.16): `vex history ` walks git log at QUERY TIME with no +# index and no config needed — it works even on a repo vex hasn't indexed yet, so +# there is intentionally no .vex.toml key to enable it. To make it FST-fast (~ms) +# on long-lived repos, build the opt-in indexed sidecar with `vex index --history` +# (adds ~5-30s to a cold build and ~10% to index size); `vex status` then shows a +# "History:" line. Re-run after a `vex self-update` so newer history extractors populate. +``` + +## Register the MCP server + +- **v1.15.0+ — one idempotent command:** + + ```bash + vex mcp install --agent claude-code # also: cursor, or `--agent all` to fan out across every supported agent + vex mcp list # show current entries across all supported agents + vex mcp uninstall --agent claude-code # idempotent removal + ``` + + `vex mcp install` defaults `--binary-path` to the `vex-mcp` sibling of the running `vex` (PATH fallback otherwise) and `--project-root` (→ `VEX_ROOT`) to the cwd. Use `--server-name` to register distinct entries per repo (`vex-api` / `vex-client`); `--dry-run` previews the write; `--force` overwrites a differing entry. Supported agents today: `claude-code`, `cursor` (with `codex-cli`, `windsurf`, `cline`, `continue`, `zed` landing). + +- **Manual equivalent** (still works — e.g. for a custom binary path or non-default scope): + + ```bash + # Windows + claude mcp add vex -s user -- "$HOME/.cargo/bin/vex-mcp.exe" + + # macOS / Linux + claude mcp add vex -s user -- "$HOME/.cargo/bin/vex-mcp" + ``` + + `vex-mcp` defaults `VEX_ROOT` to the current working directory, so a single global registration works for every project — individual tool calls can also override per-call via a `project_root` argument. + +## Command reference + +- **Key commands / MCP tools:** `search` (symbol + semantic, with scope/metadata filters — see below), `show` (full body), `usages` (`--strict` for binder-resolved refs), `callers`, `callees`, `implementations`, `outline`, `pattern` (ast-grep–style structural match), `similar` (semantic, requires `--semantic` index), `duplicates`, `check` (fast multi-symbol existence), `grep` (regex fallback, no index needed), `update` (incremental reindex), `status`, `capabilities` (machine-readable feature matrix). + +- **Call-graph & impact (call-graph index, ~ms queries):** `callers` / `callees` (direct), `paths --from A --to B` (all caller chains between two symbols), `reachable --target T` (everything that transitively reaches `T`), `bundle --mode {symbol,pr-impact,project}` (one call replaces the `show → callers → callees → similar` round trip; `pr-impact` bundles changed symbols + transitive callers + reachable tests for review), `diff --base ` (symbol-level git-rev diff). + +- **Git history (v1.16):** `history ` walks git log and returns every historical version of a symbol — query-time, no index needed, works even on un-indexed repos. Flags: `--diff` (unified diffs between consecutive versions), `--since` / `--until` (date), `--author`, `--kind`, `--branch`, `--depth`, `--limit`. Opt into an indexed sidecar with `vex index --history` for ~ms lookups. + +- **Search scope & metadata filters:** `vex search` and `vex usages` accept `--kind fn,method`, `--include` / `--exclude ''` (repeatable), `--since ` / `--since-branched` / `--changed-only` (restrict to git-changed files), plus on `search`: `--visibility pub|priv`, `--async-only` / `--no-async`, `--static-only`, `--sealed-only`, `--context-path ` (proximity boost), and `--why` (JSON trace of per-channel FST / BM25 / semantic hit counts — use when results look wrong). + +- **Maintenance / diagnostics:** `self-update` (in-place upgrade), `gpu [device]` (probe which execution provider actually engages; `--enable` pins `VEX_DEVICE`), `watch` (continuous incremental reindex), `completions `, `init --agents-md` (emit an `AGENTS.md`), `eval` (ranking-regression harness for CI). + +## Wire it into `CLAUDE.md` + +Installing the binaries and building an index isn't enough — Claude won't reliably prefer `vex` over `Grep`/`Read` unless a `CLAUDE.md` tells it to. + +**Start from upstream**, don't duplicate. The vex README maintains a canonical `CLAUDE.md` snippet — see [`README.md` → "CLAUDE.md Integration"](https://github.com/tenatarika/vex#claudemd-integration). Paste it into your **global** `~/.claude/CLAUDE.md` (applies to every project) or a **project-local** `./CLAUDE.md` (overrides global per repo). + +**Then graft on these local additions** — they go beyond what upstream covers and are worth keeping wherever you install vex: + +1. **MCP server line** — add this near the top of the snippet so Claude knows the MCP route exists: + + > Equivalent **MCP tools** are also available via the `vex` MCP server when registered (`vex mcp install --agent claude-code`, or manually `claude mcp add vex -s user`) — prefer them inside a Claude Code session for cheaper token output; use the Bash CLI from subagents, hooks, and scripts (and for CLI-only features like `pattern`, `paths`, `reachable`, `diff`, `history`, and the `--strict` / `--why` flags). + +2. **Skip-vex rule** — add to the Rules block: + + > **Skip vex** for non-code text — config files (YAML/JSON/TOML), docs, log strings, free-form prose — and for files in languages vex doesn't parse. Fall back to the next tool in the priority chain that's actually capable for the case (typically Grep/Glob for free-form text). + +3. **Subagent reminder** — add to the Rules block: + + > **Subagents** (Plan/Explore/general-purpose) don't inherit project CLAUDE.md or the parent's MCP toolset — explicitly tell them in the prompt to use `vex` via Bash for code search. + +4. **`auto_update` accuracy fix** — upstream says "Run `vex update` after modifying source files if `auto_update` is not enabled." With `auto_update = true` it runs before *each search*, so during a Claude session the manual update is rarely needed. Reword as: + + > **Run `vex update` manually after editing** only if `auto_update = false`; with `auto_update = true`, vex auto-refreshes before each search. + +5. **`--semantic` cost note** — append to the `vex index --semantic` line in the Indexing block: + + > downloads ~86 MB ONNX model on first run, enables `vex search --semantic`, `vex similar`, `vex duplicates`. + +**When to put it in global vs project CLAUDE.md:** + +- **Global (`~/.claude/CLAUDE.md`)** — preferred default once `vex-mcp` is registered at user scope. Vex becomes the recommendation for every project as soon as the index is built. +- **Project-local (`./CLAUDE.md`)** — for overrides (different fallback chain, monorepo-specific `--filter` paths, language-specific `--kind` defaults, or excluding vex on repos where it isn't set up). + +## Caveats (as of v1.16.0) + +vex is young but maturing fast — several earlier quirks are now fixed: `implementations` detects generic-parameterized subclasses (`class Foo(Base[T])`) since v1.7.0; `usages` is AST-precise by default with binder-resolved `--strict` refs (Rust/TS/Python/C#/C++) since v1.8.0 — no longer the text-flavored matcher it once was; prebuilt Windows binaries removed the build-from-source requirement. + +**Still invisible to `usages` in both modes:** decorator-based dispatch (`@router.get`), string-resolved targets (Celery task names, Uvicorn factory strings), reflection / `getattr`, and dynamic imports — before any rename or delete, backstop with `vex grep '\bName\b'`. Verify load-bearing results against `ast-index` or `Grep`. + +## vex vs AST Index — when each wins + +Both tools cover similar ground, but they're not interchangeable. For a point-in-time head-to-head on a real mixed-language repo (Python / Kotlin / TypeScript / JavaScript) with measured latency, quality differences, and version-pinned findings, see [`code-search-vex-vs-ast-index.md`](code-search-vex-vs-ast-index.md). Headline: keep `vex` as primary, fall back to `ast-index changed --base ` for code-review diffs (no vex equivalent) and to `ast-index symbol`/`usages` when vex's textual matches are too noisy.