# vex — Hybrid Structural + Semantic Code Search

> **Current release:** v1.16.0 · Companion doc: [code-search-vex-vs-ast-index.md](code-search-vex-vs-ast-index.md) (vex-vs-ast-index benchmark notes).
>
> Split out of [claude-code-tools.md](claude-code-tools.md) — that file keeps a short blurb and links here.

vex is a fast hybrid code search tool: it combines tree-sitter AST parsing, FST symbol lookup, BM25 body search, a persistent call graph, optional ONNX semantic embeddings, and (since v1.16) a query-time git-history walker. It ships as a CLI (`vex search`, `vex usages`, `vex callers`, `vex bundle`, `vex history`, `vex pattern`, etc.) plus an optional MCP server (`vex-mcp`) that exposes the same operations to Claude Code as 20+ structured tools. It works as a peer to, or a replacement for, AST Index when you want semantic / concept-level search, call-graph navigation, and symbol-level git archaeology in addition to exact symbol lookup.

- **Repository:** <https://github.com/tenatarika/vex>
- **Current release:** v1.16.0 — verify with `vex self-update --check`, upgrade in place with `vex self-update` (works on Windows/macOS/Linux).
- **Languages:** 18+ via tree-sitter — Python, TypeScript/JavaScript, Go, Rust, Java, Kotlin, C#, Ruby, Swift, C++, PHP, Bash, Lua, CSS, HTML, YAML, TOML, SQL, Markdown.

## Install

- **Linux / macOS:**

  ```bash
  brew tap tenatarika/tap && brew install vex
  ```

  The [release assets](https://github.com/tenatarika/vex/releases/latest) also ship prebuilt tarballs — as of v1.16.0, both `vex` and `vex-mcp` for `x86_64-unknown-linux-gnu`, `aarch64-apple-darwin`, and `x86_64-pc-windows-msvc`.

- **Windows (prebuilt, no Rust needed):**

  Recent releases (confirmed through v1.16.0) ship **prebuilt Windows binaries** — `vex-x86_64-pc-windows-msvc.tar.gz` and `vex-mcp-x86_64-pc-windows-msvc.tar.gz`. Download both, drop `vex.exe` + `vex-mcp.exe` onto your PATH (e.g. `%USERPROFILE%\.cargo\bin` or any PATH dir), then keep them current in place:

  ```bash
  vex self-update            # --check previews the latest version; -y skips the prompt
  ```

  `vex self-update` replaces the running binary in place on Windows, macOS, and Linux — no package manager required.

- **Build from source** (fallback — unreleased commits, or a **CUDA-enabled GPU build**; needs Rust ≥1.80):

  ```bash
  git clone https://github.com/tenatarika/vex.git "$HOME/Documents/vex"
  cd "$HOME/Documents/vex"
  cargo build --release --workspace        # add --features gpu-cuda for CUDA-accelerated embedding
  cp target/release/vex.exe target/release/vex-mcp.exe "$HOME/.cargo/bin/"   # drop .exe on macOS/Linux
  ```

  `~/.cargo/bin/` (i.e. `%USERPROFILE%\.cargo\bin` on Windows) is normally already on PATH when Rust was installed through rustup, so both binaries become globally callable immediately.
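
Whichever install route you used, a quick check that both binaries resolve from PATH can save a confusing MCP registration later. The sketch below is one way to do it; `check_on_path` is a hypothetical helper, not something vex ships, and the only vex command it calls is the `vex self-update --check` described on this page.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of vex): report whether each named binary
# resolves from PATH. Returns non-zero if any of them is missing.
check_on_path() {
  local missing=0 bin
  for bin in "$@"; do
    if command -v "$bin" >/dev/null 2>&1; then
      echo "ok: $bin -> $(command -v "$bin")"
    else
      echo "missing: $bin"
      missing=1
    fi
  done
  return "$missing"
}

# Only ask about newer releases once both binaries are found.
if check_on_path vex vex-mcp; then
  vex self-update --check
fi
```

If either binary is reported missing, fix PATH before registering the MCP server, since the registration step looks for `vex-mcp` next to `vex` or on PATH.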

## GPU / CUDA acceleration

Semantic indexing (`--semantic`) is the only GPU-accelerated path — the GPU absorbs the heavier embedding cost on cold/large builds. Search queries themselves are CPU-bound and already fast. vex picks an ONNX Runtime **execution provider** at embed time:

| Provider | Where it comes from | Notes |
|---|---|---|
| **CPU** | always available | fine for `minilm-l6-v2`; **impractically slow for `jina-code`** |
| **CUDA** | **source build only** — `cargo build --release --features gpu-cuda` + NVIDIA CUDA toolkit | the only GPU path that works with `jina-code` |
| **DirectML** | Windows prebuilt | **incompatible with `jina-code`** (see warning below) |
| **CoreML** | macOS prebuilt | Apple Silicon |

**Setup (NVIDIA + CUDA, Windows or Linux):**

1. Build vex with the CUDA feature — **the prebuilt binaries do not include CUDA**:

   ```bash
   git clone https://github.com/tenatarika/vex.git && cd vex
   cargo build --release --features gpu-cuda
   cp target/release/vex.exe target/release/vex-mcp.exe "$HOME/.cargo/bin/"   # drop .exe on Linux
   ```

   This requires a matching CUDA toolkit, and the ONNX Runtime CUDA execution provider must be available to the binary.

2. Verify the provider actually engages on this machine:

   ```bash
   vex gpu                 # actively probes every compiled EP; --enable pins VEX_DEVICE for all projects
   vex status              # should print:  GPU: yes (CUDA)
   ```

3. Index on the GPU:

   ```bash
   vex index --semantic --embedder jina-code --device cuda
   ```

**⚠️ `jina-code` is incompatible with DirectML (Windows).** The code-specialized `jina-code` embedder has ops the DirectML EP cannot place — forcing `--gpu` / `--device directml` fails with `DML EP can only be used with CPU EPs`. So on Windows, `jina-code` runs on **CUDA (source build)** or **CPU only** — never on the prebuilt DirectML path. The default `device = "auto"` handles this gracefully (falls back to CPU) and is safe for `auto_update` incremental rebuilds.

**Recommendation: use `jina-code` only with CUDA.**

- **With CUDA on an NVIDIA GPU (already built, or buildable): `embedder = "jina-code"` + `device = "cuda"`.** The 768-dim, code-specialized model gives **noticeably better code-search results** than the default, and the GPU absorbs its heavier embed cost. This is the setup to aim for.
- **Without CUDA: fall back to the default `embedder = "minilm-l6-v2"`,** which is CPU-fast and good enough for symbol + semantic search. Do not force `jina-code` onto CPU: the cold embed of a real repo is too slow to be practical, and on Windows the model can't use DirectML at all.
- **Rule of thumb: CUDA → `jina-code`; no CUDA → `minilm-l6-v2`.** Never pair `jina-code` with CPU or DirectML.
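
The rule of thumb can be turned into a small gate in a setup script. The sketch below assumes that `vex status` prints the `GPU: yes (CUDA)` line quoted in the setup steps when CUDA engages; `recommend_index_cmd` is a hypothetical helper, and it only prints the suggested command rather than running it.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of vex). Reads `vex status` output on stdin
# and prints the index command that the rule of thumb implies.
recommend_index_cmd() {
  # "GPU: yes (CUDA)" is the status line quoted in the CUDA setup steps.
  if grep -qF 'GPU: yes (CUDA)'; then
    echo "vex index --semantic --embedder jina-code --device cuda"
  else
    echo "vex index --semantic --embedder minilm-l6-v2"
  fi
}

# Run against the live status only when vex is installed.
if command -v vex >/dev/null 2>&1; then
  vex status | recommend_index_cmd
fi
```

Printing the command instead of executing it keeps the decision visible: anything other than a confirmed CUDA provider, including DirectML, lands on the default embedder.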

## Per-project setup

```bash
cd <your-project>
vex init                         # creates .vex.toml (add --agents-md to also emit an AGENTS.md for Cursor/Codex/Aider/Cline)
vex index --path .               # plain index, sub-second, no downloads
# or, for semantic / concept-level search:
vex index --path . --semantic    # downloads ~86 MB ONNX model on first run
# optional: bake a git-history sidecar so `vex history` is ~ms instead of shelling to git log:
vex index --path . --semantic --history
# thousands-of-commits repo? cap the one-time walk so the build stays fast (--history-depth implies --history):
vex index --path . --semantic --history-depth 500
```

Recommended `.vex.toml` for serious use:

```toml
semantic = true            # enable meaning-based search (vex similar, vex duplicates, --semantic queries)
auto_update = true         # incrementally refresh the index before each search when stale

# Embedder — CUDA is the deciding factor (see "GPU / CUDA acceleration" above):
#   CUDA available     → jina-code     (STRONGLY recommended: code-specialized 768-dim, best results)
#   no CUDA (CPU/DML)  → minilm-l6-v2  (default, CPU-fast; jina-code is too slow on CPU and can't use DirectML)
embedder = "jina-code"     # alternatives: minilm-l6-v2 (default), bge-base-en-v1.5, bge-large-en-v1.5, mxbai-large
device = "cuda"            # essential for jina-code; "auto" (default) falls back to CPU safely. Run `vex gpu` to probe.

# Git history (v1.16): `vex history <Symbol>` walks git log at QUERY TIME with no
# index and no config needed — it works even on a repo vex hasn't indexed yet, so
# there is intentionally no .vex.toml key to enable it. To make it FST-fast (~ms)
# on long-lived repos, build the opt-in indexed sidecar with `vex index --history`
# (adds ~5-30s to a cold build and ~10% to index size); `vex status` then shows a
# "History:" line. On a repo with thousands of commits, cap the build with
# `vex index --history-depth 500` (global newest-N, like `git log -n500`) so a
# full-history walk doesn't blow up index time. Re-run the history build after a `vex self-update`.
```

## Register the MCP server

- **v1.15.0+ — one idempotent command:**

  ```bash
  vex mcp install --agent claude-code     # also: cursor, or `--agent all` to fan out across every supported agent
  vex mcp list                            # show current entries across all supported agents
  vex mcp uninstall --agent claude-code   # idempotent removal
  ```

  `vex mcp install` defaults `--binary-path` to the `vex-mcp` sibling of the running `vex` (PATH fallback otherwise) and `--project-root` (→ `VEX_ROOT`) to the cwd. Use `--server-name` to register distinct entries per repo (`vex-api` / `vex-client`); `--dry-run` previews the write; `--force` overwrites a differing entry. Supported agents today: `claude-code` and `cursor`; support for `codex-cli`, `windsurf`, `cline`, `continue`, and `zed` is still landing.

- **Manual equivalent** (still works — e.g. for a custom binary path or non-default scope):

  ```bash
  # Windows
  claude mcp add vex -s user -- "$HOME/.cargo/bin/vex-mcp.exe"

  # macOS / Linux
  claude mcp add vex -s user -- "$HOME/.cargo/bin/vex-mcp"
  ```

  `vex-mcp` defaults `VEX_ROOT` to the current working directory, so a single global registration works for every project — individual tool calls can also override per-call via a `project_root` argument.
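
When one global entry is not enough, for example two repos that should stay separately addressable, the `--server-name` and `--project-root` options described above can be combined. The sketch below uses hypothetical checkout paths and a small `run` wrapper (not part of vex) that prints each command and executes it only when `RUN=1`, so it is safe to paste as-is.

```bash
#!/usr/bin/env bash
# Print each command; execute it only when RUN=1.
run() {
  printf '+ %s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Hypothetical checkout locations; substitute your own.
API_ROOT="$HOME/src/api"
CLIENT_ROOT="$HOME/src/client"

# One named entry per repo, each pinned to its own project root.
# --dry-run previews the config write without applying it.
run vex mcp install --agent claude-code --server-name vex-api --project-root "$API_ROOT" --dry-run
run vex mcp install --agent claude-code --server-name vex-client --project-root "$CLIENT_ROOT" --dry-run

# Show what is currently registered.
run vex mcp list
```

Drop `--dry-run` once the preview looks right; add `--force` only if an existing entry with the same name should be overwritten.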

## Command reference

- **Key commands / MCP tools:** `search` (symbol + semantic, with scope/metadata filters — see below), `show` (full body), `usages` (`--strict` for binder-resolved refs), `callers`, `callees`, `implementations`, `outline`, `pattern` (ast-grep–style structural match), `similar` (semantic, requires `--semantic` index), `duplicates`, `check` (fast multi-symbol existence), `grep` (regex fallback, no index needed), `update` (incremental reindex), `status`, `capabilities` (machine-readable feature matrix).

- **Call-graph & impact (call-graph index, ~ms queries):** `callers` / `callees` (direct), `paths --from A --to B` (all caller chains between two symbols), `reachable --target T` (everything that transitively reaches `T`), `bundle --mode {symbol,pr-impact,project}` (one call replaces the `show → callers → callees → similar` round trip; `pr-impact` bundles changed symbols + transitive callers + reachable tests for review), `diff --base <rev>` (symbol-level git-rev diff).

- **Git history (v1.16):** `history <Symbol>` walks git log and returns every historical version of a symbol at query time; it needs no index and works even on un-indexed repos. Flags: `--diff` (unified diffs between consecutive versions), `--since` / `--until` (date), `--author`, `--kind`, `--branch`, `--depth <N>` (max commits **per file**; lower it to cap query latency), `--limit <N>` (cap total results). Opt into an indexed sidecar with `vex index --history` for ~ms lookups.

  **Taming history cost on large repos:** the two depth knobs are different. `vex history --depth <N>` limits the *query-time* walk **per file**. `vex index --history-depth <N>` caps the *one-time index build* at the **global** newest-N commits (mirrors `git log -nN`, implies `--history`); both are unbounded by default. On a repo with thousands of commits, build with e.g. `vex index --history-depth 500` so a full-history walk doesn't blow up index time and size.

- **Search scope & metadata filters:** `vex search` and `vex usages` accept `--kind fn,method`, `--include` / `--exclude '<glob>'` (repeatable), `--since <rev>` / `--since-branched` / `--changed-only` (restrict to git-changed files), plus on `search`: `--visibility pub|priv`, `--async-only` / `--no-async`, `--static-only`, `--sealed-only`, `--context-path <file>` (proximity boost), and `--why` (JSON trace of per-channel FST / BM25 / semantic hit counts — use when results look wrong).

- **Maintenance / diagnostics:** `self-update` (in-place upgrade), `gpu [device]` (probe which execution provider actually engages; `--enable` pins `VEX_DEVICE`), `watch` (continuous incremental reindex), `completions <shell>`, `init --agents-md` (emit an `AGENTS.md`), `eval` (ranking-regression harness for CI).
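
A few of the commands above, put together as they might appear in a session. This is a sketch under assumptions: the symbol names and the glob are hypothetical, and passing the query to `search` as a positional argument is assumed by analogy with `history <Symbol>`. The `run` wrapper is not part of vex; it prints each command and executes it only when `RUN=1`.

```bash
#!/usr/bin/env bash
# Print each command; execute it only when RUN=1.
run() {
  printf '+ %s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Hypothetical symbol names.
SYMBOL="parse_config"
ENTRY="main"

# Scoped search: functions and methods under src/, with the --why trace
# (positional query is an assumption).
run vex search "$SYMBOL" --kind fn,method --include 'src/**' --why

# Call graph: caller chains between two symbols, then everything that
# transitively reaches the target.
run vex paths --from "$ENTRY" --to "$SYMBOL"
run vex reachable --target "$SYMBOL"

# History: diffs between versions, at most 50 commits per file, 20 results.
run vex history "$SYMBOL" --diff --depth 50 --limit 20
```

The history line uses the query-time `--depth` knob; the build-time cap is the separate `vex index --history-depth <N>` flag.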

## Wire it into `CLAUDE.md`

Installing the binaries and building an index isn't enough — Claude won't reliably prefer `vex` over `Grep`/`Read` unless a `CLAUDE.md` tells it to.

**Start from upstream**, don't duplicate. The vex README maintains a canonical `CLAUDE.md` snippet — see [`README.md` → "CLAUDE.md Integration"](https://github.com/tenatarika/vex#claudemd-integration). Paste it into your **global** `~/.claude/CLAUDE.md` (applies to every project) or a **project-local** `./CLAUDE.md` (overrides global per repo).

**Then graft on these local additions** — they go beyond what upstream covers and are worth keeping wherever you install vex:

1. **MCP server line** — add this near the top of the snippet so Claude knows the MCP route exists:

   > Equivalent **MCP tools** are also available via the `vex` MCP server when registered (`vex mcp install --agent claude-code`, or manually `claude mcp add vex -s user`) — prefer them inside a Claude Code session for cheaper token output; use the Bash CLI from subagents, hooks, and scripts (and for CLI-only features like `pattern`, `paths`, `reachable`, `diff`, `history`, and the `--strict` / `--why` flags).

2. **Skip-vex rule** — add to the Rules block:

   > **Skip vex** for non-code text — config files (YAML/JSON/TOML), docs, log strings, free-form prose — and for files in languages vex doesn't parse. Fall back to the next tool in the priority chain that's actually capable for the case (typically Grep/Glob for free-form text).

3. **Subagent reminder** — add to the Rules block:

   > **Subagents** (Plan/Explore/general-purpose) don't inherit project CLAUDE.md or the parent's MCP toolset — explicitly tell them in the prompt to use `vex` via Bash for code search.

4. **`auto_update` accuracy fix** — upstream says "Run `vex update` after modifying source files if `auto_update` is not enabled." With `auto_update = true` it runs before *each search*, so during a Claude session the manual update is rarely needed. Reword as:

   > **Run `vex update` manually after editing** only if `auto_update = false`; with `auto_update = true`, vex auto-refreshes before each search.

5. **`--semantic` cost note** — append to the `vex index --semantic` line in the Indexing block:

   > downloads ~86 MB ONNX model on first run, enables `vex search --semantic`, `vex similar`, `vex duplicates`.

**When to put it in global vs project CLAUDE.md:**

- **Global (`~/.claude/CLAUDE.md`)** — preferred default once `vex-mcp` is registered at user scope. Vex becomes the recommendation for every project as soon as the index is built.
- **Project-local (`./CLAUDE.md`)** — for overrides (different fallback chain, monorepo-specific `--include` / `--exclude` paths, language-specific `--kind` defaults, or excluding vex on repos where it isn't set up).

## Caveats (as of v1.16.0)

vex is young but maturing fast, and several earlier quirks are now fixed:

- `implementations` detects generic-parameterized subclasses (`class Foo(Base[T])`) since v1.7.0.
- `usages` is AST-precise by default, with binder-resolved `--strict` refs (Rust/TS/Python/C#/C++), since v1.8.0; it is no longer the text-flavored matcher it once was.
- Prebuilt Windows binaries removed the build-from-source requirement.

**Still invisible to `usages` in both modes:** decorator-based dispatch (`@router.get`), string-resolved targets (Celery task names, Uvicorn factory strings), reflection / `getattr`, and dynamic imports — before any rename or delete, backstop with `vex grep '\bName\b'`. Verify load-bearing results against `ast-index` or `Grep`.
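
The backstop can be wrapped into a single pre-rename check. In the sketch below, `pre_rename_check` and `run` are hypothetical helpers, the symbol name is made up, and passing the symbol to `usages` as a positional argument is an assumption; the `vex grep '\bName\b'` form is the one given above. Commands are printed, and executed only when `RUN=1`.

```bash
#!/usr/bin/env bash
# Print each command; execute it only when RUN=1.
run() {
  printf '+ %s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Hypothetical helper: list binder-resolved references first, then run the
# regex backstop for dynamic references that `usages` cannot see.
pre_rename_check() {
  local name="${1:?usage: pre_rename_check <Name>}"
  run vex usages "$name" --strict
  run vex grep "\\b${name}\\b"
}

pre_rename_check send_welcome_email
```

Hits that show up only in the `grep` pass are the ones to inspect by hand: they are typically string-resolved names, decorators, or reflection.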

## vex vs AST Index — when each wins

Both tools cover similar ground, but they're not interchangeable. For a point-in-time head-to-head on a real mixed-language repo (Python / Kotlin / TypeScript / JavaScript) with measured latency, quality differences, and version-pinned findings, see [`code-search-vex-vs-ast-index.md`](code-search-vex-vs-ast-index.md). Headline from that benchmark: keep `vex` as primary; fall back to `ast-index changed --base <branch>` for code-review diffs (listed there as having no vex equivalent; compare `diff --base <rev>` and `bundle --mode pr-impact` in the command reference above before relying on that) and to `ast-index symbol`/`usages` when vex's textual matches are too noisy.