docs(vex): document history --depth vs index --history-depth
Clarify the two git-history depth knobs and how to keep indexing fast on repos with thousands of commits: - vex history --depth N — query-time walk, per file (caps latency) - vex index --history-depth N — one-time build, global newest-N (like git log -nN) Added to the setup example, the .vex.toml history comment, and the command reference in vex.md.
This commit is contained in:
@@ -95,6 +95,8 @@ vex index --path . # plain index, sub-second, no downloads
|
||||
vex index --path . --semantic # downloads ~86 MB ONNX model on first run
|
||||
# optional: bake a git-history sidecar so `vex history` is ~ms instead of shelling to git log:
|
||||
vex index --path . --semantic --history
|
||||
# thousands-of-commits repo? cap the one-time walk so the build stays fast (--history-depth implies --history):
|
||||
vex index --path . --semantic --history-depth 500
|
||||
```
|
||||
|
||||
Recommended `.vex.toml` for serious use:
|
||||
@@ -114,7 +116,9 @@ device = "cuda" # essential for jina-code; "auto" (default) falls bac
|
||||
# there is intentionally no .vex.toml key to enable it. To make it FST-fast (~ms)
|
||||
# on long-lived repos, build the opt-in indexed sidecar with `vex index --history`
|
||||
# (adds ~5-30s to a cold build and ~10% to index size); `vex status` then shows a
|
||||
# "History:" line. Re-run after a `vex self-update` so newer history extractors populate.
|
||||
# "History:" line. On a repo with thousands of commits, cap the build with
|
||||
# `vex index --history-depth 500` (global newest-N, like `git log -n500`) so a
|
||||
# full-history walk doesn't blow up index time. Re-run after a `vex self-update`.
|
||||
```
|
||||
|
||||
## Register the MCP server
|
||||
@@ -147,7 +151,9 @@ device = "cuda" # essential for jina-code; "auto" (default) falls bac
|
||||
|
||||
- **Call-graph & impact (call-graph index, ~ms queries):** `callers` / `callees` (direct), `paths --from A --to B` (all caller chains between two symbols), `reachable --target T` (everything that transitively reaches `T`), `bundle --mode {symbol,pr-impact,project}` (one call replaces the `show → callers → callees → similar` round trip; `pr-impact` bundles changed symbols + transitive callers + reachable tests for review), `diff --base <rev>` (symbol-level git-rev diff).
|
||||
|
||||
- **Git history (v1.16):** `history <Symbol>` walks git log and returns every historical version of a symbol — query-time, no index needed, works even on un-indexed repos. Flags: `--diff` (unified diffs between consecutive versions), `--since` / `--until` (date), `--author`, `--kind`, `--branch`, `--depth`, `--limit`. Opt into an indexed sidecar with `vex index --history` for ~ms lookups.
|
||||
- **Git history (v1.16):** `history <Symbol>` walks git log and returns every historical version of a symbol — query-time, no index needed, works even on un-indexed repos. Flags: `--diff` (unified diffs between consecutive versions), `--since` / `--until` (date), `--author`, `--kind`, `--branch`, `--depth <N>` (max commits **per file** — bump down to cap query latency), `--limit <N>` (cap total results). Opt into an indexed sidecar with `vex index --history` for ~ms lookups.
|
||||
|
||||
**Taming history cost on large repos:** the two depth knobs are different. `vex history --depth <N>` limits the *query-time* walk **per file**. `vex index --history-depth <N>` caps the *one-time index build* at the **global** newest-N commits (mirrors `git log -nN`, implies `--history`); both are unbounded by default. On a repo with thousands of commits, build with e.g. `vex index --history-depth 500` so a full-history walk doesn't blow up index time and size.
|
||||
|
||||
- **Search scope & metadata filters:** `vex search` and `vex usages` accept `--kind fn,method`, `--include` / `--exclude '<glob>'` (repeatable), `--since <rev>` / `--since-branched` / `--changed-only` (restrict to git-changed files), plus on `search`: `--visibility pub|priv`, `--async-only` / `--no-async`, `--static-only`, `--sealed-only`, `--context-path <file>` (proximity boost), and `--why` (JSON trace of per-channel FST / BM25 / semantic hit counts — use when results look wrong).
|
||||
|
||||
|
||||
Reference in New Issue
Block a user