` and `` contain a Russian-formatted full date (`05 мая 2026`) — use as authoritative when ambiguous. |
| `IsLive` | event detail / listing | `[data-live="true"]` attribute. Live events also carry `.score-state` and `.time` elements with `2:1` and `83:30` style content. |
| `LiveScore` | event detail (live only) | `.score-state` text (`2:1 (1:1)` style). Inning breakdown: parse the `[data-json]` attribute of the hidden `[data-mutable-id="eventJsonInfo"]` element; the JSON includes `mainScore`, `inningScore[]`, `matchTime.seconds`, `matchIsComplete`. |
| `MatchIsComplete` | event detail | Decoded JSON of `[data-mutable-id="eventJsonInfo"][data-json]` → `.matchIsComplete` boolean. Critical for Phase 8 (Results loader). |
| `FinalScore` | event detail (post-match) | Same `eventJsonInfo` JSON → `.resultDescription` (e.g., `"2:1 (1:1)"`) when `matchIsComplete=true`. |
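
A minimal sketch of decoding the `eventJsonInfo` payload, assuming the `[data-json]` attribute value has already been extracted and HTML-decoded by the caller. `EventJsonInfo` and `EventJsonInfoParser` are illustrative names, not part of the spec, and only the two properties named in the table are read; the JSON types of the remaining properties are not assumed here.

```csharp
using System.Text.Json;

// Hypothetical carrier type for the two fields the Results loader needs.
public sealed record EventJsonInfo(bool MatchIsComplete, string? ResultDescription);

public static class EventJsonInfoParser
{
    // `dataJson` is the already HTML-decoded value of the [data-json] attribute.
    // Returns null instead of throwing when the payload is missing or malformed.
    public static EventJsonInfo? Parse(string? dataJson)
    {
        if (string.IsNullOrWhiteSpace(dataJson)) return null;
        try
        {
            using JsonDocument doc = JsonDocument.Parse(dataJson);
            JsonElement root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object) return null;

            bool complete = root.TryGetProperty("matchIsComplete", out JsonElement c)
                            && c.ValueKind == JsonValueKind.True;

            // FinalScore is only meaningful once the match is complete.
            string? result = null;
            if (complete
                && root.TryGetProperty("resultDescription", out JsonElement r)
                && r.ValueKind == JsonValueKind.String)
            {
                result = r.GetString();
            }

            return new EventJsonInfo(complete, result);
        }
        catch (JsonException)
        {
            return null;
        }
    }
}
```

A caller would map `ResultDescription` to `FinalScore` only when `MatchIsComplete` is true and treat a `null` return as "payload unavailable".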
---
## 2. Match-Scope Bets (1×2, Handicap, Total)
The event-detail "main row" presents three primary markets in a `coefficients-table`:
**Result** (1×2), **Handicap** (Win-Fora), **Total** (Goals/Points/Games depending
on sport). These map to spec fields `Bet_Match_*`.
### 2.1 Match Win 1 / Draw / Win 2
| Spec field | data-selection-key suffix | DOM path |
|---|---|---|
| `Bet_Match_Win_1` | `@Match_Result.1` (football, tennis, hockey) **OR** `@Result.1` (basketball pre-match) **OR** `@Normal_Time_Result.1` (basketball detail) | `evt span[data-selection-key$='@Match_Result.1']@data-selection-price` (decimal odds, e.g., `1.65`) |
| `Bet_Match_Draw` | `.draw` outcome of same market | `evt span[data-selection-key$='@Match_Result.draw']@data-selection-price`. **NULL for tennis** (2-way market, no draw). |
| `Bet_Match_Win_2` | `.3` outcome | `evt span[data-selection-key$='@Match_Result.3']@data-selection-price` |
**Sport variance:**
- Football, Tennis, Table-tennis: `Match_Result`.
- Basketball: on the pre-match landing page the market is `Match_Winner_Including_All_OT.HB_H/HB_A`
  (2-way, OT included). On the detail page, both `Normal_Time_Result.{1,draw,3}` (3-way,
  regulation time) and `Match_Winner_Including_All_OT.{HB_H,HB_A}` (2-way, OT included) appear.
  **Recommendation:** when the 3-way `Normal_Time_Result` market is present, use it (Draw
  included); otherwise treat `Match_Winner_Including_All_OT` as the canonical Win-1 / Win-2
  with a null Draw.
- Hockey: TBD — verify in Phase 3 with an actual hockey event capture.
**Recommendation for Phase 1 domain:** define `BetType.WinDraw` allowing nullable
`Draw`. The Excel exporter writes an empty cell when `Draw` is null.
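
The sport variance above can be sketched as a prioritized suffix lookup. This is an illustration under stated assumptions: the input is a dictionary of `data-selection-key` to `data-selection-price` already extracted from the DOM, the candidate order (3-way markets first, then the 2-way OT market) is one reading of the recommendation, and `MatchResultOdds` / `MatchResultResolver` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public sealed record MatchResultOdds(decimal? Win1, decimal? Draw, decimal? Win2);

public static class MatchResultResolver
{
    // Candidate (market, home, draw, away) tokens, 3-way markets first.
    // A null draw code marks a 2-way market.
    private static readonly (string Market, string Home, string? Draw, string Away)[] Candidates =
    {
        ("Match_Result", "1", "draw", "3"),
        ("Normal_Time_Result", "1", "draw", "3"),
        ("Result", "1", "draw", "3"),
        ("Match_Winner_Including_All_OT", "HB_H", null, "HB_A"),
    };

    public static MatchResultOdds Resolve(IReadOnlyDictionary<string, decimal> pricesByKey)
    {
        foreach (var c in Candidates)
        {
            decimal? win1 = Find(pricesByKey, c.Market, c.Home);
            decimal? win2 = Find(pricesByKey, c.Market, c.Away);
            if (win1 is null && win2 is null) continue;   // market not offered, try the next one

            decimal? draw = c.Draw is null ? (decimal?)null : Find(pricesByKey, c.Market, c.Draw);
            return new MatchResultOdds(win1, draw, win2);
        }
        return new MatchResultOdds(null, null, null);       // nothing found: all cells stay empty
    }

    // Mirrors the CSS `$=` (ends-with) selector; the leading '@' keeps
    // "@Result.1" from matching "@Match_Result.1".
    private static decimal? Find(IReadOnlyDictionary<string, decimal> prices, string market, string outcome)
    {
        string suffix = "@" + market + "." + outcome;
        foreach (var kv in prices)
        {
            if (kv.Key.EndsWith(suffix, StringComparison.Ordinal)) return kv.Value;
        }
        return null;
    }
}
```

For a tennis event the `.draw` key is simply absent, so `Draw` comes back `null` and the exporter leaves the cell empty.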
### 2.2 Match Win Fora (handicap)
| Spec field | data-selection-key suffix | DOM path | Value source |
|---|---|---|---|
| `Bet_Match_Win_Fora_1_Value` | — | (no selection key for the value alone) | `.middle-simple` text next to the HB_H selection (e.g., `(-1.0)`). Strip parens, parse as `decimal`. |
| `Bet_Match_Win_Fora_1_Rate` | `@To_Win_Match_With_Handicap{N}.HB_H` (or `@Match_Handicap.HB_H` variant) | `[data-selection-key$='@To_Win_Match_With_Handicap.HB_H']@data-selection-price` | — |
| `Bet_Match_Win_Fora_2_Value` | — | (no selection key for the value alone) | `.middle-simple` text next to the HB_A selection (e.g., `(+1.0)`). Strip parens, parse as `decimal`. |
| `Bet_Match_Win_Fora_2_Rate` | `@To_Win_Match_With_Handicap{N}.HB_A` | `[data-selection-key$='@To_Win_Match_With_Handicap.HB_A']@data-selection-price` | — |
**Tennis variant:** uses `@To_Win_Match_With_Handicap_By_Games{N}.HB_H/HB_A`.
The handicap is in **games** not points — emit `Value` as-is, the unit is implicit
in the sport.
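
The "strip parens, parse as `decimal`" step could look like the sketch below. `HandicapValueParser` is an illustrative name; tolerating a typographic minus sign and a decimal comma are defensive assumptions, not behaviour observed in the capture.

```csharp
using System.Globalization;

public static class HandicapValueParser
{
    // Converts ".middle-simple" text such as "(-1.0)" or "(+1.0)" to a decimal.
    // Returns null (never throws) when the text is missing or not numeric.
    public static decimal? Parse(string? text)
    {
        if (string.IsNullOrWhiteSpace(text)) return null;

        string cleaned = text.Trim().Trim('(', ')').Trim()
            .Replace('\u2212', '-')   // assumption: a typographic minus may appear
            .Replace(',', '.');       // assumption: tolerate a decimal comma

        const NumberStyles styles = NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint;
        if (decimal.TryParse(cleaned, styles, CultureInfo.InvariantCulture, out decimal value))
        {
            return value;
        }
        return null;
    }
}
```

Parsing with `CultureInfo.InvariantCulture` keeps the result independent of the host machine's regional settings, which matters if the scraper runs under a Russian locale. The same helper works for total thresholds such as `213.5`.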
**Multi-line handicap:** the site offers many lines (`To_Win_Match_With_Handicap0`,
`...1`, `...2`, ...), each a different handicap value. The customer spec wants only
the **main line** (the one displayed in the listing's main row). Phase 3 should:
1. On listing pages, take the handicap displayed in the `coefficients-table`
`data-market-type="HANDICAP"` cell.
2. On event detail, identify the "main" line. The sample shows both the no-suffix key
   (`To_Win_Match_With_Handicap.HB_H`) and the suffix-`0` key (`...0.HB_H`), so two
   candidate heuristics exist: (a) explicitly prefer the no-suffix variant and fall
   back to suffix `0`; or (b) pick the line whose handicap value is closest to ±1.0
   from the favorite.
3. Optional: capture the full handicap ladder into a separate normalized table
so anomaly detection can use the spread, even if Excel only exports the main line.
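
One way to express the "prefer no-suffix, fall back to suffix `0`" option from step 2 is shown below. It is a sketch only: it handles the generic `To_Win_Match_With_Handicap` token (not the tennis `_By_Games` variant), and `MainHandicapLine` is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MainHandicapLine
{
    // Captures the optional numeric line suffix: "" (no suffix), "0", "1", ...
    private static readonly Regex KeyPattern = new Regex(
        @"@To_Win_Match_With_Handicap(?<line>\d*)\.HB_H$", RegexOptions.CultureInvariant);

    // Returns the selection key of the main home-handicap line, or null if none qualifies.
    public static string? Pick(IEnumerable<string> selectionKeys)
    {
        string? suffixZero = null;
        foreach (string key in selectionKeys)
        {
            Match m = KeyPattern.Match(key);
            if (!m.Success) continue;

            string line = m.Groups["line"].Value;
            if (line.Length == 0) return key;                 // no-suffix variant wins outright
            if (line == "0" && suffixZero is null) suffixZero = key;
        }
        return suffixZero;                                    // may be null: caller emits empty cells
    }
}
```

The away-side key follows by replacing `.HB_H` with `.HB_A` on the returned key. If the customer chooses a different rule (§8, item 3), only this helper needs to change.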
### 2.3 Match Total Less / More
| Spec field | data-selection-key suffix | DOM path |
|---|---|---|
| `Bet_Match_Total_Less_Value` | — | `.middle-simple` next to the `Меньше` ("Less") selection (e.g., `3.5`, `213.5`). |
| `Bet_Match_Total_Less_Rate` | `@Total_{Goals\|Points\|Games}{N}.Under_` | `[data-selection-key^='@Total_'][data-selection-key$='.Under_']@data-selection-price`. Use the row whose Value equals the chosen total threshold. |
| `Bet_Match_Total_More_Value` | — | Same value as Less (paired). |
| `Bet_Match_Total_More_Rate` | `@Total_{Goals\|Points\|Games}{N}.Over_` | `[data-selection-key$='.Over_']@data-selection-price` |
**Sport vocabulary:**
- Football: `Total_Goals`
- Basketball: `Total_Points`
- Tennis: `Total_Games`
- Hockey: `Total_Goals` (TBD)
- Volleyball / handball: TBD
**Choosing the "main" total line:** customer spec wants ONE Total Value + Less/More
rates per event. The site offers ~20 different total thresholds per event. The
listing page main row exposes the "headline" total (the one the bookmaker chose
to show). **Heuristic:**
1. On listing: read the `data-market-type="TOTAL"` cell directly.
2. On event detail: find the row labeled in `coefficients-row` (visible main view),
not in `coefficients-hidden-row`. The `data-mutable-id="S_3_1_european"` /
`S_3_3_european` pair is the main line.
3. Fall back to picking the line whose Under/Over rates are closest to **2.00**
each (the "balanced" line — most representative of bookmaker's expectation).
4. As with handicap, capture the full ladder for analysis even if exports only one row.
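
The "balanced line" fallback in step 3 can be sketched as a minimum-distance search over the captured ladder. `TotalLine` and `MainTotalLine` are illustrative names, and summing the two absolute distances from 2.00 is one reasonable reading of "closest to 2.00 each", not a rule taken from the spec.

```csharp
using System;
using System.Collections.Generic;

// One rung of the total ladder: threshold plus its Under/Over decimal odds.
public sealed record TotalLine(decimal Value, decimal UnderRate, decimal OverRate);

public static class MainTotalLine
{
    // Returns the line whose Under and Over rates are jointly closest to 2.00,
    // or null when the ladder is empty.
    public static TotalLine? PickBalanced(IEnumerable<TotalLine> ladder)
    {
        TotalLine? best = null;
        decimal bestDistance = decimal.MaxValue;

        foreach (TotalLine line in ladder)
        {
            decimal distance = Math.Abs(line.UnderRate - 2.00m) + Math.Abs(line.OverRate - 2.00m);
            if (distance < bestDistance)
            {
                best = line;
                bestDistance = distance;
            }
        }
        return best;
    }
}
```

With a ladder of `2.5 (2.60 / 1.45)`, `3.5 (1.95 / 1.85)` and `4.5 (1.30 / 3.20)`, the distances are 1.15, 0.20 and 1.90, so the `3.5` line is selected.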
---
## 3. Period-N Scope Bets
Period markets follow the same pattern as match markets but with a period prefix
in the market token. Examples for `Period-1` (1st half of football, 1st quarter
of basketball, 1st set of tennis):
### 3.1 Period-N Win 1 / Draw / Win 2
> **CORRECTED FROM CAPTURE EVIDENCE (2026-05-05):** Period result markets use
> `RN_H` / `RN_D` / `RN_A` outcome codes (Reduced Numerals: Home / Draw / Away),
> NOT the `1` / `draw` / `3` codes used by `@Match_Result`. Market names also
> vary: football uses `Result_-_1st_Half` (with separator dashes); basketball and
> tennis use `1st_Half_Result0` / `1st_Quarter_Result0` / `1st_Set_Result0`
> (note the literal `0` suffix on the market name — line index for the period
> result market). Phase 3 parser must use these exact tokens.
| Customer field | Football (1st Half) | Basketball (1st Half *or* Quarter) | Tennis (1st Set) | Hockey (1st Period) |
|---|---|---|---|---|
| `Bet_Period-1_Win_1` | `@Result_-_1st_Half.RN_H` | `@1st_Half_Result0.RN_H` (halves) **or** `@1st_Quarter_Result0.RN_H` (quarters) | `@1st_Set_Result0.RN_H` | `@1st_Period_Result0.RN_H` (TBD verify on hockey event) |
| `Bet_Period-1_Draw` | `@Result_-_1st_Half.RN_D` | `@1st_Half_Result0.RN_D` / `@1st_Quarter_Result0.RN_D` | (NULL — no draw) | `@1st_Period_Result0.RN_D` (TBD) |
| `Bet_Period-1_Win_2` | `@Result_-_1st_Half.RN_A` | `@1st_Half_Result0.RN_A` / `@1st_Quarter_Result0.RN_A` | `@1st_Set_Result0.RN_A` | `@1st_Period_Result0.RN_A` (TBD) |
The market token vocabulary differs by sport:
- **Football:** `Result_-_{Nth}_{Period}` (e.g., `Result_-_1st_Half`, `Result_-_2nd_Half`).
- **Basketball / Tennis / Hockey:** `{Nth}_{Period}_Result0` (e.g.,
  `1st_Half_Result0`, `1st_Quarter_Result0`, `1st_Set_Result0`,
  `1st_Period_Result0`). The `0` suffix is required.
- **Note:** non-period markets like `@Match_Result.1` and `@Match_Result.draw`
still use the `1`/`draw`/`3` outcome codes — the `RN_*` codes are specific to
period/half/quarter/set markets.
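
The two token shapes above could be generated by a small helper like the following sketch. `SportKind` and `PeriodResultKey` are hypothetical names; ordinals beyond those seen in the capture (`3rd`, `4th`, ...) are extrapolated and should be verified against real events.

```csharp
using System;

public enum SportKind { Football, Basketball, Tennis, Hockey }

public static class PeriodResultKey
{
    // Builds the data-selection-key suffix for a period result outcome.
    // periodName is "Half", "Quarter", "Set" or "Period"; outcome is "RN_H", "RN_D" or "RN_A".
    public static string Suffix(SportKind sport, int n, string periodName, string outcome)
    {
        string nth = Ordinal(n);
        string market = sport == SportKind.Football
            ? $"Result_-_{nth}_{periodName}"      // football: prefix form with separator dashes
            : $"{nth}_{periodName}_Result0";      // other sports: suffix form with the literal 0
        return $"@{market}.{outcome}";
    }

    // Assumption: the site uses English ordinals for every period number.
    private static string Ordinal(int n) => n switch
    {
        1 => "1st",
        2 => "2nd",
        3 => "3rd",
        _ when n >= 4 => n + "th",
        _ => throw new ArgumentOutOfRangeException(nameof(n)),
    };
}
```

For example, `PeriodResultKey.Suffix(SportKind.Football, 1, "Half", "RN_H")` yields `@Result_-_1st_Half.RN_H`, and `PeriodResultKey.Suffix(SportKind.Tennis, 1, "Set", "RN_A")` yields `@1st_Set_Result0.RN_A`.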
**Period count by sport** (default mapping for `Period-N`):
- Football: N ∈ {1, 2}
- Basketball: configurable — halves (N ∈ {1,2}) or quarters (N ∈ {1,2,3,4}). **Default to halves.**
- Tennis: N ∈ {1, 2, ...} until the `{Nth}_Set_Result0` selection for that set is absent. Cap at 5 for Grand Slams.
- Hockey: N ∈ {1, 2, 3}.
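
The default mapping can be captured in a single lookup. This is a sketch with hypothetical names (`SportKind`, `BasketballPeriodMode`, `PeriodCounts`); the real domain model keys sports by an integer code whose values are not listed here.

```csharp
public enum SportKind { Football, Basketball, Tennis, Hockey }
public enum BasketballPeriodMode { Halves, Quarters }

public static class PeriodCounts
{
    // Upper bound for Period-N per sport; the parser still emits null
    // for any period whose market is absent in the HTML.
    public static int MaxPeriodForSport(
        SportKind sport,
        BasketballPeriodMode basketballMode = BasketballPeriodMode.Halves) => sport switch
    {
        SportKind.Football => 2,
        SportKind.Basketball => basketballMode == BasketballPeriodMode.Quarters ? 4 : 2,
        SportKind.Tennis => 5,   // cap; stop earlier when a set market is absent
        SportKind.Hockey => 3,
        _ => 0,
    };
}
```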
### 3.2 Period-N Win Fora
Same as match handicap, with period prefix:
| Sport | Selection key |
|---|---|
| Football | `@To_Win_1st_Half_With_Handicap{N}.HB_H` / `.HB_A` |
| Basketball | `@To_Win_1st_Half_With_Handicap{N}.HB_*` (or `_1st_Quarter_`) |
| Tennis | `@To_Win_1st_Set_With_Handicap{N}.HB_*` |
| Hockey | `@To_Win_1st_Period_With_Handicap{N}.HB_*` (TBD verify) |
Value extraction: same `.middle-simple` text as match handicap.
### 3.3 Period-N Total Less / More
This is the **least uniform** market. Observed:
| Sport | Period-1 Total selection key |
|---|---|
| Football | Key not yet confirmed: Phase 3 should parse the "Тотал тайма" ("Half total") tab and search the HTML directly. Likely `@1st_Half_Total_Goals{N}.Under_` / `.Over_`. |
| Basketball | Per-quarter total is exposed as a separate market in the "Тоталы" ("Totals") tab; the sample event did not show clean `1st_Half_Total_Points` keys (see SCRAPE_FINDINGS.md §6 risk #4). **May need to fall back to NULL** for basketball Period-N Total in some leagues. |
| Tennis | `@1st_Set_Total_Games{N}.Under_` / `.Over_` — confirmed in sample. |
| Hockey | `@1st_Period_Total_Goals...` (TBD verify). |
**Phase 3 robustness rule:** if a period-N market is absent in the parsed HTML,
emit `null` for the corresponding rate/value. Never throw. The Excel exporter
then writes an empty cell.
---
## 4. Live Counterparts
When the same scope is captured from the **live** site (`/su/live` or live-flagged
events on `/su/`), the spec wants column prefix `Live_*` instead of `Bet_*`.
**Important:** live events use the SAME `data-selection-key` naming conventions.
The distinguishing signal is `data-live="true"` on the outer `div.coupon-row` and
the URL the snapshot was scraped from (`/su/live`).
Examples:
- `Live_Match_Win_1` ← `[data-selection-key$='@Match_Result.1']` from live page
- `Live_Match_Win_Fora_1_Value`, `Live_Match_Win_Fora_1_Rate` ← same DOM, same logic
- `Live_Period-1_Win_1` ← same as `Bet_Period-1_Win_1` but captured from live event
**Implementation:** the parser does not change. The application service simply
records `Source = Live | PreMatch` on each `OddsSnapshot` and the Excel exporter
denormalizes pre-match snapshots to `Bet_*` columns and live snapshots to `Live_*`
columns at write time.
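
As a small illustration of the write-time denormalization, the column name can be derived from the snapshot source alone. `ColumnNames` is a hypothetical helper; `SnapshotSource` mirrors the enum suggested in §6.

```csharp
public enum SnapshotSource { PreMatch, Live }

public static class ColumnNames
{
    // periodNumber == null means match scope.
    // field is the remainder of the spec name, e.g. "Win_1", "Draw", "Win_Fora_1_Value".
    public static string Build(SnapshotSource source, int? periodNumber, string field)
    {
        string prefix = source == SnapshotSource.PreMatch ? "Bet_" : "Live_";
        string scope = periodNumber is int n ? $"Period-{n}" : "Match";
        return $"{prefix}{scope}_{field}";
    }
}
```

`ColumnNames.Build(SnapshotSource.Live, null, "Win_1")` gives `Live_Match_Win_1`, and `ColumnNames.Build(SnapshotSource.PreMatch, 1, "Win_1")` gives `Bet_Period-1_Win_1`, so the parser output stays identical for both sources.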
---
## 5. Field Coverage Matrix (spec → confidence)
| Field family | Football | Basketball | Tennis | Hockey | Notes |
|---|---|---|---|---|---|
| `Match_Win_1/2`, `Match_Draw` | ✅ confirmed | ⚠️ Win-1/2 confirmed; Draw conditional on `Normal_Time_Result` presence | ✅ Win-1/2 confirmed; **Draw is null** | ❓ verify Phase 3 | — |
| `Match_Win_Fora_*` | ✅ | ✅ | ✅ (in games) | ❓ | "Main line" heuristic needed (§2.2) |
| `Match_Total_*` | ✅ Goals | ✅ Points | ✅ Games | ❓ | "Main line" heuristic needed (§2.3) |
| `Period-1_Win_*` | ✅ Half | ✅ Half / Quarter | ✅ Set | ❓ Period | basketball mode is configurable |
| `Period-1_Win_Fora_*` | ✅ | ✅ | ✅ | ❓ | — |
| `Period-1_Total_*` | ⚠️ structure verified, exact key TBD | ⚠️ may be absent for some games | ✅ Set | ❓ | risk: emit null where absent |
| `Period-2/3/4_*` | (Period-2 only) | ✅ all | up to actual played sets | ❓ | — |
| `Live_*` (any of above) | same parser | same | same | same | distinguished only by `data-live` flag + scrape URL |
Legend: ✅ confirmed in spike sample, ⚠️ partial / heuristic needed, ❓ Phase 3 must verify.
---
## 6. Suggested Domain Types (Phase 1 input)
```csharp
// Marathon.Domain
using System;
using System.Collections.Generic;

public enum BetScope { Match, Period }
public enum BetType { Win, Draw, WinFora, Total }
public enum BetSide { Side1, Side2, Less, More } // Side1=home/W1, Side2=away/W2
public enum SnapshotSource { PreMatch, Live }

public sealed record Sport(int Code, string NameRu, string NameEn);

public sealed record League(int TreeId, string NameRu, int SportCode);

public sealed record Event(
    long EventCode,             // marathonbet's data-event-eventId
    int TreeId,                 // for URL building
    int SportCode,
    int LeagueTreeId,
    string Country,             // breadcrumb position 3
    string? Category,           // joined breadcrumb 5..N-1
    string Team1,
    string Team2,
    DateTimeOffset ScheduledAt, // anchored on initData.serverTime
    string DetailUrl);

public sealed record Bet(
    BetScope Scope,
    int? PeriodNumber,          // null when Scope=Match
    BetType Type,
    BetSide? Side,              // null for Type=Draw
    decimal? Value,             // handicap/total threshold; null for Win/Draw
    decimal Rate);              // decimal odds (e.g., 1.65)

public sealed record OddsSnapshot(
    long EventCode,
    DateTimeOffset CapturedAt,
    SnapshotSource Source,      // PreMatch | Live
    IReadOnlyList<Bet> Bets);
```
Phase 1 will refine names, but this captures the data shape Phase 3 produces.
---
## 7. Excel Column Generation (Phase 4 / 9 reference)
The Excel exporter generates wide rows by joining all `Bet`s of an `OddsSnapshot`
into named columns. Pseudocode:
```
foreach snapshot:
    row.EventCode   = snapshot.EventCode
    row.SportCode   = event.SportCode
    row.Sport       = event.Sport.NameRu
    row.Country     = event.Country
    row.League      = event.League.NameRu
    row.Category    = event.Category
    row.ScheduledAt = event.ScheduledAt
    prefix = snapshot.Source == PreMatch ? "Bet_" : "Live_"

    // Match scope
    row[prefix+"Match_Win_1"]            = bet.Where(scope=Match, type=Win, side=Side1).Rate
    row[prefix+"Match_Draw"]             = bet.Where(scope=Match, type=Draw).Rate
    row[prefix+"Match_Win_2"]            = bet.Where(scope=Match, type=Win, side=Side2).Rate
    row[prefix+"Match_Win_Fora_1_Value"] = bet.Where(scope=Match, type=WinFora, side=Side1).Value
    row[prefix+"Match_Win_Fora_1_Rate"]  = bet.Where(scope=Match, type=WinFora, side=Side1).Rate
    row[prefix+"Match_Win_Fora_2_Value"] = bet.Where(scope=Match, type=WinFora, side=Side2).Value
    row[prefix+"Match_Win_Fora_2_Rate"]  = bet.Where(scope=Match, type=WinFora, side=Side2).Rate
    row[prefix+"Match_Total_Less_Value"] = bet.Where(scope=Match, type=Total, side=Less).Value
    row[prefix+"Match_Total_Less_Rate"]  = bet.Where(scope=Match, type=Total, side=Less).Rate
    row[prefix+"Match_Total_More_Value"] = bet.Where(scope=Match, type=Total, side=More).Value
    row[prefix+"Match_Total_More_Rate"]  = bet.Where(scope=Match, type=Total, side=More).Rate

    // Period scope (foreach period N exposed for that sport)
    for N in 1..MaxPeriodForSport(sportCode):
        same fields with key {prefix}Period-{N}_*
        null when bet absent
```
Spec column order is left to Phase 4 (`ExcelExporter`). Recommend:
`Date, Time, Sport, Country, League, Category, Event, EventCode,
Bet_Match_*..., Bet_Period-1_*..., Bet_Period-2_*..., Live_Match_*..., Live_Period-N_*...`
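
The pseudocode above could be realized roughly as follows. This is a sketch, not the Phase 4 exporter: the domain types are repeated from §6 so the block stands alone, `RowFlattener` is a hypothetical name, and only the odds columns are produced (event metadata and column ordering are left out).

```csharp
using System;
using System.Collections.Generic;

public enum BetScope { Match, Period }
public enum BetType { Win, Draw, WinFora, Total }
public enum BetSide { Side1, Side2, Less, More }
public enum SnapshotSource { PreMatch, Live }

public sealed record Bet(BetScope Scope, int? PeriodNumber, BetType Type,
                         BetSide? Side, decimal? Value, decimal Rate);

public static class RowFlattener
{
    // Maps each bet to its spec column name(s). Columns for absent bets are
    // simply missing from the dictionary, which the exporter treats as empty cells.
    public static Dictionary<string, decimal?> Flatten(SnapshotSource source, IEnumerable<Bet> bets)
    {
        var row = new Dictionary<string, decimal?>(StringComparer.Ordinal);
        string prefix = source == SnapshotSource.PreMatch ? "Bet_" : "Live_";

        foreach (Bet bet in bets)
        {
            string scope = bet.Scope == BetScope.Match ? "Match" : $"Period-{bet.PeriodNumber}";
            string stem = $"{prefix}{scope}_";

            switch (bet.Type)
            {
                case BetType.Win:
                    row[stem + (bet.Side == BetSide.Side1 ? "Win_1" : "Win_2")] = bet.Rate;
                    break;
                case BetType.Draw:
                    row[stem + "Draw"] = bet.Rate;
                    break;
                case BetType.WinFora:
                    string fora = stem + (bet.Side == BetSide.Side1 ? "Win_Fora_1" : "Win_Fora_2");
                    row[fora + "_Value"] = bet.Value;
                    row[fora + "_Rate"] = bet.Rate;
                    break;
                case BetType.Total:
                    string total = stem + (bet.Side == BetSide.Less ? "Total_Less" : "Total_More");
                    row[total + "_Value"] = bet.Value;
                    row[total + "_Rate"] = bet.Rate;
                    break;
            }
        }
        return row;
    }
}
```

Using ordinal, case-sensitive keys keeps the generated names byte-for-byte equal to the spec's field names.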
---
## 8. Decisions Pending Customer Confirmation
1. **Basketball Period mapping** — halves (default) or quarters? Spec says
"Period-N" but is silent on which N applies. Recommend halves (`N ∈ {1,2}`)
with a quarter mode opt-in via `appsettings.Sports.Basketball.PeriodMode`.
2. **Tennis Draw column** — emit empty / 0 / "—"? Recommend empty cell.
3. **Handicap "main line" rule** — pick the listing's main row, OR the no-suffix
selection, OR the spread closest to bookmaker-implied probability 50/50?
4. **Total "main line" rule** — same as above.
5. **Field name capitalization** — spec uses `Bet_Match_Win_Fora_1_Value` exactly.
Recommend matching exactly (case-sensitive) for compatibility with downstream
pivot tables / scripts.