alexei.dolgolyov 2acbaa5b77 feat(phase-4): application layer + background workers — 202/202 tests green
Use cases (Marathon.Application/UseCases/):
- PullUpcomingEventsUseCase: scrape + persist new events + capture pre-match snapshots
- PullLiveOddsUseCase: refresh live snapshots for all stored events
- PullResultsUseCase: Phase 4 scaffold; delegates to ScrapeResultsAsync (Phase 3 no-op);
  Phase 8 will replace with watch-list polling
- ExportToExcelUseCase: resolves export dir from StorageOptions, delegates to IExcelExporter

ApplicationModule.AddMarathonApplication(IServiceCollection) — no IConfiguration needed.

Background workers (Marathon.Infrastructure/Workers/):
- UpcomingEventsPoller: Cronos 6-field cron schedule (default every 6 h)
- LiveOddsPoller: fixed interval (WorkerOptions.LivePollIntervalSeconds, default 30 s)
- ResultsWatchListPoller: scaffold, disabled by default (WorkerOptions.ResultsPollerEnabled=false)
All three: exception-swallowing, cancellation-aware, scoped DI via CreateAsyncScope().

InfrastructureModule.AddMarathonInfrastructure(IServiceCollection, IConfiguration):
- Composes AddMarathonPersistence + AddMarathonScraping + WorkerOptions + 3 hosted services

App.xaml.cs: replace reflection-based TryAddApplicationAndInfrastructure with direct
AddMarathonApplication() + AddMarathonInfrastructure(config) calls.

Resolved Phase 3 TODO: bind Sports:Basketball:QuarterMode from config in ScrapingModule.

appsettings.json: add Workers.LivePollIntervalSeconds, ResultsPollIntervalSeconds,
ResultsPollerEnabled; add Sports.Basketball.QuarterMode.

Settings.razor + WorkerOptions (UI) + SharedResource.*.resx: surface new Workers fields.

Tests: +14 Application use-case tests, +3 Infrastructure worker tests (185 → 202 total).
2026-05-05 12:28:15 +03:00

# CLAUDE.md — maraphon-app
> Project memory for Claude Code sessions on this repository. Keep entries concise.
> Per-feature learnings are appended below by the feature-planner workflow.
## Project Overview
**maraphon-app** is a sports betting odds analyzer for marathonbet.by. It scrapes
pre-match (`/su`) and live (`/su/live`) events, persists odds snapshots over time, and
detects anomalies — especially the **odds-flip** pattern (bookmaker freezes bets then
inverts underdog/favorite coefficients).
## Architecture (Clean Architecture, 5 projects + tests)
```
Marathon.Domain ← entities, value objects, no external deps
Marathon.Application ← use cases + abstractions (IOddsScraper, IRepository, ...)
Marathon.Infrastructure ← EF Core (SQLite), scraping (AngleSharp/Playwright), Excel, Polly
Marathon.UI ← Razor Class Library (all Blazor components — host-agnostic)
Marathon.Hosts.WpfBlazor ← WPF + BlazorWebView host (replaceable for ASP.NET Core later)
```
**Key portability invariant:** All UI lives in `Marathon.UI` (Razor Class Library). The
host project (`Marathon.Hosts.WpfBlazor`) is the *only* thing that changes when migrating
to a web app — drop in an ASP.NET Core Blazor Server host that references the same RCL.
## Tech stack
- **.NET 8 LTS**, C# 12
- **EF Core 8** + SQLite (WAL mode)
- **AngleSharp** (HTML), **Playwright for .NET** (SPA fallback)
- **Polly v8** (`Microsoft.Extensions.Http.Resilience`)
- **MudBlazor** components, **Plotly.Blazor** charts
- **Serilog** logging (rolling file + console)
- **xUnit + FluentAssertions + NSubstitute**, in-memory SQLite for repo tests
## Build & test
| Command | Purpose |
|---|---|
| `dotnet build Marathon.sln` | Build all projects |
| `dotnet test Marathon.sln` | Run all tests |
| `dotnet format Marathon.sln --verify-no-changes` | Lint |
| `dotnet run --project src/Marathon.Hosts.WpfBlazor` | Run desktop app |
## Coding conventions
- Nullable reference types: **enabled** (`<Nullable>enable</Nullable>`)
- Implicit usings: enabled
- Treat warnings as errors in `Release` builds
- File-scoped namespaces
- One public type per file (except small DTOs/records grouped in a feature folder)
- Domain entities: prefer `record` for immutable data; class with private setters when
identity matters
- No mutation of domain objects after construction — return new instances
- Repositories return `IReadOnlyList<T>`, not `List<T>` or `IEnumerable<T>` (clarity on
enumeration cost)
- Tests follow `Given_When_Then` or `Should_<expected>_When_<condition>` naming
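As a rough illustration of the immutability and repository conventions above, the sketch below uses hypothetical names (`OddsQuote`, `IOddsQuoteRepository`) that are not taken from the codebase; it only shows the shape the conventions imply.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical immutable record: a "change" returns a new instance via `with`,
// leaving the original untouched.
public sealed record OddsQuote(string Side, decimal Rate)
{
    public OddsQuote WithRate(decimal newRate) => this with { Rate = newRate };
}

// Hypothetical repository abstraction: IReadOnlyList<T> tells the caller the
// result is already materialized and cannot be mutated through this reference.
public interface IOddsQuoteRepository
{
    Task<IReadOnlyList<OddsQuote>> GetByEventAsync(long eventId, CancellationToken cancellationToken);
}
```

A test for `WithRate` under the naming convention above might be called `Should_ReturnNewInstance_When_RateChanges`.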
## Configuration
Every variable parameter is configurable via `appsettings.json` and overridable via
`appsettings.Local.json` (gitignored) or environment variables:
- `Scraping:PollingIntervalSeconds` (default 30)
- `Scraping:MaxConcurrentRequests` (default 4)
- `Scraping:UserAgents[]` (rotated per request)
- `Scraping:RetryPolicy:*` (Polly settings)
- `Scraping:RateLimit:RequestsPerSecond` (default 1)
- `Storage:DatabasePath` (default `./data/marathon.db`)
- `Storage:ExportDirectory` (default `./exports`)
- `Storage:SnapshotRetentionDays` (default 90)
- `Anomaly:SuspensionGapSeconds` (default 60)
- `Anomaly:OddsFlipThreshold` (default 0.30 — implied probability delta)
- `Localization:DefaultCulture` (default `ru-RU`)
A future Settings page in the UI binds to these.
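To make the `Anomaly:OddsFlipThreshold` entry more concrete, here is a minimal sketch of how an implied-probability delta could be compared against the threshold. It rests on assumptions not stated in this file: rates are decimal odds, implied probability is taken as `1 / rate` (bookmaker margin ignored), and a flip means the favorite changed sides while side 1 moved by at least the threshold. `OddsFlipSketch` and its members are hypothetical and do not describe the real detector.

```csharp
using System;

public static class OddsFlipSketch
{
    // Assumption: decimal odds, so implied probability is the reciprocal of the rate.
    public static decimal ImpliedProbability(decimal rate)
    {
        if (rate <= 0m)
            throw new ArgumentOutOfRangeException(nameof(rate), "Rate must be positive.");
        return 1m / rate;
    }

    // Assumption: a flip requires BOTH a change of favorite and a move in
    // side 1's implied probability of at least `threshold`.
    public static bool IsFlip(
        decimal rate1Before, decimal rate2Before,
        decimal rate1After, decimal rate2After,
        decimal threshold = 0.30m)
    {
        decimal p1Before = ImpliedProbability(rate1Before);
        decimal p2Before = ImpliedProbability(rate2Before);
        decimal p1After = ImpliedProbability(rate1After);
        decimal p2After = ImpliedProbability(rate2After);

        bool favoriteChanged = (p1Before > p2Before) != (p1After > p2After);
        bool largeMove = Math.Abs(p1After - p1Before) >= threshold;
        return favoriteChanged && largeMove;
    }
}
```

With the default threshold, odds going from 1.40 / 3.00 to 3.00 / 1.40 move side 1 by about 0.38 and count as a flip under these assumptions, while 1.50 / 2.60 to 2.60 / 1.50 moves it by about 0.28 and does not.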
## Domain model summary
- `Sport(Code, Name)` — e.g., `(6, "Баскетбол")`
- `Event(Id, SportCode, CountryCode, LeagueId, CategoryId, ScheduledAt, EventCodeFromBookmaker)`
- `OddsSnapshot(EventId, CapturedAt, Source: Pre|Live, Bets: List<Bet>)`
- `Bet(Scope: Match|Period[N], Type: Win|Draw|WinFora|Total, Side: 1|2|Less|More, Value?, Rate)`
- `EventResult(EventId, FinalScore, WinnerSide)`
- `Anomaly(EventId, DetectedAt, Kind: SuspensionFlip, Score, EvidenceTimeline)`
## Excel export schema (compliance with customer spec)
The customer's technical specification (TZ) requires a wide-table layout with columns
such as `Bet_Match_Win_1` and `Bet_Period-1_Win_Fora_2_Value`.
**Internal storage is normalized** (one row per Bet in `OddsSnapshots`). The Excel
exporter denormalizes to the wide format on demand. Filename pattern:
```
Marathon_<YYYY-MM-DD>_to_<YYYY-MM-DD>.xlsx
```
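The filename pattern above could be produced along these lines. This is a sketch only: `ExportFileName.Build` is a hypothetical helper, and rejecting a reversed date range is an assumption rather than documented exporter behaviour.

```csharp
using System;
using System.Globalization;

public static class ExportFileName
{
    // Builds "Marathon_<YYYY-MM-DD>_to_<YYYY-MM-DD>.xlsx" for a date range.
    public static string Build(DateOnly from, DateOnly to)
    {
        // Assumption: a reversed range is treated as a caller error.
        if (to < from)
            throw new ArgumentException("End date must not precede start date.", nameof(to));

        // Invariant culture keeps the digits and calendar stable regardless of
        // the UI culture (the app defaults to ru-RU).
        string start = from.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
        string end = to.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
        return $"Marathon_{start}_to_{end}.xlsx";
    }
}
```

The result would typically be combined with `Storage:ExportDirectory` via `Path.Combine`.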
## Recurring Issues & Patterns
- **`dotnet new sln` on .NET 10 SDK produces `.slnx`**, not `.sln`. If the plan
references `Marathon.sln`, hand-craft the traditional format alongside `.slnx`.
- **`Marathon.Application` namespace vs `System.Windows.Application`:** in any WPF
project that references `Marathon.Application`, always write
`System.Windows.Application` fully qualified in `App.xaml.cs`.
- **`Directory.Build.props` must NOT set `TargetFramework`** when projects in the
same solution use different TFMs (e.g., `net8.0` vs `net8.0-windows`).
## Feature: Initial Implementation > Phase 4: Application + Workers — Learnings
- **Two `WorkerOptions` classes coexist** with the same JSON shape but different namespaces:
`Marathon.Infrastructure.Configuration.WorkerOptions` (immutable `init`, used by workers)
and `Marathon.UI.Services.WorkerOptions` (mutable `set`, used by Settings page).
Both bind to `"Workers"` in `appsettings.json`. Keep them in sync when adding new keys.
- **`Microsoft.Extensions.Logging.EventId` conflicts with `Marathon.Domain.ValueObjects.EventId`**
in any project that adds `Microsoft.Extensions.Logging.Abstractions`. Fix with a global alias
in `GlobalUsings.cs`: `global using LogEventId = Microsoft.Extensions.Logging.EventId;`
and local file aliases where both are used together.
- **NSubstitute cannot proxy `sealed` classes.** Use cases are `sealed record` or `sealed class`.
Worker tests must build a real use-case instance backed by substituted interfaces rather than
substituting the use case directly.
- **`BackgroundService` workers are singletons; use cases are scoped.** Always resolve scoped
use cases via `IServiceProvider.CreateAsyncScope()` inside the worker loop — never inject them
directly into the constructor.
- **Cronos 6-field cron format.** Pass `CronFormat.IncludeSeconds` to `CronExpression.Parse`
when the expression has a seconds field (e.g., `"0 0 */6 * * *"`). Without it,
`CronExpression.Parse` expects the standard 5-field format (no seconds).
- **`ApplicationModule.AddMarathonApplication` takes no `IConfiguration`** — the Application
layer has no config bindings of its own. Infrastructure and UI bind their own options sections.
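The Cronos and scoped-DI learnings above can be sketched together as one worker. This is not the actual `UpcomingEventsPoller`: `CronPollerSketch` and the `IPullUpcomingEvents` interface are hypothetical stand-ins (the real use cases are sealed classes), and the sketch assumes the Cronos, `Microsoft.Extensions.Hosting`, and `Microsoft.Extensions.Logging` packages are referenced.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Cronos;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

// Hypothetical stand-in for a scoped use case.
public interface IPullUpcomingEvents
{
    Task ExecuteAsync(CancellationToken cancellationToken);
}

public sealed class CronPollerSketch : BackgroundService
{
    // Six fields (seconds first), so CronFormat.IncludeSeconds is required.
    private static readonly CronExpression Schedule =
        CronExpression.Parse("0 0 */6 * * *", CronFormat.IncludeSeconds);

    private readonly IServiceProvider _services;
    private readonly ILogger<CronPollerSketch> _logger;

    public CronPollerSketch(IServiceProvider services, ILogger<CronPollerSketch> logger)
    {
        _services = services;
        _logger = logger;
    }

    // Cronos expects a UTC DateTime here.
    public static DateTime? NextTick(DateTime utcNow) => Schedule.GetNextOccurrence(utcNow);

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            DateTime now = DateTime.UtcNow;
            DateTime? next = NextTick(now);
            if (next is null) return; // no further occurrences

            try
            {
                await Task.Delay(next.Value - now, stoppingToken);

                // The worker is a singleton; resolve the scoped use case per tick.
                await using AsyncServiceScope scope = _services.CreateAsyncScope();
                var useCase = scope.ServiceProvider.GetRequiredService<IPullUpcomingEvents>();
                await useCase.ExecuteAsync(stoppingToken);
            }
            catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
            {
                break; // graceful shutdown
            }
            catch (Exception ex)
            {
                // Swallow and log so one failed tick does not stop the poller.
                _logger.LogError(ex, "Poll tick failed");
            }
        }
    }
}
```

Creating the scope inside the loop, rather than once in the constructor, keeps scoped dependencies such as a `DbContext` short-lived.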
## Feature: Initial Implementation > Phase 0: Scraping Spike — Learnings
(Permanent learnings about marathonbet.by data shape, anti-bot, page structure.
For full detail see `spike/SCRAPE_FINDINGS.md` and `spike/SCHEMA_DRAFT.md`.)
- **Site is fully SSR (`Server: nginx`).** Anonymous GET with browser User-Agent
returns full HTML for `/su/`, `/su/live`, `/su/popular/<Sport>`,
`/su/betting/<event-path>`. No Cloudflare, no JS challenge.
- **Use HttpClient + AngleSharp + Polly v8.** Playwright is not needed for read-only scraping.
Keep the `Scraping:UsePlaywright = false` flag for future-proofing.
- **Sport ID = `data-sport-treeId` = breadcrumb canonical ID.** Confirmed:
Basketball=6, Football=11, Tennis=22723, Hockey=43658. URL by ID:
`/su/betting/<Sport>+-+<id>` (preferred over `/su/popular/<Sport>` because the
ID is stable).
- **`EventCode` = `data-event-eventId`** (numeric, ~26-million range, stable).
`TreeId` = `data-event-treeId` (URL-routing ID, less stable). Use `EventCode`
as the entity primary key in SQLite.
- **Selection key format:** `{eventId}@{MarketName}{LineIndex?}.{Outcome}`.
Outcomes: `1`/`draw`/`3` for 3-way, `HB_H`/`HB_A` for handicap, `Under_<X>`/
`Over_<X>` for totals. Total threshold is encoded in the outcome string;
handicap value lives in `<span class="middle-simple">` text.
- **Tennis has no Draw outcome.** The domain value behind `Bet_Match_Draw` must be
nullable; the Excel exporter writes an empty cell when it is null.
- **Date parsing:** listing shows `HH:MM` (today) or `DD <ru-month> HH:MM` (future).
Anchor with `initData.serverTime` (Moscow TZ, format `YYYY,MM,DD,HH,MM,SS`)
parsed from the embedded `<script>` blob on every scraped page.
- **Live updates:** the site polls `/su/liveupdate/popular/?treeIds=...` every 3 s, but
the response is just `{"modified":[{"type":"refreshPage"}],...}`, so re-scrape the
full event detail HTML for actual odds. Our analyzer cadence: pre-match 30 s,
live 5–10 s.
- **No public results / archive page** (`/su/results` → 404). Final scores must
be harvested by polling the event detail page until
`eventJsonInfo.matchIsComplete=true`, then storing `resultDescription`. Phase 8
cannot back-fill from a public archive.
- **Period scope vocabulary varies by sport:** football=`1st_Half`, basketball=
`1st_Half`/`1st_Quarter`, tennis=`1st_Set`, hockey=`1st_Period`. Domain stores
`PeriodNumber:int` and a sport-aware `PeriodScopeMapper` resolves the correct
market token at parse time.
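Two of the formats described above, the selection key and `initData.serverTime`, can be parsed roughly as follows. This is a sketch under stated assumptions, not the project's parser: all type names are hypothetical, market names are assumed to contain no `.`, the optional line index is left attached to the market token, total thresholds are assumed to use a decimal point, months in `serverTime` are assumed to be 1-based, and listing times are assumed to be Moscow time (a fixed UTC+3 offset).

```csharp
using System;
using System.Globalization;

// Hypothetical result type; MarketToken keeps {MarketName}{LineIndex?} unsplit.
public sealed record SelectionKey(long EventId, string MarketToken, string Outcome, decimal? TotalThreshold);

public static class SelectionKeyParser
{
    public static SelectionKey Parse(string key)
    {
        int at = key.IndexOf('@');
        // The first '.' after '@' ends the market token; later dots belong to
        // the outcome (e.g. "Under_2.5").
        int dot = at < 0 ? -1 : key.IndexOf('.', at + 1);
        if (at <= 0 || dot < 0 || dot == at + 1 || dot == key.Length - 1)
            throw new FormatException($"Unrecognized selection key: '{key}'");

        long eventId = long.Parse(key[..at], NumberStyles.None, CultureInfo.InvariantCulture);
        string market = key.Substring(at + 1, dot - at - 1);
        string outcome = key[(dot + 1)..];

        // Totals carry their threshold inside the outcome string.
        decimal? threshold = null;
        foreach (string prefix in new[] { "Under_", "Over_" })
        {
            if (outcome.StartsWith(prefix, StringComparison.Ordinal) &&
                decimal.TryParse(outcome[prefix.Length..], NumberStyles.Number,
                    CultureInfo.InvariantCulture, out decimal value))
            {
                threshold = value;
            }
        }
        return new SelectionKey(eventId, market, outcome, threshold);
    }
}

public static class ServerClock
{
    // Moscow has used a fixed UTC+3 offset with no DST since 2014.
    private static readonly TimeSpan MoscowOffset = TimeSpan.FromHours(3);

    // Parses "YYYY,MM,DD,HH,MM,SS"; splitting tolerates unpadded fields.
    public static DateTimeOffset Parse(string serverTime)
    {
        string[] parts = serverTime.Split(',');
        if (parts.Length != 6)
            throw new FormatException($"Expected 6 comma-separated fields: '{serverTime}'");
        int[] n = Array.ConvertAll(parts,
            p => int.Parse(p.Trim(), NumberStyles.None, CultureInfo.InvariantCulture));
        return new DateTimeOffset(n[0], n[1], n[2], n[3], n[4], n[5], MoscowOffset);
    }

    // Resolves a same-day listing time ("HH:MM") against the server's date.
    public static DateTimeOffset ResolveToday(string hhmm, DateTimeOffset serverNow)
    {
        TimeOnly time = TimeOnly.ParseExact(hhmm, "HH:mm", CultureInfo.InvariantCulture);
        return new DateTimeOffset(serverNow.Year, serverNow.Month, serverNow.Day,
            time.Hour, time.Minute, 0, serverNow.Offset);
    }
}
```

For instance, a made-up key such as `26123456@Total_Goals.Under_2.5` would yield event ID `26123456`, market token `Total_Goals`, outcome `Under_2.5`, and threshold `2.5`. The `DD <ru-month> HH:MM` listing form is not covered here.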