maraphon-app/CLAUDE.md
alexei.dolgolyov 2acbaa5b77 feat(phase-4): application layer + background workers — 202/202 tests green
Use cases (Marathon.Application/UseCases/):
- PullUpcomingEventsUseCase: scrape + persist new events + capture pre-match snapshots
- PullLiveOddsUseCase: refresh live snapshots for all stored events
- PullResultsUseCase: Phase 4 scaffold; delegates to ScrapeResultsAsync (Phase 3 no-op);
  Phase 8 will replace with watch-list polling
- ExportToExcelUseCase: resolves export dir from StorageOptions, delegates to IExcelExporter

ApplicationModule.AddMarathonApplication(IServiceCollection) — no IConfiguration needed.

Background workers (Marathon.Infrastructure/Workers/):
- UpcomingEventsPoller: Cronos 6-field cron schedule (default every 6 h)
- LiveOddsPoller: fixed interval (WorkerOptions.LivePollIntervalSeconds, default 30 s)
- ResultsWatchListPoller: scaffold, disabled by default (WorkerOptions.ResultsPollerEnabled=false)
All three: exception-swallowing, cancellation-aware, scoped DI via CreateAsyncScope().

InfrastructureModule.AddMarathonInfrastructure(IServiceCollection, IConfiguration):
- Composes AddMarathonPersistence + AddMarathonScraping + WorkerOptions + 3 hosted services

App.xaml.cs: replace reflection-based TryAddApplicationAndInfrastructure with direct
AddMarathonApplication() + AddMarathonInfrastructure(config) calls.

Resolved Phase 3 TODO: bind Sports:Basketball:QuarterMode from config in ScrapingModule.

appsettings.json: add Workers.LivePollIntervalSeconds, ResultsPollIntervalSeconds,
ResultsPollerEnabled; add Sports.Basketball.QuarterMode.

Settings.razor + WorkerOptions (UI) + SharedResource.*.resx: surface new Workers fields.

Tests: +14 Application use-case tests, +3 Infrastructure worker tests (185 → 202 total).
2026-05-05 12:28:15 +03:00


CLAUDE.md — maraphon-app

Project memory for Claude Code sessions on this repository. Keep entries concise. Per-feature learnings are appended below by the feature-planner workflow.

Project Overview

maraphon-app is a sports betting odds analyzer for marathonbet.by. It scrapes pre-match (/su) and live (/su/live) events, persists odds snapshots over time, and detects anomalies — especially the odds-flip pattern (bookmaker freezes bets then inverts underdog/favorite coefficients).

Architecture (Clean Architecture, 5 projects + tests)

Marathon.Domain          ← entities, value objects, no external deps
   ↑
Marathon.Application     ← use cases + abstractions (IOddsScraper, IRepository, ...)
   ↑
Marathon.Infrastructure  ← EF Core (SQLite), scraping (AngleSharp/Playwright), Excel, Polly
Marathon.UI              ← Razor Class Library (all Blazor components — host-agnostic)
   ↑
Marathon.Hosts.WpfBlazor ← WPF + BlazorWebView host (replaceable for ASP.NET Core later)

Key portability invariant: All UI lives in Marathon.UI (Razor Class Library). The host project (Marathon.Hosts.WpfBlazor) is the only thing that changes when migrating to a web app — drop in an ASP.NET Core Blazor Server host that references the same RCL.

Tech stack

  • .NET 8 LTS, C# 12
  • EF Core 8 + SQLite (WAL mode)
  • AngleSharp (HTML), Playwright for .NET (SPA fallback)
  • Polly v8 (Microsoft.Extensions.Http.Resilience)
  • MudBlazor components, Plotly.Blazor charts
  • Serilog logging (rolling file + console)
  • xUnit + FluentAssertions + NSubstitute, in-memory SQLite for repo tests

Build & test

| Command | Purpose |
| --- | --- |
| dotnet build Marathon.sln | Build all projects |
| dotnet test Marathon.sln | Run all tests |
| dotnet format Marathon.sln --verify-no-changes | Lint |
| dotnet run --project src/Marathon.Hosts.WpfBlazor | Run desktop app |

Coding conventions

  • Nullable reference types: enabled (<Nullable>enable</Nullable>)
  • Implicit usings: enabled
  • Treat warnings as errors in Release builds
  • File-scoped namespaces
  • One public type per file (except small DTOs/records grouped in a feature folder)
  • Domain entities: prefer record for immutable data; class with private setters when identity matters
  • No mutation of domain objects after construction — return new instances
  • Repositories return IReadOnlyList<T>, not List<T> or IEnumerable<T> (clarity on enumeration cost)
  • Tests follow Given_When_Then or Should_<expected>_When_<condition> naming
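
A minimal sketch of the immutability and repository-return conventions above. `OddsQuote` and `QuoteRepositorySketch` are hypothetical names used only for illustration; they are not types from this repository.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical value object for illustration only (not the project's Bet type).
public sealed record OddsQuote(string Side, decimal Rate)
{
    // No mutation after construction: a rate change yields a new instance.
    public OddsQuote WithRate(decimal newRate) => this with { Rate = newRate };
}

public static class QuoteRepositorySketch
{
    // Returns a materialized IReadOnlyList<T>, so callers know that
    // enumerating the result is cheap and repeatable.
    public static IReadOnlyList<OddsQuote> GetAll(IEnumerable<OddsQuote> source) =>
        source.ToList();
}
```

Calling `quote.WithRate(2.10m)` leaves `quote` untouched and returns a second record, which is the behavior the "return new instances" rule asks for.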

Configuration

Every variable parameter is configurable via appsettings.json and overridable via appsettings.Local.json (gitignored) or environment variables:

  • Scraping:PollingIntervalSeconds (default 30)
  • Scraping:MaxConcurrentRequests (default 4)
  • Scraping:UserAgents[] (rotated per request)
  • Scraping:RetryPolicy:* (Polly settings)
  • Scraping:RateLimit:RequestsPerSecond (default 1)
  • Storage:DatabasePath (default ./data/marathon.db)
  • Storage:ExportDirectory (default ./exports)
  • Storage:SnapshotRetentionDays (default 90)
  • Anomaly:SuspensionGapSeconds (default 60)
  • Anomaly:OddsFlipThreshold (default 0.30 — implied probability delta)
  • Localization:DefaultCulture (default ru-RU)

A future Settings page in the UI binds to these.
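
To illustrate how Anomaly:SuspensionGapSeconds and Anomaly:OddsFlipThreshold might work together, here is a hedged sketch. The detection rule itself (gap, then favorite inversion, then implied-probability delta on side 1) is an assumption built from the overview and the defaults listed above, and `TwoWayQuote` and `OddsFlipSketch` are hypothetical names, not the project's detector.

```csharp
using System;

// Hypothetical two-way quote: decimal odds for sides 1 and 2 at a capture time.
public readonly record struct TwoWayQuote(DateTimeOffset CapturedAt, decimal Rate1, decimal Rate2);

public static class OddsFlipSketch
{
    // Implied probability of a decimal rate, ignoring the bookmaker margin.
    public static decimal ImpliedProbability(decimal rate) => 1m / rate;

    // Assumed rule: flag a flip when the snapshots are separated by at least
    // the suspension gap, the favorite changes sides, and side 1's implied
    // probability moves by at least the threshold.
    public static bool IsSuspensionFlip(
        TwoWayQuote before,
        TwoWayQuote after,
        int suspensionGapSeconds = 60,
        decimal oddsFlipThreshold = 0.30m)
    {
        bool suspended =
            (after.CapturedAt - before.CapturedAt).TotalSeconds >= suspensionGapSeconds;

        bool favoriteWasSide1 = before.Rate1 < before.Rate2;
        bool favoriteIsSide1 = after.Rate1 < after.Rate2;
        bool inverted = favoriteWasSide1 != favoriteIsSide1;

        decimal delta = Math.Abs(
            ImpliedProbability(after.Rate1) - ImpliedProbability(before.Rate1));

        return suspended && inverted && delta >= oddsFlipThreshold;
    }
}
```

For example, odds of 1.40 / 2.90 followed 90 seconds later by 3.10 / 1.35 move side 1's implied probability from about 0.71 to about 0.32, which exceeds the default 0.30 threshold.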

Domain model summary

  • Sport(Code, Name) — e.g., (6, "Баскетбол")
  • Event(Id, SportCode, CountryCode, LeagueId, CategoryId, ScheduledAt, EventCodeFromBookmaker)
  • OddsSnapshot(EventId, CapturedAt, Source: Pre|Live, Bets: List<Bet>)
  • Bet(Scope: Match|Period[N], Type: Win|Draw|WinFora|Total, Side: 1|2|Less|More, Value?, Rate)
  • EventResult(EventId, FinalScore, WinnerSide)
  • Anomaly(EventId, DetectedAt, Kind: SuspensionFlip, Score, EvidenceTimeline)

Excel export schema (compliance with customer spec)

The customer's technical specification (TZ) requires a wide-table layout with columns such as Bet_Match_Win_1, Bet_Period-1_Win_Fora_2_Value, etc.

Internal storage is normalized (one row per Bet in OddsSnapshots). The Excel exporter denormalizes to the wide format on demand. Filename pattern:

Marathon_<YYYY-MM-DD>_to_<YYYY-MM-DD>.xlsx
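
The naming rules above can be sketched as follows. Only the column names and the filename pattern quoted in this file are confirmed; the general `Bet_{Scope}_{Type}[_{Side}][_Value]` composition rule and the `ExcelNamingSketch` helper are assumptions inferred from those examples.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper; not the project's IExcelExporter implementation.
public static class ExcelNamingSketch
{
    // Match scope has no period number; period scopes render as "Period-N".
    public static string ScopeToken(int? periodNumber) =>
        periodNumber is null ? "Match" : $"Period-{periodNumber}";

    // Assumed composition: Bet_{Scope}_{Type}[_{Side}][_Value].
    public static string ColumnName(
        int? periodNumber, string type, string? side, bool isValueColumn = false)
    {
        var parts = new List<string> { "Bet", ScopeToken(periodNumber), type };
        if (side is not null) parts.Add(side);
        if (isValueColumn) parts.Add("Value");
        return string.Join("_", parts);
    }

    // Marathon_<YYYY-MM-DD>_to_<YYYY-MM-DD>.xlsx, formatted culture-independently.
    public static string FileName(DateOnly from, DateOnly to) =>
        "Marathon_"
        + from.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)
        + "_to_"
        + to.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)
        + ".xlsx";
}
```

Formatting with the invariant culture keeps the filename stable even though the application's default culture is ru-RU.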

Recurring Issues & Patterns

  • dotnet new sln on .NET 10 SDK produces .slnx, not .sln. If the plan references Marathon.sln, hand-craft the traditional format alongside .slnx.
  • Marathon.Application namespace vs System.Windows.Application: in any WPF project that references Marathon.Application, always write System.Windows.Application fully qualified in App.xaml.cs.
  • Directory.Build.props must NOT set TargetFramework when projects in the same solution use different TFMs (e.g., net8.0 vs net8.0-windows).

Feature: Initial Implementation > Phase 4: Application + Workers — Learnings

  • Two WorkerOptions classes coexist with the same JSON shape but different namespaces: Marathon.Infrastructure.Configuration.WorkerOptions (immutable init, used by workers) and Marathon.UI.Services.WorkerOptions (mutable set, used by Settings page). Both bind to "Workers" in appsettings.json. Keep them in sync when adding new keys.
  • Microsoft.Extensions.Logging.EventId conflicts with Marathon.Domain.ValueObjects.EventId in any project that adds Microsoft.Extensions.Logging.Abstractions. Fix with a global alias in GlobalUsings.cs: global using LogEventId = Microsoft.Extensions.Logging.EventId; and local file aliases where both are used together.
  • NSubstitute cannot proxy sealed classes. Use cases are sealed record or sealed class. Worker tests must build a real use-case instance backed by substituted interfaces rather than substituting the use case directly.
  • BackgroundService workers are singletons; use cases are scoped. Always resolve scoped use cases via IServiceProvider.CreateAsyncScope() inside the worker loop — never inject them directly into the constructor.
  • Cronos 6-field cron format: pass CronFormat.IncludeSeconds to CronExpression.Parse when the expression has a seconds field (e.g., "0 0 */6 * * *"). By default, CronExpression.Parse expects the 5-field format (no seconds).
  • ApplicationModule.AddMarathonApplication takes no IConfiguration — the Application layer has no config bindings of its own. Infrastructure and UI bind their own options sections.
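
The singleton-worker / scoped-use-case rule above can be sketched as a fixed-interval poller. This is a minimal illustration, not the repository's LiveOddsPoller: `ILiveOddsUseCase`, its `ExecuteAsync` method, and `LiveOddsPollerSketch` are hypothetical stand-ins, and the sketch assumes the Microsoft.Extensions.Hosting and Microsoft.Extensions.DependencyInjection packages are referenced.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

// Hypothetical stand-in for a scoped use case; the method name is an assumption.
public interface ILiveOddsUseCase
{
    Task ExecuteAsync(CancellationToken ct);
}

public sealed class LiveOddsPollerSketch : BackgroundService
{
    private readonly IServiceProvider _services;
    private readonly ILogger<LiveOddsPollerSketch> _logger;
    private readonly TimeSpan _interval;

    // Only singleton-safe dependencies are injected; the use case is not.
    public LiveOddsPollerSketch(
        IServiceProvider services,
        ILogger<LiveOddsPollerSketch> logger,
        TimeSpan interval)
    {
        _services = services;
        _logger = logger;
        _interval = interval;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        using var timer = new PeriodicTimer(_interval);
        try
        {
            do
            {
                try
                {
                    // A fresh scope per iteration gives each run its own scoped services.
                    await using var scope = _services.CreateAsyncScope();
                    var useCase = scope.ServiceProvider.GetRequiredService<ILiveOddsUseCase>();
                    await useCase.ExecuteAsync(stoppingToken);
                }
                catch (Exception ex) when (ex is not OperationCanceledException)
                {
                    // Swallow and log so one failed poll does not stop the worker.
                    _logger.LogError(ex, "Live odds poll failed; retrying on the next tick");
                }
            }
            while (await timer.WaitForNextTickAsync(stoppingToken));
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown: the host cancelled the stopping token.
        }
    }
}
```

Because the scope is created inside the loop, anything scoped that the use case depends on (for example an EF Core DbContext) is disposed after every iteration instead of living for the whole process.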

Feature: Initial Implementation > Phase 0: Scraping Spike — Learnings

(Permanent learnings about marathonbet.by data shape, anti-bot, page structure. For full detail see spike/SCRAPE_FINDINGS.md and spike/SCHEMA_DRAFT.md.)

  • Site is fully SSR (Server: nginx). Anonymous GET with browser User-Agent returns full HTML for /su/, /su/live, /su/popular/<Sport>, /su/betting/<event-path>. No Cloudflare, no JS challenge.
  • Use HttpClient + AngleSharp + Polly v8 — no Playwright needed for read-only. Keep Scraping:UsePlaywright = false flag for future-proofing.
  • Sport ID = data-sport-treeId = breadcrumb canonical ID. Confirmed: Basketball=6, Football=11, Tennis=22723, Hockey=43658. URL by ID: /su/betting/<Sport>+-+<id> (preferred over /su/popular/<Sport> because the ID is stable).
  • EventCode = data-event-eventId (numeric, ~26-million range, stable). TreeId = data-event-treeId (URL-routing ID, less stable). Use EventCode as the entity primary key in SQLite.
  • Selection key format: {eventId}@{MarketName}{LineIndex?}.{Outcome}. Outcomes: 1/draw/3 for 3-way, HB_H/HB_A for handicap, Under_<X>/Over_<X> for totals. The total threshold is encoded in the outcome string; the handicap value lives in the <span class="middle-simple"> text.
  • Tennis has no Draw outcome. Domain Bet_Match_Draw must be nullable; Excel exporter writes empty cell when null.
  • Date parsing: listing shows HH:MM (today) or DD <ru-month> HH:MM (future). Anchor with initData.serverTime (Moscow TZ, format YYYY,MM,DD,HH,MM,SS) parsed from the embedded <script> blob on every scraped page.
  • Live updates: the site polls /su/liveupdate/popular/?treeIds=... every 3 s, but the response is just {"modified":[{"type":"refreshPage"}],...}, so re-scrape the full event detail HTML for actual odds. Our analyzer cadence: pre-match 30 s, live 5-10 s.
  • No public results / archive page (/su/results → 404). Final scores must be harvested by polling the event detail page until eventJsonInfo.matchIsComplete=true, then storing resultDescription. Phase 8 cannot back-fill from a public archive.
  • Period scope vocabulary varies by sport: football=1st_Half, basketball=1st_Half/1st_Quarter, tennis=1st_Set, hockey=1st_Period. The domain stores PeriodNumber:int, and a sport-aware PeriodScopeMapper resolves the correct market token at parse time.
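
The date-anchoring rule from the list above can be sketched as follows. Several details are assumptions rather than confirmed findings: that the serverTime month field is 1-based, that months appear as full genitive Russian names, and that a date earlier than the server date belongs to the next year. `ListingDateSketch` is a hypothetical helper; check spike/SCRAPE_FINDINGS.md before relying on any of this.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ListingDateSketch
{
    // Assumption: full genitive month names; the site may abbreviate them.
    private static readonly Dictionary<string, int> RuMonths =
        new(StringComparer.OrdinalIgnoreCase)
        {
            ["января"] = 1, ["февраля"] = 2, ["марта"] = 3, ["апреля"] = 4,
            ["мая"] = 5, ["июня"] = 6, ["июля"] = 7, ["августа"] = 8,
            ["сентября"] = 9, ["октября"] = 10, ["ноября"] = 11, ["декабря"] = 12,
        };

    // Parses initData.serverTime in the "YYYY,MM,DD,HH,MM,SS" form.
    // Assumption: the month field is 1-based.
    public static DateTime ParseServerTime(string raw)
    {
        var parts = raw.Split(',');
        if (parts.Length != 6)
            throw new FormatException($"Expected 6 comma-separated fields: '{raw}'");
        int[] n = Array.ConvertAll(
            parts, s => int.Parse(s.Trim(), CultureInfo.InvariantCulture));
        return new DateTime(n[0], n[1], n[2], n[3], n[4], n[5], DateTimeKind.Unspecified);
    }

    // Resolves "HH:MM" (today) or "DD <ru-month> HH:MM" against the server clock.
    public static DateTime ParseListingTime(string text, DateTime serverNow)
    {
        var tokens = text.Split(' ', StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length != 1 && tokens.Length != 3)
            throw new FormatException($"Unrecognized listing time: '{text}'");

        var time = TimeOnly.ParseExact(tokens[^1], "HH:mm", CultureInfo.InvariantCulture);
        if (tokens.Length == 1)
            return serverNow.Date + time.ToTimeSpan();

        int day = int.Parse(tokens[0], CultureInfo.InvariantCulture);
        if (!RuMonths.TryGetValue(tokens[1], out int month))
            throw new FormatException($"Unknown month name: '{tokens[1]}'");

        var candidate = new DateTime(serverNow.Year, month, day) + time.ToTimeSpan();
        // Assumption: listings never show past dates, so roll into next year.
        return candidate < serverNow.Date ? candidate.AddYears(1) : candidate;
    }
}
```

Both methods return times in the server's (Moscow) wall clock with `DateTimeKind.Unspecified`; converting to UTC for storage is left to the caller.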