Commit Graph

6 Commits

Author SHA1 Message Date
alexei.dolgolyov 286b55986b perf(scraping): parallel HTTP fan-out, sequential DB persist (HIGH)
The Pull*UseCase implementations issued one HTTP request at a time despite
Scraping:MaxConcurrentRequests=4. With 30–80 live events and ~1s per
fetch, a 5–10s live cadence target was unreachable; cycles overflowed
the configured interval.

* New Marathon.Application.Configuration.ScrapingThrottle bound from the
  shared Scraping:* section. Exposes only MaxConcurrentRequests so the
  Application layer doesn't pull in the Infrastructure-side ScrapingOptions.
* PullLiveOddsUseCase + PullUpcomingEventsUseCase split into two phases:
  - Phase 1 — Parallel.ForEachAsync over the event list with
    MaxDegreeOfParallelism = throttle.MaxConcurrentRequests. The scraper's
    Polly rate limiter still throttles to RequestsPerSecond underneath
    this fan-out, so spikes are smoothed before they hit the bookmaker.
  - Phase 2 — sequential foreach over the (Event, Snapshot) tuples
    captured in Phase 1, doing event upsert + snapshot insert. EF Core
    DbContext is not thread-safe so all DB writes stay on a single thread.
* InfrastructureModule binds ScrapingThrottle alongside AnomalyOptions.
* Failed snapshot scrapes in Phase 1 mean the event row is also NOT
  persisted in Phase 2 — previously we'd persist the row even when the
  snapshot scrape failed, leaving an orphan event with no odds. Updated
  the regression test accordingly.
* Test fixture exposes TestFixtures.Throttle(maxConcurrentRequests=1) for
  deterministic sequential test runs.
* One existing NSubstitute setup that chained Arg.Is<>() across two
  configurations was rewritten to use a single Arg.Any<>() with inline
  branching — chained matchers were leaking and returning wrong results.
2026-05-09 15:27:06 +03:00
alexei.dolgolyov 66ae038243 perf(detect-anomalies): batch snapshot loads into a single query (HIGH)
DetectAnomaliesUseCase was issuing one ISnapshotRepository.ListByEventAsync
call per event each cycle, with each call rehydrating that event's bets via
Include(s => s.Bets) — O(N) SQLite round-trips and N Include payloads on
every detection cycle.

* Add ISnapshotRepository.ListByEventsAsync(IReadOnlyCollection<EventId>, …)
  returning a per-event dictionary; events with no snapshots in range get
  Array.Empty<OddsSnapshot>() so the caller doesn't need a presence check.
* Implementation uses a single .Where(s => ids.Contains(s.EventCode))
  query and groups in memory.
* DetectAnomaliesUseCase loads the whole batch once before the foreach,
  then ProcessEventAsync receives the per-event slice as a parameter.
* Tests updated to stub the new method; per-event-failure test now
  exercises an AddAsync throw rather than a snapshot-load throw, since
  individual snapshot loads no longer fail per-event.
2026-05-09 15:17:49 +03:00
alexei.dolgolyov 9c5d3df1f2 feat(phase-8-backend): per-event results harvesting + EventPath plumbing
Implements Phase 8 Amendment 1: marathonbet.by has no public results archive
endpoint, so results must be harvested per-event by re-fetching the event
detail page until eventJsonInfo.matchIsComplete=true.

Backend changes:

* IOddsScraper:
  - ScrapeResultsAsync(DateRange) replaced with ScrapeEventResultAsync(Event)
    returning a nullable EventResult — null when match still in progress.
  - ScrapeEventOddsAsync now takes the full Event (so EventPath drives URL
    construction) instead of bare EventId.
  - New ScrapeLiveAsync() for the /su/live listing.

* Domain:
  - Event gains EventPath (nullable string) — the data-event-path attribute
    captured during scraping; required for reliable URL construction.

* Infrastructure:
  - New migration 20260506000000_AddEventPath adds the column.
  - EventEntity / EventConfiguration / Mapping / model-snapshot updated.
  - MarathonbetScraper: new ScrapeLiveAsync + ScrapeEventResultAsync; URL
    builder prefers EventPath, falls back to numeric ID for legacy rows.
  - EventListingParserBase extracts data-event-path on every listing row.

* Application:
  - PullResultsUseCase: branches on selection vs date-range, emits IProgress<
    PullResultsProgress>, returns ResultLoadOutcome (Loaded / AlreadyLoaded /
    NotYetComplete / Failed); idempotent (skips events whose result already
    exists).
  - PullLiveOddsUseCase now drives off the live listing (auto-discovers
    events that go live without ever appearing in the upcoming list) and
    backfills EventPath on legacy rows.
  - PullUpcomingEventsUseCase wires EventPath on persisted events.

* Workers: UpcomingEventsPoller updates persistence path accordingly.

* Tests: 17 net-new tests across Application + Infrastructure + Domain;
  all 293 still pass.
2026-05-09 15:10:27 +03:00
alexei.dolgolyov a6ff368015 feat(phase-7-backend): implement anomaly detection — SuspensionFlip detector, use case, poller, and tests
- AnomalyDetector (pure domain): detects odds-flip pattern from live snapshot
  timelines using implied-probability vectors (p=1/rate, normalised), flip score
  = max(|p_post−p_pre|), gated by both threshold AND favourite-changed test
- SuspensionInterval record: typed pair of (pre, post) OddsSnapshot bracketing a gap
- AnomalyOptions POCO (Application layer): bound to Anomaly:* config section with
  four fields (SuspensionGapSeconds=60, OddsFlipThreshold=0.30, MinSnapshotCount=3,
  DetectionIntervalSeconds=60)
- DetectAnomaliesUseCase: iterates all events, loads last-24h live snapshots, runs
  detector, persists new anomalies with 1-minute dedup window
- AnomalyDetectionPoller: BackgroundService polling every DetectionIntervalSeconds,
  gated by WorkerOptions.AnomalyDetectionEnabled (default true)
- DI wiring: DetectAnomaliesUseCase registered Scoped in ApplicationModule;
  AnomalyOptions bound + AnomalyDetectionPoller hosted in InfrastructureModule
- WorkerOptions.AnomalyDetectionEnabled added; appsettings.json updated
- 13 domain tests + 4 application tests; total 245/245 passing (no regression)
2026-05-05 13:15:50 +03:00
alexei.dolgolyov 2acbaa5b77 feat(phase-4): application layer + background workers — 202/202 tests green
Use cases (Marathon.Application/UseCases/):
- PullUpcomingEventsUseCase: scrape + persist new events + capture pre-match snapshots
- PullLiveOddsUseCase: refresh live snapshots for all stored events
- PullResultsUseCase: Phase 4 scaffold; delegates to ScrapeResultsAsync (Phase 3 no-op);
  Phase 8 will replace with watch-list polling
- ExportToExcelUseCase: resolves export dir from StorageOptions, delegates to IExcelExporter

ApplicationModule.AddMarathonApplication(IServiceCollection) — no IConfiguration needed.

Background workers (Marathon.Infrastructure/Workers/):
- UpcomingEventsPoller: Cronos 6-field cron schedule (default every 6 h)
- LiveOddsPoller: fixed interval (WorkerOptions.LivePollIntervalSeconds, default 30 s)
- ResultsWatchListPoller: scaffold, disabled by default (WorkerOptions.ResultsPollerEnabled=false)
All three: exception-swallowing, cancellation-aware, scoped DI via CreateAsyncScope().

InfrastructureModule.AddMarathonInfrastructure(IServiceCollection, IConfiguration):
- Composes AddMarathonPersistence + AddMarathonScraping + WorkerOptions + 3 hosted services

App.xaml.cs: replace reflection-based TryAddApplicationAndInfrastructure with direct
AddMarathonApplication() + AddMarathonInfrastructure(config) calls.

Resolved Phase 3 TODO: bind Sports:Basketball:QuarterMode from config in ScrapingModule.

appsettings.json: add Workers.LivePollIntervalSeconds, ResultsPollIntervalSeconds,
ResultsPollerEnabled; add Sports.Basketball.QuarterMode.

Settings.razor + WorkerOptions (UI) + SharedResource.*.resx: surface new Workers fields.

Tests: +14 Application use-case tests, +3 Infrastructure worker tests (185 → 202 total).
2026-05-05 12:28:15 +03:00
alexei.dolgolyov 61114ea31b feat: implement Phase 1 — solution skeleton and domain model
Creates the 9-project .NET 8 solution (5 src + 4 test) with Marathon.Domain
fully implemented: value objects (SportCode, EventId, OddsRate, OddsValue,
BetScope hierarchy), enums (Side, BetType, OddsSource, AnomalyKind), and
entities (Sport, Country, League, Event, Bet, OddsSnapshot, EventResult,
Anomaly) with all invariants enforced in constructors. 96 domain tests pass
(FluentAssertions + xUnit). Directory.Build.props and Directory.Packages.props
centralise build settings and NuGet versions. Both Marathon.sln and Marathon.slnx
are committed; dotnet build Marathon.sln succeeds with 0 warnings/errors.
2026-05-05 01:20:28 +03:00