The Pull*UseCase implementations issued one HTTP request at a time despite
Scraping:MaxConcurrentRequests=4. With 30–80 live events and ~1s per
fetch, a 5–10s live cadence target was unreachable; cycles overflowed
the configured interval.
* New Marathon.Application.Configuration.ScrapingThrottle bound from the
shared Scraping:* section. Exposes only MaxConcurrentRequests so the
Application layer doesn't pull in the Infrastructure-side ScrapingOptions.
* PullLiveOddsUseCase + PullUpcomingEventsUseCase split into two phases:
- Phase 1 — Parallel.ForEachAsync over the event list with
MaxDegreeOfParallelism = throttle.MaxConcurrentRequests. The scraper's
Polly rate limiter still throttles to RequestsPerSecond underneath
this fan-out, so spikes are smoothed before they hit the bookmaker.
- Phase 2 — sequential foreach over the (Event, Snapshot) tuples
captured in Phase 1, doing event upsert + snapshot insert. EF Core
DbContext is not thread-safe so all DB writes stay on a single thread.
* InfrastructureModule binds ScrapingThrottle alongside AnomalyOptions.
* Failed snapshot scrapes in Phase 1 mean the event row is also NOT
persisted in Phase 2 — previously we'd persist the row even when the
snapshot scrape failed, leaving an orphan event with no odds. Updated
the regression test accordingly.
* Test fixture exposes TestFixtures.Throttle(maxConcurrentRequests=1) for
deterministic sequential test runs.
* One existing NSubstitute setup that chained Arg.Is<>() across two
configurations was rewritten to use a single Arg.Any<>() with inline
branching — chained matchers were leaking and returning wrong results.
DetectAnomaliesUseCase was issuing one ISnapshotRepository.ListByEventAsync
call per event each cycle, with each call rehydrating that event's bets via
Include(s => s.Bets) — O(N) SQLite round-trips and N Include payloads on
every detection cycle.
* Add ISnapshotRepository.ListByEventsAsync(IReadOnlyCollection<EventId>, …)
returning a per-event dictionary; events with no snapshots in range get
Array.Empty<OddsSnapshot>() so the caller doesn't need a presence check.
* Implementation uses a single .Where(s => ids.Contains(s.EventCode))
query and groups in memory.
* DetectAnomaliesUseCase loads the whole batch once before the foreach,
then ProcessEventAsync receives the per-event slice as a parameter.
* Tests updated to stub the new method; per-event-failure test now
exercises an AddAsync throw rather than a snapshot-load throw, since
individual snapshot loads no longer fail per-event.
Implements Phase 8 Amendment 1: marathonbet.by has no public results archive
endpoint, so results must be harvested per-event by re-fetching the event
detail page until eventJsonInfo.matchIsComplete=true.
Backend changes:
* IOddsScraper:
- ScrapeResultsAsync(DateRange) replaced with ScrapeEventResultAsync(Event)
returning a nullable EventResult — null when match still in progress.
- ScrapeEventOddsAsync now takes the full Event (so EventPath drives URL
construction) instead of bare EventId.
- New ScrapeLiveAsync() for the /su/live listing.
* Domain:
- Event gains EventPath (nullable string) — the data-event-path attribute
captured during scraping; required for reliable URL construction.
* Infrastructure:
- New migration 20260506000000_AddEventPath adds the column.
- EventEntity / EventConfiguration / Mapping / model-snapshot updated.
- MarathonbetScraper: new ScrapeLiveAsync + ScrapeEventResultAsync; URL
builder prefers EventPath, falls back to numeric ID for legacy rows.
- EventListingParserBase extracts data-event-path on every listing row.
* Application:
- PullResultsUseCase: branches on selection vs date-range, emits IProgress<
PullResultsProgress>, returns ResultLoadOutcome (Loaded / AlreadyLoaded /
NotYetComplete / Failed); idempotent (skips events whose result already
exists).
- PullLiveOddsUseCase now drives off the live listing (auto-discovers
events that go live without ever appearing in the upcoming list) and
backfills EventPath on legacy rows.
- PullUpcomingEventsUseCase wires EventPath on persisted events.
* Workers: UpcomingEventsPoller updates persistence path accordingly.
* Tests: 17 net-new tests across Application + Infrastructure + Domain;
all 293 still pass.
- AnomalyDetector (pure domain): detects odds-flip pattern from live snapshot
timelines using implied-probability vectors (p=1/rate, normalised), flip score
= max(|p_post−p_pre|), gated by both threshold AND favourite-changed test
- SuspensionInterval record: typed pair of (pre, post) OddsSnapshot bracketing a gap
- AnomalyOptions POCO (Application layer): bound to Anomaly:* config section with
four fields (SuspensionGapSeconds=60, OddsFlipThreshold=0.30, MinSnapshotCount=3,
DetectionIntervalSeconds=60)
- DetectAnomaliesUseCase: iterates all events, loads last-24h live snapshots, runs
detector, persists new anomalies with 1-minute dedup window
- AnomalyDetectionPoller: BackgroundService polling every DetectionIntervalSeconds,
gated by WorkerOptions.AnomalyDetectionEnabled (default true)
- DI wiring: DetectAnomaliesUseCase registered Scoped in ApplicationModule;
AnomalyOptions bound + AnomalyDetectionPoller hosted in InfrastructureModule
- WorkerOptions.AnomalyDetectionEnabled added; appsettings.json updated
- 13 domain tests + 4 application tests; total 245/245 passing (no regression)