perf(scraping): parallel HTTP fan-out, sequential DB persist (HIGH)

The Pull*UseCase implementations issued one HTTP request at a time despite
Scraping:MaxConcurrentRequests=4. With 30–80 live events and ~1s per
fetch, a sequential cycle took roughly 30–80s, so the 5–10s live cadence
target was unreachable and cycles overflowed the configured interval.

* New Marathon.Application.Configuration.ScrapingThrottle bound from the
  shared Scraping:* section. Exposes only MaxConcurrentRequests so the
  Application layer doesn't pull in the Infrastructure-side ScrapingOptions.
* PullLiveOddsUseCase + PullUpcomingEventsUseCase split into two phases:
  - Phase 1 — Parallel.ForEachAsync over the event list with
    MaxDegreeOfParallelism = throttle.MaxConcurrentRequests. The scraper's
    Polly rate limiter still throttles to RequestsPerSecond underneath
    this fan-out, so spikes are smoothed before they hit the bookmaker.
  - Phase 2 — sequential foreach over the (Event, Snapshot) tuples
    captured in Phase 1, doing event upsert + snapshot insert. EF Core
    DbContext is not thread-safe, so DB writes are never issued concurrently.
* InfrastructureModule binds ScrapingThrottle alongside AnomalyOptions.
* Failed snapshot scrapes in Phase 1 mean the event row is also NOT
  persisted in Phase 2 — previously we'd persist the row even when the
  snapshot scrape failed, leaving an orphan event with no odds. Updated
  the regression test accordingly.
* Test fixture exposes TestFixtures.Throttle(maxConcurrentRequests=1) for
  deterministic sequential test runs.
* One existing NSubstitute setup that chained Arg.Is<>() across two
  configurations was rewritten to use a single Arg.Any<>() with inline
  branching — chained matchers were leaking and returning wrong results.
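
The two-phase shape described above can be sketched roughly as follows. This
is a minimal illustration, not the code in this commit: the Event and Snapshot
records, the scrape and persist delegates, and the TwoPhasePullSketch class
are hypothetical stand-ins for the real domain types, scraper, and
repositories. Only Parallel.ForEachAsync and
ParallelOptions.MaxDegreeOfParallelism are taken from the description.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class TwoPhasePullSketch
{
    // Hypothetical stand-ins for the real domain types.
    public sealed record Event(int Id);
    public sealed record Snapshot(int EventId, decimal Odds);

    public static async Task<int> RunAsync(
        IReadOnlyList<Event> events,
        Func<Event, CancellationToken, Task<Snapshot?>> scrape,  // assumed scraper call
        Func<Event, Snapshot, CancellationToken, Task> persist,  // assumed upsert + insert
        int maxConcurrentRequests,
        CancellationToken ct = default)
    {
        // Phase 1: bounded parallel HTTP fan-out. Results go into a
        // thread-safe bag; no database work happens here.
        var captured = new ConcurrentBag<(Event Event, Snapshot Snapshot)>();
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxConcurrentRequests,
            CancellationToken = ct,
        };

        await Parallel.ForEachAsync(events, options, async (ev, token) =>
        {
            try
            {
                var snapshot = await scrape(ev, token);
                if (snapshot is not null)
                {
                    captured.Add((ev, snapshot));
                }
            }
            catch (Exception ex) when (ex is not OperationCanceledException)
            {
                // A failed scrape drops the event for this cycle, so Phase 2
                // never writes an event row without a snapshot.
            }
        });

        // Phase 2: sequential persistence. One write is awaited at a time,
        // so a non-thread-safe context is never used concurrently.
        // ConcurrentBag does not preserve the input order.
        var persisted = 0;
        foreach (var (ev, snapshot) in captured)
        {
            await persist(ev, snapshot, ct);
            persisted++;
        }

        return persisted;
    }
}
```

Passing maxConcurrentRequests = 1 makes Phase 1 effectively sequential, which
is the behaviour the TestFixtures.Throttle default is meant to give tests.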
2026-05-09 15:27:06 +03:00
parent 66ae038243
commit 286b55986b
8 changed files with 177 additions and 53 deletions
@@ -1,6 +1,8 @@
using Marathon.Application.Configuration;
using Marathon.Domain.Entities;
using Marathon.Domain.Enums;
using Marathon.Domain.ValueObjects;
using Microsoft.Extensions.Options;
namespace Marathon.Application.Tests.UseCases;
@@ -42,4 +44,23 @@ internal static class TestFixtures
{
return new EventResult(eventId, 2, 1, Side.Side1, DateTimeOffset.UtcNow);
}
/// <summary>
/// Creates an <see cref="IOptionsMonitor{TOptions}"/> that always returns the given
/// throttle. Use 1 for sequential test behaviour, higher values to exercise fan-out.
/// </summary>
public static IOptionsMonitor<ScrapingThrottle> Throttle(int maxConcurrentRequests = 1) =>
new StaticOptionsMonitor<ScrapingThrottle>(new ScrapingThrottle
{
MaxConcurrentRequests = maxConcurrentRequests,
});
private sealed class StaticOptionsMonitor<T> : IOptionsMonitor<T> where T : class
{
private readonly T _value;
public StaticOptionsMonitor(T value) => _value = value;
public T CurrentValue => _value;
public T Get(string? name) => _value;
public IDisposable? OnChange(Action<T, string?> listener) => null;
}
}