# Phase 2: Infrastructure — Storage

**Status:** ✅ Done
**Parent plan:** [PLAN.md](./PLAN.md)
**Domain:** backend

## Objective

Implement persistent storage: EF Core + SQLite (WAL) with migrations, repository implementations of the Application layer's interfaces, and a ClosedXML-based Excel exporter that produces files matching the customer's wide-column spec with date-range filenames.

## Tasks

- [x] Add packages to `Marathon.Infrastructure` (via `Directory.Packages.props`):
  - `Microsoft.EntityFrameworkCore`
  - `Microsoft.EntityFrameworkCore.Sqlite`
  - `Microsoft.EntityFrameworkCore.Design`
  - `ClosedXML`
  - Also added `AngleSharp`, `Polly`, `Microsoft.Extensions.Http.Resilience` for Phase 3 code in the shared csproj
- [x] Add Application-layer abstractions in `Marathon.Application/Abstractions/`:
  - `IRepository` — generic CRUD: `GetAsync`, `ListAsync`, `AddAsync`, `UpdateAsync`, `DeleteAsync`, `SaveChangesAsync`
  - `IEventRepository : IRepository` — adds `ListByDateRangeAsync`, `ListBySportAsync`
  - `ISnapshotRepository : IRepository` — adds `ListByEventAsync(EventId, DateTimeOffset from, DateTimeOffset to)`
  - `IResultRepository : IRepository`
  - `IAnomalyRepository : IRepository`
  - `IExcelExporter` — `ExportAsync(DateRange range, ExportKind kind, string outputPath)` where `ExportKind = PreMatch | Live | Combined`
- [x] Implement `MarathonDbContext` in `Marathon.Infrastructure/Persistence/`:
  - `DbSet`, `DbSet`, `DbSet`, `DbSet`, `DbSet`, `DbSet`, `DbSet`
  - Configure SQLite with WAL (applied as `PRAGMA journal_mode=WAL` at initialization; `Microsoft.Data.Sqlite` has no connection-string keyword for the journal mode)
  - Use `IEntityTypeConfiguration<T>` classes (one per entity in `Configurations/`)
  - Map domain types ↔ EF entities via mapping helpers (don't pollute domain)
  - Indexes: `(EventId)` on `Snapshots` and `Bets`; `(Sport, ScheduledAt)` on `Events`
- [x] Implement `Migrations/InitialCreate` migration (hand-written — `dotnet ef` could not run due to Phase 3 compile errors in the shared Infrastructure project):
  - `src/Marathon.Infrastructure/Migrations/20260505000000_InitialCreate.cs`
  - `src/Marathon.Infrastructure/Migrations/MarathonDbContextModelSnapshot.cs`
  - `src/Marathon.Infrastructure/Persistence/MarathonDbContextFactory.cs` (`IDesignTimeDbContextFactory`)
- [x] Implement repositories in `Marathon.Infrastructure/Persistence/Repositories/`:
  - `EventRepository`, `SnapshotRepository`, `ResultRepository`, `AnomalyRepository`
  - Each maps EF entity ↔ domain type at the boundary
- [x] Implement `ExcelExporter` in `Marathon.Infrastructure/Export/`:
  - Uses ClosedXML
  - Output filename: `Marathon_yyyy-MM-dd_to_yyyy-MM-dd.xlsx`
  - Two sheets: `PreMatch` and `Live` (or only the selected one based on `ExportKind`)
  - Wide columns matching customer spec exactly:
    - Event metadata: `RowNum`, `SportCode`, `Sport`, `Country`, `League`, `Category`, `DateFull`, `Day`, `Month`, `Year`, `Time`, `EventId`
    - Match-level bets: `Bet_Match_Win_1`, `Bet_Match_Draw`, `Bet_Match_Win_2`, `Bet_Match_Win_Fora_1_Value`, `Bet_Match_Win_Fora_1_Rate`, etc.
    - Period-N bets: dynamically generated for max periods seen (`Bet_Period-1_Win_1`, ...)
    - For Live export, prefix with `Live_` instead of `Bet_`
    - Final column: `WinnerSide` (1 or 2 based on lowest pre-match Win rate, per spec §1.2.4 / §2.2.4)
  - `BetRowDenormalizer` helper produces `Dictionary` keyed by spec column names
- [x] Add DI module `PersistenceModule.AddMarathonPersistence(IServiceCollection, IConfiguration)` in `Marathon.Infrastructure/Persistence/PersistenceModule.cs` (NOT `DependencyInjection.cs`) that wires up DbContext + repositories + exporter
- [x] Tests in `Marathon.Infrastructure.Tests`:
  - In-memory SQLite (`Microsoft.Data.Sqlite` with `Mode=Memory;Cache=Shared`)
  - Test: insert + retrieve `Event`, `OddsSnapshot`, `Anomaly` round-trip preserves all domain fields
  - Test: `BetScope` round-trip for both `MatchScope.Instance` and `new PeriodScope(2)`
  - Test: `ExcelExporter` sheet names, headers matching spec, row count, filename pattern
  - Test: WAL pragma executes without error
  - Tests cannot be RUN due to Phase 3 compile errors blocking the Infrastructure project build

## Files to Modify/Create

- `src/Marathon.Application/Abstractions/I*.cs` — repository interfaces
- `src/Marathon.Application/ExportKind.cs`, `DateRange.cs`
- `src/Marathon.Infrastructure/Persistence/MarathonDbContext.cs`
- `src/Marathon.Infrastructure/Persistence/Entities/*.cs`
- `src/Marathon.Infrastructure/Persistence/Configurations/*Configuration.cs`
- `src/Marathon.Infrastructure/Persistence/Repositories/*Repository.cs`
- `src/Marathon.Infrastructure/Persistence/Mapping.cs` — entity ↔ domain
- `src/Marathon.Infrastructure/Export/ExcelExporter.cs`
- `src/Marathon.Infrastructure/Export/BetRowDenormalizer.cs`
- `src/Marathon.Infrastructure/Migrations/*` — EF migrations
- `src/Marathon.Infrastructure/Persistence/PersistenceModule.cs` (DI module; replaces the originally planned `src/Marathon.Infrastructure/DependencyInjection.cs`)
- `tests/Marathon.Infrastructure.Tests/**`

## Acceptance Criteria

- All Infrastructure code compiles (Big Bang: compile-only smoke check OK).
- DbContext + repositories cover all domain types.
- Excel exporter output matches customer spec column names exactly (no typos in `Bet_Match_Win_Fora_1_Value`, hyphens in `Period-1`, etc.).
- Filename includes inclusive date range from event scheduling.

## Notes

- This phase is parallelizable with Phase 3 (Scraping) — they touch disjoint files.
- `ExcelExporter` uses normalized DB data and produces wide columns — DO NOT store data in wide format in SQLite.
- Big Bang: do NOT run full test suite. A `dotnet build` smoke check is acceptable.

## Review Checklist

- [ ] Solution builds (compile-only)
- [ ] Excel column names match customer spec exactly (cross-check against TZ §1.2 / §2.2)
- [ ] Filename pattern matches `Marathon_yyyy-MM-dd_to_yyyy-MM-dd.xlsx`
- [ ] No domain types polluted with EF attributes — mapping is in `Configurations/`
- [ ] WAL mode enabled (`PRAGMA journal_mode=WAL` applied at initialization)

## Handoff to Next Phase

### Status: ✅ Implementation complete — compile errors are Phase 3 bugs (see "Phase 3 bugs" below)

### What Phase 4 must know

**DI Registration:** Call `services.AddMarathonPersistence(configuration)` in the host's DI setup. This is in `Marathon.Infrastructure.Persistence.PersistenceModule` (NOT `DependencyInjection.cs`).

**Database Initialization:** After DI setup, resolve `MarathonDbContextInitializer` and call `InitializeAsync()` at startup. This applies EF migrations and enables `PRAGMA journal_mode=WAL`.
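
The two startup steps above could be wired together roughly as follows. This is a minimal sketch, assuming a .NET generic-host project that references `Marathon.Infrastructure` and the `Microsoft.Extensions.Hosting` package. That `InitializeAsync()` takes no required arguments, and that `MarathonDbContextInitializer` is registered by `AddMarathonPersistence`, are assumptions rather than confirmed signatures.

```csharp
using Marathon.Infrastructure.Persistence;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var builder = Host.CreateApplicationBuilder(args);

// Step 1: register DbContext, repositories, and the exporter.
builder.Services.AddMarathonPersistence(builder.Configuration);

using var host = builder.Build();

// Step 2: apply migrations and enable WAL before any repository is used.
// Resolving inside a scope works whether the initializer is registered
// as scoped, transient, or singleton (the lifetime is an assumption here).
using (var scope = host.Services.CreateScope())
{
    var initializer = scope.ServiceProvider
        .GetRequiredService<MarathonDbContextInitializer>();
    await initializer.InitializeAsync();
}

await host.RunAsync();
```

Running the initializer before `RunAsync()` means hosted services that depend on the repositories never see an unmigrated database.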

**StorageOptions config keys (bind from appsettings.json):**

```
Storage:DatabasePath            (default: ./data/marathon.db)
Storage:ExportDirectory         (default: ./exports)
Storage:SnapshotRetentionDays   (default: 90)
```

**Repository interfaces (all registered as Scoped):**

- `IEventRepository` → `EventRepository`
- `ISnapshotRepository` → `SnapshotRepository`
- `IResultRepository` → `ResultRepository`
- `IAnomalyRepository` → `AnomalyRepository`
- `IExcelExporter` → `ExcelExporter`

**BetScope persistence:** `(Scope INT, PeriodNumber INT?)`:

- `MatchScope.Instance` → `(0, NULL)`
- `new PeriodScope(N)` → `(1, N)`

**ScheduledAt / CapturedAt / CompletedAt / DetectedAt:** all stored as ISO 8601 TEXT with full offset (e.g., `2026-05-05T20:30:00+03:00`). SQLite TEXT comparison sorts these lexicographically, which matches chronological order only when all values carry the same offset; values with different offsets can sort out of order.

**Excel exporter:** filename `Marathon_yyyy-MM-dd_to_yyyy-MM-dd.xlsx`, sheets `PreMatch` / `Live`. Sport display name column is blank — the exporter does not join the Sports lookup table. Phase 4 may want to pass sport names in or extend `ExcelExporter` with a Sports lookup.

**Migrations:** Hand-written in `src/Marathon.Infrastructure/Migrations/20260505000000_InitialCreate.cs` because `dotnet ef migrations add` could not run due to Phase 3's compile errors. When Phase 3 is fixed, remove the hand-written migration and model snapshot, then run `dotnet ef migrations add InitialCreate` to regenerate properly (the command rejects a name that an existing migration already uses).

### Phase 3 bugs that block the full solution build (requires Phase 3 to fix)

1. **`EventId` ambiguity** in `MarathonbetScraper.cs:80` and all `Parsers/*.cs` files: Both `Microsoft.Extensions.Logging.EventId` and `Marathon.Domain.ValueObjects.EventId` are imported.
   Fix: add `using DomainEventId = Marathon.Domain.ValueObjects.EventId;` and replace `EventId` usages in Phase 3 files.
2. **`Configuration.Default` ambiguity** in `EventListingParserBase.cs:37` and `EventOddsParser.cs`: `AngleSharp.Configuration` is shadowed by the `Marathon.Infrastructure.Configuration` namespace.
   Fix: replace `Configuration.Default` with `AngleSharp.Configuration.Default` in Phase 3 files.
3. **`IOddsScraper` interface mismatch** (`CS0535`) in `MarathonbetScraper.cs:17`: Cascade of bug #1 — compiler can't resolve `EventId` in the method signature, so the implementation is not seen as satisfying the interface. Fixing bug #1 resolves this too.
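
The fix for bug #1 can be illustrated in a single file. The sketch below reproduces the `EventId` collision and resolves it with a `using` alias. `LoggingStandIn` and `DomainStandIn` are hypothetical stand-ins for `Microsoft.Extensions.Logging` and `Marathon.Domain.ValueObjects`, and the shapes of the two stand-in `EventId` types are invented for illustration only.

```csharp
using System;
using LoggingStandIn;   // plays the role of Microsoft.Extensions.Logging
using DomainStandIn;    // plays the role of Marathon.Domain.ValueObjects
// The alias gives the domain type an unambiguous name in this file.
using DomainEventId = DomainStandIn.EventId;

// Writing plain `EventId` here would fail with CS0104 (ambiguous reference),
// because both imported namespaces declare a type with that name.
var id = new DomainEventId(12345);
Console.WriteLine(Describe(id)); // prints "domain event 12345"

static string Describe(DomainEventId eventId) => $"domain event {eventId.Value}";

namespace LoggingStandIn
{
    // Hypothetical shape; only the name matters for the collision.
    public readonly record struct EventId(int Id);
}

namespace DomainStandIn
{
    // Hypothetical shape; the real domain type may differ.
    public readonly record struct EventId(long Value);
}
```

An alias keeps both `using` directives in place, so logging calls in the same file continue to compile while every domain usage becomes explicit.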