# Phase 2: Infrastructure — Storage
**Status:** ⬜ Not Started
**Parent plan:** [PLAN.md](./PLAN.md)
**Domain:** backend
## Objective
Implement persistent storage: EF Core + SQLite (WAL) with migrations, repository
implementations of the Application layer's interfaces, and a ClosedXML-based Excel
exporter that produces files matching the customer's wide-column spec with date-range
filenames.
## Tasks
- [ ] Add packages to `Marathon.Infrastructure` (via `Directory.Packages.props`):
  - `Microsoft.EntityFrameworkCore`
  - `Microsoft.EntityFrameworkCore.Sqlite`
  - `Microsoft.EntityFrameworkCore.Design`
  - `ClosedXML`
- [ ] Add Application-layer abstractions in `Marathon.Application/Abstractions/`:
  - `IRepository<TKey, TEntity>` — generic CRUD: `GetAsync`, `ListAsync`,
    `AddAsync`, `UpdateAsync`, `DeleteAsync`, `SaveChangesAsync`
  - `IEventRepository : IRepository<EventId, Event>` — adds `ListByDateRangeAsync`,
    `ListBySportAsync`
  - `ISnapshotRepository : IRepository<Guid, OddsSnapshot>` — adds
    `ListByEventAsync(EventId, DateTimeOffset from, DateTimeOffset to)`
  - `IResultRepository : IRepository<EventId, EventResult>`
  - `IAnomalyRepository : IRepository<Guid, Anomaly>`
  - `IExcelExporter`: `ExportAsync(DateRange range, ExportKind kind, string outputPath)`,
    where `ExportKind = PreMatch | Live | Combined`
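
  The abstractions listed above could be declared roughly as follows. This is a minimal sketch, not the project's actual contract: the placeholder shapes of `EventId`, `Event`, and `DateRange`, the `Task`-based return types, the parameter lists of the two `ListBy...` methods, and the optional `CancellationToken` parameters are all assumptions added for illustration.

  ```csharp
  #nullable enable
  using System;
  using System.Collections.Generic;
  using System.Threading;
  using System.Threading.Tasks;

  // Placeholder domain types so the sketch compiles on its own (assumed shapes;
  // the real types live in the Domain layer and may differ).
  public readonly record struct EventId(string Value);
  public sealed record Event(EventId Id, string Sport, DateTimeOffset ScheduledAt);
  public readonly record struct DateRange(DateOnly From, DateOnly To);
  public enum ExportKind { PreMatch, Live, Combined }

  // Generic CRUD contract with the six members named in the task.
  public interface IRepository<TKey, TEntity> where TEntity : class
  {
      Task<TEntity?> GetAsync(TKey id, CancellationToken ct = default);
      Task<IReadOnlyList<TEntity>> ListAsync(CancellationToken ct = default);
      Task AddAsync(TEntity entity, CancellationToken ct = default);
      Task UpdateAsync(TEntity entity, CancellationToken ct = default);
      Task DeleteAsync(TKey id, CancellationToken ct = default);
      Task<int> SaveChangesAsync(CancellationToken ct = default);
  }

  // Event-specific queries layered on top of the generic contract.
  public interface IEventRepository : IRepository<EventId, Event>
  {
      Task<IReadOnlyList<Event>> ListByDateRangeAsync(DateRange range, CancellationToken ct = default);
      Task<IReadOnlyList<Event>> ListBySportAsync(string sport, CancellationToken ct = default);
  }

  // Export contract; the first three parameters follow the task description.
  public interface IExcelExporter
  {
      Task ExportAsync(DateRange range, ExportKind kind, string outputPath, CancellationToken ct = default);
  }
  ```

  The remaining repositories (`ISnapshotRepository`, `IResultRepository`, `IAnomalyRepository`) would follow the same pattern as `IEventRepository`.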
- [ ] Implement `MarathonDbContext` in `Marathon.Infrastructure/Persistence/`:
  - `DbSet<EventEntity>`, `DbSet<SnapshotEntity>`, `DbSet<BetEntity>`,
    `DbSet<EventResultEntity>`, `DbSet<AnomalyEntity>`, `DbSet<SportEntity>`,
    `DbSet<LeagueEntity>`
  - Enable SQLite WAL mode when the database is opened (`PRAGMA journal_mode=WAL`);
    `Microsoft.Data.Sqlite` connection strings have no journal-mode keyword
  - Use `IEntityTypeConfiguration<T>` classes (one per entity in `Configurations/`)
  - Map domain types ↔ EF entities via mapping helpers (don't pollute domain)
  - Indexes: `(EventId)` on `Snapshots` and `Bets`; `(Sport, ScheduledAt)` on `Events`
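
  As an illustration of the configuration-class and index bullets above, one entity could be wired up as follows. This is a sketch under assumptions: the `EventEntity` properties, the `Events` table name, and the `SqliteWal` helper are hypothetical, and only one `DbSet` is shown. It needs the `Microsoft.EntityFrameworkCore.Sqlite` package.

  ```csharp
  using System;
  using Microsoft.EntityFrameworkCore;
  using Microsoft.EntityFrameworkCore.Metadata.Builders;

  // Hypothetical persistence entity; the real column set is up to the implementer.
  public sealed class EventEntity
  {
      public string Id { get; set; } = "";
      public string Sport { get; set; } = "";
      public DateTimeOffset ScheduledAt { get; set; }
  }

  // One configuration class per entity keeps EF details out of the domain types.
  public sealed class EventEntityConfiguration : IEntityTypeConfiguration<EventEntity>
  {
      public void Configure(EntityTypeBuilder<EventEntity> builder)
      {
          builder.ToTable("Events");
          builder.HasKey(e => e.Id);
          // Composite index (Sport, ScheduledAt), as listed in the task.
          builder.HasIndex(e => new { e.Sport, e.ScheduledAt });
      }
  }

  public sealed class MarathonDbContext : DbContext
  {
      public MarathonDbContext(DbContextOptions<MarathonDbContext> options) : base(options) { }

      public DbSet<EventEntity> Events => Set<EventEntity>();

      // Picks up every IEntityTypeConfiguration<T> defined in this assembly.
      protected override void OnModelCreating(ModelBuilder modelBuilder) =>
          modelBuilder.ApplyConfigurationsFromAssembly(typeof(MarathonDbContext).Assembly);
  }

  public static class SqliteWal
  {
      // Switches the database to write-ahead logging by issuing the pragma directly.
      public static void Enable(DbContext context) =>
          context.Database.ExecuteSqlRaw("PRAGMA journal_mode=WAL;");
  }
  ```

  `SqliteWal.Enable` would typically be called once at startup, after migrations have been applied.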
- [ ] Generate the `InitialCreate` migration under `Migrations/` with the EF Core CLI:
  ```
  dotnet ef migrations add InitialCreate --project src/Marathon.Infrastructure
  ```
- [ ] Implement repositories in `Marathon.Infrastructure/Persistence/Repositories/`:
  - `EventRepository`, `SnapshotRepository`, `ResultRepository`, `AnomalyRepository`
  - Each maps EF entity ↔ domain type at the boundary
- [ ] Implement `ExcelExporter` in `Marathon.Infrastructure/Export/`:
  - Uses ClosedXML
  - Output filename: `Marathon_<from yyyy-MM-dd>_to_<to yyyy-MM-dd>.xlsx`
  - Two sheets: `PreMatch` and `Live` (or only the selected one based on `ExportKind`)
  - Wide columns matching customer spec exactly:
    - Event metadata: `RowNum`, `SportCode`, `Sport`, `Country`, `League`, `Category`,
      `DateFull`, `Day`, `Month`, `Year`, `Time`, `EventId`
    - Match-level bets: `Bet_Match_Win_1`, `Bet_Match_Draw`, `Bet_Match_Win_2`,
      `Bet_Match_Win_Fora_1_Value`, `Bet_Match_Win_Fora_1_Rate`, etc.
    - Period-N bets: dynamically generated for max periods seen (`Bet_Period-1_Win_1`, ...)
    - For Live export, prefix with `Live_` instead of `Bet_`
    - Final column: `WinnerSide` (1 or 2 based on lowest pre-match Win rate, per spec
      §1.2.4 / §2.2.4)
  - Implement a `BetRowDenormalizer` helper that takes a `List<Bet>` and produces a
    flat `Dictionary<string, object?>` keyed by spec column names.
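
  The filename pattern and the denormalization step can be sketched without ClosedXML. The `Bet` shape below (`Scope`, `Market`, `Rate`) and the `ExportNaming` class are hypothetical, introduced only to show how spec-style column names such as `Bet_Period-1_Win_1` might be assembled; the real domain type and column rules come from the customer spec.

  ```csharp
  #nullable enable
  using System;
  using System.Collections.Generic;
  using System.Globalization;

  // Hypothetical flattened bet: Scope is "Match" or "Period-N", Market is e.g. "Win_1".
  public sealed record Bet(string Scope, string Market, decimal Rate);

  public static class ExportNaming
  {
      // Builds Marathon_<from>_to_<to>.xlsx with invariant yyyy-MM-dd dates.
      public static string BuildFileName(DateOnly from, DateOnly to) =>
          string.Format(CultureInfo.InvariantCulture,
              "Marathon_{0:yyyy-MM-dd}_to_{1:yyyy-MM-dd}.xlsx", from, to);
  }

  public static class BetRowDenormalizer
  {
      // Flattens normalized bets into one wide row keyed by column name.
      // Live exports use the "Live_" prefix instead of "Bet_".
      public static Dictionary<string, object?> Denormalize(List<Bet> bets, bool live = false)
      {
          var prefix = live ? "Live" : "Bet";
          var row = new Dictionary<string, object?>(StringComparer.Ordinal);
          foreach (var bet in bets)
          {
              row[$"{prefix}_{bet.Scope}_{bet.Market}"] = bet.Rate;
          }
          return row;
      }
  }
  ```

  For example, `ExportNaming.BuildFileName(new DateOnly(2024, 5, 1), new DateOnly(2024, 5, 7))` yields `Marathon_2024-05-01_to_2024-05-07.xlsx`.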
- [ ] Add a DI extension `AddMarathonInfrastructure(IServiceCollection, IConfiguration)`
in `Marathon.Infrastructure/DependencyInjection.cs` that wires up DbContext +
repositories + exporter using `IConfiguration` for `Storage:DatabasePath` and
`Storage:ExportDirectory`.
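
  A registration extension along these lines would satisfy the task. It is a sketch only: the stand-in types exist so the block compiles alone, `ExportOptions` is a hypothetical carrier for the export directory, and the scoped lifetime is an assumption. It needs the `Microsoft.EntityFrameworkCore.Sqlite` package.

  ```csharp
  #nullable enable
  using System;
  using Microsoft.EntityFrameworkCore;
  using Microsoft.Extensions.Configuration;
  using Microsoft.Extensions.DependencyInjection;

  // Stand-ins for types produced by the earlier tasks (assumed shapes).
  public sealed class MarathonDbContext : DbContext
  {
      public MarathonDbContext(DbContextOptions<MarathonDbContext> options) : base(options) { }
  }
  public interface IEventRepository { }
  public sealed class EventRepository : IEventRepository
  {
      public EventRepository(MarathonDbContext db) { }
  }
  // Hypothetical options object carrying the export directory to the exporter.
  public sealed record ExportOptions(string ExportDirectory);

  public static class DependencyInjection
  {
      public static IServiceCollection AddMarathonInfrastructure(
          this IServiceCollection services, IConfiguration configuration)
      {
          // Fail fast when a required setting is missing.
          var databasePath = configuration["Storage:DatabasePath"]
              ?? throw new InvalidOperationException("Storage:DatabasePath is not configured.");
          var exportDirectory = configuration["Storage:ExportDirectory"]
              ?? throw new InvalidOperationException("Storage:ExportDirectory is not configured.");

          services.AddDbContext<MarathonDbContext>(o => o.UseSqlite($"Data Source={databasePath}"));
          services.AddScoped<IEventRepository, EventRepository>();
          services.AddSingleton(new ExportOptions(exportDirectory));
          return services;
      }
  }
  ```

  The other repositories and the exporter would be registered the same way as `IEventRepository`.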
- [ ] Tests in `Marathon.Infrastructure.Tests`:
  - In-memory SQLite (`Microsoft.Data.Sqlite` with `Mode=Memory;Cache=Shared`)
  - Test: insert + retrieve `Event`, `OddsSnapshot`, `Anomaly` round-trip preserves all
    domain fields
  - Test: `ExcelExporter` generates a workbook with the expected sheet names, headers
    matching spec, and row count matching event count
  - Test: filename pattern matches `Marathon_yyyy-MM-dd_to_yyyy-MM-dd.xlsx`
  - Test: WAL mode is enabled after open (run against a temporary file-based database;
    in-memory SQLite databases cannot switch to WAL)
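
  The test setup could be supported by small helpers like the ones below. The `SqliteTestDatabases` class and its method names are hypothetical; only the `Microsoft.Data.Sqlite` calls are real API. A named shared-cache in-memory database exists only while at least one connection to it stays open, so the fixture should hold the returned connection for the duration of the test.

  ```csharp
  using System;
  using Microsoft.Data.Sqlite;

  public static class SqliteTestDatabases
  {
      // Opens a named shared-cache in-memory database. Keep the connection open:
      // the database is discarded when the last connection to it closes.
      public static SqliteConnection OpenInMemory(string name = "marathon-tests")
      {
          var connection = new SqliteConnection($"Data Source={name};Mode=Memory;Cache=Shared");
          connection.Open();
          return connection;
      }

      // Opens (and creates if needed) a file-based database, required for WAL checks.
      public static SqliteConnection OpenFile(string path)
      {
          var connection = new SqliteConnection($"Data Source={path}");
          connection.Open();
          return connection;
      }

      // Requests WAL and returns the journal mode SQLite reports afterwards.
      public static string EnableWal(SqliteConnection connection)
      {
          using var command = connection.CreateCommand();
          command.CommandText = "PRAGMA journal_mode=WAL;";
          return (string)command.ExecuteScalar()!;
      }
  }
  ```

  A WAL test would open a database under a temporary path with `OpenFile`, call `EnableWal`, and assert that the returned mode is `wal`.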
## Files to Modify/Create
- `src/Marathon.Application/Abstractions/I*.cs` — repository interfaces
- `src/Marathon.Application/ExportKind.cs`, `DateRange.cs`
- `src/Marathon.Infrastructure/Persistence/MarathonDbContext.cs`
- `src/Marathon.Infrastructure/Persistence/Entities/*.cs`
- `src/Marathon.Infrastructure/Persistence/Configurations/*Configuration.cs`
- `src/Marathon.Infrastructure/Persistence/Repositories/*Repository.cs`
- `src/Marathon.Infrastructure/Persistence/Mapping.cs` — entity ↔ domain
- `src/Marathon.Infrastructure/Export/ExcelExporter.cs`
- `src/Marathon.Infrastructure/Export/BetRowDenormalizer.cs`
- `src/Marathon.Infrastructure/Migrations/*` — EF migrations
- `src/Marathon.Infrastructure/DependencyInjection.cs`
- `tests/Marathon.Infrastructure.Tests/**`
## Acceptance Criteria
- All Infrastructure code compiles (Big Bang: compile-only smoke check OK).
- DbContext + repositories cover all domain types.
- Excel exporter output matches customer spec column names exactly (no typos in
`Bet_Match_Win_Fora_1_Value`, hyphens in `Period-1`, etc.).
- The export filename encodes the inclusive date range (first and last scheduled date) of the exported events.
## Notes
- This phase is parallelizable with Phase 3 (Scraping) — they touch disjoint files.
- `ExcelExporter` uses normalized DB data and produces wide columns — DO NOT store
data in wide format in SQLite.
- Big Bang: do NOT run full test suite. A `dotnet build` smoke check is acceptable.
## Review Checklist
- [ ] Solution builds (compile-only)
- [ ] Excel column names match customer spec exactly (cross-check against TZ §1.2 / §2.2)
- [ ] Filename pattern matches `Marathon_yyyy-MM-dd_to_yyyy-MM-dd.xlsx`
- [ ] No domain types carry EF attributes; EF mapping lives in `Configurations/` and entity ↔ domain conversion in `Mapping.cs`
- [ ] WAL mode enabled when the database is opened (`PRAGMA journal_mode` reports `wal`)
## Handoff to Next Phase
<!-- Filled by Phase 2 implementer. -->