feat(initial-implementation): phase 0 - scraping spike findings

Anonymous scraping confirmed feasible for marathonbet.by — site is fully SSR
(nginx), no Cloudflare or JS challenge. HttpClient + AngleSharp + Polly v8 is
sufficient; Playwright not required (kept as a future-flag).

Spike outputs:
- spike/SCRAPE_FINDINGS.md  — page rendering, URL templates, anti-bot, rate
  limits, recommended scraping strategy for Phase 3.
- spike/SCHEMA_DRAFT.md     — customer-spec field → DOM selector mapping for
  Match + Period-N scope across football/basketball/tennis (hockey TBD).

Phase 1+ handoff captured in subplan + CLAUDE.md. Critical Phase 8 finding:
no public results endpoint at /su/results — phase 8 must switch to polling
event-detail until eventJsonInfo.matchIsComplete=true (deviation flagged).

Reviewer notes addressed:
- Period market outcome codes corrected to RN_H/RN_D/RN_A (not 1/draw/3) and
  market name vocabulary clarified per-sport in SCHEMA_DRAFT §3.1.
- results-page.html capture added to file list with caveat about live-landing
  score-state and unsampled hockey selectors.
This commit is contained in:
2026-05-05 01:04:03 +03:00
parent 8802ddb25b
commit 070e34b911
6 changed files with 864 additions and 25 deletions
+2 -2
View File
@@ -34,7 +34,7 @@ parameter configurable.
## Phases
- [ ] Phase 0: Scraping spike (research, throwaway) [domain: backend] → [subplan](./phase-0-scraping-spike.md)
- [x] Phase 0: Scraping spike (research, throwaway) [domain: backend] → [subplan](./phase-0-scraping-spike.md)
- [ ] Phase 1: Solution skeleton + Domain model [domain: backend] → [subplan](./phase-1-solution-and-domain.md)
- [ ] Phase 2: Infrastructure — Storage [domain: backend] → [subplan](./phase-2-storage.md)
- [ ] Phase 3: Infrastructure — Scraping [domain: backend] → [subplan](./phase-3-scraping.md)
@@ -62,7 +62,7 @@ parameter configurable.
| Phase | Domain | Status | Review | Build | Committed |
|---|---|---|---|---|---|
| Phase 0: Scraping spike | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
| Phase 0: Scraping spike | backend | ✅ Done | ⬜ Pending review | ⏭️ N/A (research) | ⬜ |
| Phase 1: Solution + Domain | backend | ⬜ Not Started | ⬜ | ⬜ | ⬜ |
| Phase 2: Storage | backend | ⬜ Not Started | ⬜ | ⏭️ Big Bang | ⬜ |
| Phase 3: Scraping | backend | ⬜ Not Started | ⬜ | ⏭️ Big Bang | ⬜ |