25489a733a
content/phys/ct-2024.yaml — 15 questions from ЦЭ,ЦТ 2024 across 6 topics (kinem, mol, emf, electro, magnet, optics) as proof of format. backend/scripts/import-content.js — unified importer: - Validates schema (subject, year, options, exactly-1-correct) - Aliases (kinem, mol, ...) resolve to Russian topic names via get-or-create - Deduplicates by first 80 chars of text (matches legacy seed_*.js behavior) - Runs in a single transaction, idempotent re-runs On fresh DB: 13 added (2 dedup collisions — same 80-char prefix, expected). On prod DB: 0 added (all already exist from legacy seeds). Second run on either: 0 added (dedup works). Legacy seed_phys_ct2024.js kept as backup — see content/README.md for migration guide. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Content as data

Question collections live here as YAML, imported via a single CLI.
This replaces the ad-hoc `backend/scripts/seed_phys_ct2024.js` pattern.

## Import command

```sh
cd backend
npm run import:content -- ../content/phys/ct-2024.yaml
```

## File format

```yaml
meta:
  subject: phys            # phys | math | bio | chem
  year: 2024               # exam year (integer)
  source: "ЦЭ,ЦТ 2024"     # optional label shown in import log

topics:
  kinem:                   # topic alias (see aliases below)
    - text: |
        Question text (multi-line supported, LaTeX with \( \) works)
      difficulty: 1        # 1=easy, 2=medium, 3=hard (default: 1)
      explanation: "Solution explanation"   # optional
      options:
        - { text: "Answer A", correct: true }   # exactly ONE correct
        - { text: "Answer B" }
        - { text: "Answer C" }

  "Full topic name":       # or use the full Russian name; it will be found or created
    - text: "..."
      options: [...]
```

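The rules stated in the format comments (allowed subjects, integer year, difficulty 1 to 3, exactly one correct option) can be checked before anything touches the database. The sketch below is an illustration only: `validateCollection` and `SUBJECTS` are hypothetical names, the input is assumed to be the already-parsed YAML object, and the real `import-content.js` may validate differently.

```javascript
// Hypothetical helper, not the importer's actual code.
// Input: the parsed YAML document. Output: a list of error messages.
const SUBJECTS = ["phys", "math", "bio", "chem"];

function validateCollection(doc) {
  const errors = [];
  const meta = (doc && doc.meta) || {};
  if (!SUBJECTS.includes(meta.subject)) {
    errors.push(`meta.subject must be one of: ${SUBJECTS.join(", ")}`);
  }
  if (!Number.isInteger(meta.year)) {
    errors.push("meta.year must be an integer");
  }
  for (const [topic, questions] of Object.entries((doc && doc.topics) || {})) {
    if (!Array.isArray(questions)) {
      errors.push(`${topic}: expected a list of questions`);
      continue;
    }
    questions.forEach((q, i) => {
      const where = `${topic}[${i}]`;
      if (q === null || typeof q !== "object") {
        errors.push(`${where}: expected a question mapping`);
        return;
      }
      if (typeof q.text !== "string" || q.text.trim() === "") {
        errors.push(`${where}: text is required`);
      }
      const difficulty = q.difficulty ?? 1; // format default is 1
      if (![1, 2, 3].includes(difficulty)) {
        errors.push(`${where}: difficulty must be 1, 2, or 3`);
      }
      const options = Array.isArray(q.options) ? q.options : [];
      const correct = options.filter((o) => o && o.correct === true).length;
      if (correct !== 1) {
        errors.push(`${where}: expected exactly 1 correct option, found ${correct}`);
      }
    });
  }
  return errors;
}

// Placeholder data shaped like the format above.
const sample = {
  meta: { subject: "phys", year: 2024 },
  topics: {
    kinem: [
      { text: "Sample question", options: [{ text: "A", correct: true }, { text: "B" }] },
    ],
  },
};
console.log(validateCollection(sample)); // prints []
```

Collecting every error instead of failing on the first one lets an author fix a whole file in a single pass.
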
## Topic aliases (subject=phys)

| Alias   | Topic name                 |
|---------|----------------------------|
| kinem   | Кинематика                 |
| dynam   | Динамика                   |
| cons    | Законы сохранения          |
| mol     | Молекулярная физика        |
| thermo  | Термодинамика              |
| electro | Электростатика             |
| dc      | Постоянный ток             |
| magnet  | Магнетизм                  |
| emf     | Электромагнитная индукция  |
| optics  | Оптика                     |
| quantum | Квантовая и ядерная физика |
| waves   | Колебания и волны          |

For other topic names, use the full Russian name as the key; the importer
looks it up in the database (case-insensitive) or creates a new topic.

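The lookup described above (alias first, then a case-insensitive match on the name, then create) can be sketched as follows. The alias map is copied from the table; `resolveTopic`, the in-memory `topics` array standing in for the database table, and the way ids are assigned are all assumptions made for illustration.

```javascript
// Alias map copied from the table above (subject=phys).
const PHYS_ALIASES = {
  kinem: "Кинематика",
  dynam: "Динамика",
  cons: "Законы сохранения",
  mol: "Молекулярная физика",
  thermo: "Термодинамика",
  electro: "Электростатика",
  dc: "Постоянный ток",
  magnet: "Магнетизм",
  emf: "Электромагнитная индукция",
  optics: "Оптика",
  quantum: "Квантовая и ядерная физика",
  waves: "Колебания и волны",
};

// Hypothetical get-or-create. `topics` is an array of { id, name } that
// stands in for the real topics table.
function resolveTopic(topics, key) {
  const isAlias = Object.prototype.hasOwnProperty.call(PHYS_ALIASES, key);
  const name = (isAlias ? PHYS_ALIASES[key] : key).trim();
  const wanted = name.toLowerCase();
  const existing = topics.find((t) => t.name.toLowerCase() === wanted);
  if (existing) return existing;
  // Assumed id scheme for the sketch; a real database would assign the id.
  const created = { id: topics.length + 1, name };
  topics.push(created);
  return created;
}

const topics = [{ id: 1, name: "Кинематика" }];
console.log(resolveTopic(topics, "kinem").id);      // prints 1 (alias hit)
console.log(resolveTopic(topics, "КИНЕМАТИКА").id); // prints 1 (case-insensitive match)
console.log(resolveTopic(topics, "Астрономия").id); // prints 2 (created)
```
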
## Dedup logic

A question is skipped when the first 80 characters of its text match a
question that already exists in the database for the same subject. This
matches the legacy `seed_phys_*.js` behavior and makes re-runs idempotent.

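A minimal sketch of the prefix check, assuming the existing texts for one subject have already been loaded into memory. `filterNewQuestions` and `PREFIX_LENGTH` are hypothetical names, and the sketch compares the raw text without trimming or normalizing; the real importer may differ on both points.

```javascript
// Hypothetical dedup step, not the importer's actual code.
const PREFIX_LENGTH = 80;

function dedupKey(text) {
  // slice() counts UTF-16 code units, which is one unit per Cyrillic letter.
  return text.slice(0, PREFIX_LENGTH);
}

// existingTexts: question texts already stored for the same subject.
// incoming: questions from the YAML file, each with a `text` field.
function filterNewQuestions(existingTexts, incoming) {
  const seen = new Set(existingTexts.map(dedupKey));
  const added = [];
  let skipped = 0;
  for (const question of incoming) {
    const key = dedupKey(question.text);
    if (seen.has(key)) {
      skipped += 1;
      continue;
    }
    seen.add(key); // later questions in the same file collide with this one
    added.push(question);
  }
  return { added, skipped };
}

const sharedPrefix = "x".repeat(80);
const result = filterNewQuestions([], [
  { text: sharedPrefix + " variant A" },
  { text: sharedPrefix + " variant B" }, // same first 80 characters
  { text: "A different question" },
]);
console.log(result.added.length, result.skipped); // prints 2 1
```

Because only the prefix is compared, two distinct questions that open with the same 80 characters count as duplicates, so long shared preambles are worth rewording.
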
## Migrating a legacy seed_*.js

1. Copy the file structure from `content/phys/ct-2024.yaml`
2. Convert each `q(T.kinem, text, opts, diff, year, expl)` call to YAML:
   - `T.kinem` → `topics: kinem:`
   - `text` → `text: |` (use literal block for multi-line)
   - `opts: [{t: "...", c: true}, ...]` → `options: [{text: "...", correct: true}, ...]`
   - `diff` → `difficulty:`
   - `year` → `meta: year:` (set once per file)
   - `expl` → `explanation:`
3. Run `npm run import:content -- ../content/<subject>/<file>.yaml`
4. Verify that the output shows the expected `added` count
5. Keep the legacy `seed_*.js` file as a backup until the import is verified

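The field mapping in step 2 is mechanical, so it can be scripted. The sketch below is a hypothetical converter, not something shipped in the repository: `toQuestion` and `toCollection` are invented names, and it assumes the arguments of the legacy `q(...)` calls have already been collected into plain objects with an `alias` field. It prints JSON, which YAML 1.2 parsers generally accept, although hand-formatted YAML with literal blocks is easier to review.

```javascript
// Hypothetical converter from legacy q(...) arguments to the YAML structure.
function toQuestion({ text, opts, diff, expl }) {
  const question = {
    text,
    difficulty: diff ?? 1, // the format defaults difficulty to 1
    // Legacy { t, c } becomes { text, correct }; `correct` is omitted when falsy.
    options: opts.map((o) => (o.c ? { text: o.t, correct: true } : { text: o.t })),
  };
  if (expl) question.explanation = expl; // explanation is optional
  return question;
}

function toCollection(subject, year, calls) {
  const topics = {};
  for (const call of calls) {
    if (!topics[call.alias]) topics[call.alias] = [];
    topics[call.alias].push(toQuestion(call));
  }
  return { meta: { subject, year }, topics };
}

// Placeholder data standing in for one legacy call.
const legacyCalls = [
  {
    alias: "kinem",
    text: "Sample question?",
    opts: [{ t: "Answer A", c: true }, { t: "Answer B" }],
    diff: 2,
    expl: "Sample explanation",
  },
];
const collection = toCollection("phys", 2024, legacyCalls);
console.log(JSON.stringify(collection, null, 2));
```
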
## Collections

| File              | Subject | Year | Source     | Questions         |
|-------------------|---------|------|------------|-------------------|
| phys/ct-2024.yaml | Physics | 2024 | ЦЭ,ЦТ 2024 | 13 (proof subset) |