Testing Strategy

The project uses focused unit tests, backend integration tests against PostgreSQL, frontend component tests, frontend production build checks, and AI eval harnesses.

Backend Tests

Run from server/:

uv run pytest
uv run ruff check .

Backend tests cover auth, schemas, JWTs, uploads, combat, encounters, generation helpers, retrieval/chunking, eval metrics, recap extraction, and integration routes.

Integration tests run against a real PostgreSQL database. Keep each test narrowly scoped, and avoid network calls unless the test explicitly targets an integration that requires them.
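
One way to enforce the "no stray network calls" rule is a guard that rejects connections to any host outside an allow-list. The sketch below is illustrative only: `block_external_network` and `ALLOWED_HOSTS` are hypothetical names, not part of this project, and it assumes the test database is reachable on localhost. It only intercepts clients that go through Python's `socket.socket.connect`; drivers that open connections in native code are not affected.

```python
import socket
from contextlib import contextmanager
from unittest import mock

# Assumption: the PostgreSQL instance used by integration tests runs locally.
ALLOWED_HOSTS = {"127.0.0.1", "localhost", "::1"}


@contextmanager
def block_external_network():
    """Raise if code under test connects to a host outside ALLOWED_HOSTS."""
    real_connect = socket.socket.connect

    def guarded_connect(self, address):
        # TCP addresses are tuples starting with the host; Unix-socket
        # addresses are plain strings or bytes and are always allowed.
        host = address[0] if isinstance(address, tuple) else None
        if host is not None and host not in ALLOWED_HOSTS:
            raise RuntimeError(f"Unexpected network call to {host!r}")
        return real_connect(self, address)

    # patch.object restores the original method when the block exits.
    with mock.patch.object(socket.socket, "connect", guarded_connect):
        yield
```

In a pytest suite this could be wrapped in an autouse fixture, with tests that genuinely need an external service opting out explicitly.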

Frontend Checks

Run from web/:

npm run build

The build catches TypeScript and bundling errors. Component tests live beside components and screens where present.

Docs Checks

Run from the root:

npm run docs:build

This catches broken Docusaurus config, malformed MDX, and broken links under strict link checking.

AI Evals

Eval harnesses live in server/evals/. Use them when changing retrieval, generation prompts, citation logic, encounter math, or NPC quality scoring.

Judge-free metrics should stay deterministic. LLM-judge rubrics should be versioned and validated against golden examples.
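
As an illustration of a deterministic judge-free metric, here is a minimal recall@k sketch. The function name and signature are assumptions for this example and may not match what lives in server/evals/. It uses no randomness, clock, or model calls, so the same inputs always produce the same score.

```python
from typing import Sequence


def recall_at_k(retrieved_ids: Sequence[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of relevant ids that appear in the top-k retrieved ids."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant_ids:
        # Nothing to find; define the score as 0.0 instead of dividing by zero.
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)
```

Because the metric is a pure function, a regression test can pin exact values for a handful of golden inputs.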

What to Test for a Change

Change	Minimum verification
Backend route	Targeted pytest plus full backend test run when practical.
Pydantic schema	Schema tests and affected route tests.
Serializer/DTO	Backend serializer tests or route tests plus frontend build.
Frontend component	Component test if behavior is non-trivial plus frontend build.
Database model	Migration upgrade plus affected integration tests.
RAG/generation	Unit tests with fake providers plus relevant eval when useful.
Documentation	`npm run docs:build`.
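
For the RAG/generation row, a unit test with a fake provider might look like the sketch below. `FakeLLMProvider`, its `complete` method, and `summarize_session` are hypothetical stand-ins, not the project's real interfaces. The point is that the fake returns canned text and records the prompts it received, so the test runs without network access or model calls.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLLMProvider:
    """Hypothetical stand-in for a real provider: canned output, recorded prompts."""

    canned_response: str
    prompts: list[str] = field(default_factory=list)

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.canned_response


def summarize_session(provider, notes: str) -> str:
    """Hypothetical generation helper under test."""
    prompt = f"Summarize these session notes:\n{notes}"
    return provider.complete(prompt).strip()


def test_summarize_session_uses_provider_output():
    fake = FakeLLMProvider(canned_response="  The party reached the keep.  ")
    result = summarize_session(fake, "Long notes...")
    # The helper trims whitespace and calls the provider exactly once.
    assert result == "The party reached the keep."
    assert len(fake.prompts) == 1
    assert "Long notes..." in fake.prompts[0]
```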
