
Testing Strategy

The project uses focused unit tests, backend integration tests against PostgreSQL, frontend component tests, frontend production build checks, and AI eval harnesses.

Backend Tests

Run from server/:

uv run pytest
uv run ruff check .

Backend tests cover auth, schemas, JWTs, uploads, combat, encounters, generation helpers, retrieval/chunking, eval metrics, recap extraction, and integration routes.

Integration tests use a real PostgreSQL database. Keep them scoped and avoid network calls unless the test explicitly targets an integration requiring them.
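
One way to enforce the "no network calls" rule is a guard that rejects outbound connections to anything except local hosts. The sketch below is an assumption, not the project's actual fixture: `block_network` and its default host list are hypothetical names, and the patch only covers sockets opened from Python code. Database drivers that open connections in native code are not affected by it.

```python
import socket
from contextlib import contextmanager
from unittest.mock import patch

# Hypothetical default: only loopback addresses are allowed.
LOCAL_HOSTS = frozenset({"127.0.0.1", "::1", "localhost"})


@contextmanager
def block_network(allowed_hosts=LOCAL_HOSTS):
    """Reject Python-level socket connections to non-allowed hosts."""
    real_connect = socket.socket.connect

    def guarded_connect(self, address):
        # TCP addresses are tuples like (host, port); Unix socket paths
        # are strings and are passed through untouched.
        if isinstance(address, tuple) and address[0] not in allowed_hosts:
            raise RuntimeError(f"network call blocked in tests: {address!r}")
        return real_connect(self, address)

    # patch.object restores the original method when the block exits.
    with patch.object(socket.socket, "connect", guarded_connect):
        yield
```

A test that genuinely needs an external service can skip the guard or pass a wider `allowed_hosts` set, which keeps the exception explicit in the test itself.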

Frontend Checks

Run from web/:

npm run build

The build catches TypeScript and bundling errors. Where component tests exist, they live beside the components and screens they cover.

Docs Checks

Run from the root:

npm run docs:build

This catches a broken Docusaurus config, malformed MDX, and broken links under strict link checking.

AI Evals

Eval harnesses live in server/evals/. Use them when changing retrieval, generation prompts, citation logic, encounter math, or NPC quality scoring.

Judge-free metrics should stay deterministic. LLM-judge rubrics should be versioned and validated against golden examples.
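
As an illustration of both points, here is a minimal sketch of a deterministic judge-free metric and a check that a versioned rubric agrees with golden examples. All names (`recall_at_k`, `Rubric`, `GoldenExample`, `rubric_agreement`) are hypothetical and do not describe the contents of server/evals/; the judge is passed in as a plain callable so it can be stubbed in tests.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


def recall_at_k(retrieved: Sequence[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant IDs found in the top-k retrieved IDs.

    Pure function of its inputs: no randomness, clock, or I/O.
    """
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


@dataclass(frozen=True)
class Rubric:
    name: str
    version: str  # bump whenever the prompt text changes
    prompt: str


@dataclass(frozen=True)
class GoldenExample:
    text: str
    expected_score: int


def rubric_agreement(
    judge: Callable[[Rubric, str], int],
    rubric: Rubric,
    goldens: Sequence[GoldenExample],
) -> float:
    """Share of golden examples where the judge returns the expected score."""
    if not goldens:
        raise ValueError("at least one golden example is required")
    matches = sum(1 for g in goldens if judge(rubric, g.text) == g.expected_score)
    return matches / len(goldens)
```

A rubric change could then be gated on `rubric_agreement` staying above a threshold the team chooses, with results recorded per `Rubric.version` so scores from different rubric versions are never compared directly.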

What to Test for a Change

| Change | Minimum verification |
| --- | --- |
| Backend route | Targeted pytest plus full backend test run when practical. |
| Pydantic schema | Schema tests and affected route tests. |
| Serializer/DTO | Backend serializer tests or route tests plus frontend build. |
| Frontend component | Component test if behavior is non-trivial plus frontend build. |
| Database model | Migration upgrade plus affected integration tests. |
| RAG/generation | Unit tests with fake providers plus relevant eval when useful. |
| Documentation | npm run docs:build. |
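
For the RAG/generation case, a unit test with a fake provider might look like the sketch below. The `TextProvider` protocol, `FakeProvider`, and `generate_npc_summary` are hypothetical stand-ins rather than the project's real interfaces; the point is that the fake returns a canned reply and records the prompts it received, so the test needs no network access or model.

```python
from typing import Protocol


class TextProvider(Protocol):
    """Hypothetical interface for anything that completes a prompt."""

    def complete(self, prompt: str) -> str: ...


class FakeProvider:
    """Returns a canned reply and records every prompt it was given."""

    def __init__(self, reply: str) -> None:
        self.reply = reply
        self.prompts: list[str] = []

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


def generate_npc_summary(provider: TextProvider, name: str, context: list[str]) -> str:
    """Hypothetical generation helper: build a prompt, return the trimmed reply."""
    prompt = f"Describe the NPC {name}.\nContext:\n" + "\n".join(context)
    return provider.complete(prompt).strip()


def test_generate_npc_summary_uses_context() -> None:
    fake = FakeProvider("  A wary innkeeper.  ")
    result = generate_npc_summary(fake, "Mira", ["Runs the Gilded Stag."])
    # The helper trims the reply and makes a single provider call.
    assert result == "A wary innkeeper."
    assert len(fake.prompts) == 1
    # Retrieved context must reach the prompt.
    assert "Runs the Gilded Stag." in fake.prompts[0]
```

Asserting on the recorded prompt checks the prompt-building logic directly, while output quality is left to the eval harnesses.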