AI and Sage
The Sage is the app's AI layer for rules chat, campaign prep, document
understanding, recap extraction, NPC generation, loot, tables, and encounter
suggestions. Most of it lives under server/app/lib/rag/.
Configuration
Important settings are defined in server/app/config.py:
| Setting | Purpose |
|---|---|
| ANTHROPIC_API_KEY | Enables Anthropic-backed generation and parsing. |
| GENERATION_PROVIDER | auto, anthropic, or fake. |
| GENERATION_MODEL | Model used for rules/chat generation. |
| PARSING_MODEL | Model used for Anthropic vision PDF parsing. |
| VOYAGE_API_KEY | Enables hosted Voyage embeddings. |
| EMBEDDINGS_PROVIDER | auto, voyage, local, or fake. |
| LANGFUSE_* | Optional production tracing. |
| LANGSMITH_* | Optional LangGraph tracing and Studio support. |
Tests use deterministic fake providers where possible.
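The source does not say how the auto value is resolved. The sketch below shows one plausible rule, purely as an assumption: auto picks the hosted provider when its API key is set and falls back otherwise. The function name resolve_provider and its parameters are hypothetical and do not come from server/app/config.py.

```python
from typing import Optional


def resolve_provider(
    setting: str,
    api_key: Optional[str],
    hosted: str,
    fallback: str,
) -> str:
    """Turn a provider setting into a concrete provider name.

    Assumption: 'auto' means "use the hosted provider if a key is
    configured, otherwise use the fallback". Explicit values win.
    """
    if setting != "auto":
        return setting
    return hosted if api_key else fallback
```

Under this assumption, a test environment with no keys set would resolve to the fallback without any extra configuration.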
Document Ingestion
Document ingestion is implemented in server/app/lib/rag/ingest.py:
- Extract Markdown from an uploaded PDF or Markdown file.
- Chunk Markdown with chunk_markdown.
- Optionally generate an overview chunk for campaign documents.
- Embed chunks.
- Replace old chunks for the document.
- Best-effort extract monster stat blocks.
- Mark the document indexed or failed.
Core-rule files are private under server/storage/. Campaign uploads are scoped
to their campaign. The Process action reuses saved extracted Markdown and
skips source parsing.
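The ordering of the ingestion steps matters mostly for failure handling. The sketch below is a simplified illustration, not the code in ingest.py: the Doc class, the ingest function, and the injected callables are all hypothetical. It shows a failure in chunking or embedding marking the document failed while leaving old chunks in place, and a failure in best-effort stat block extraction leaving the document indexed.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Doc:
    """Hypothetical stand-in for a stored document row."""
    markdown: str
    status: str = "pending"
    chunks: List[str] = field(default_factory=list)


def ingest(
    doc: Doc,
    chunk: Callable[[str], List[str]],
    embed: Callable[[List[str]], List[List[float]]],
    extract_stat_blocks: Callable[[str], None],
) -> Doc:
    try:
        new_chunks = chunk(doc.markdown)
        embed(new_chunks)
        # Replace old chunks only after chunking and embedding succeed.
        doc.chunks = new_chunks
    except Exception:
        doc.status = "failed"
        return doc

    try:
        # Best-effort: a stat block failure must not fail the document.
        extract_stat_blocks(doc.markdown)
    except Exception:
        pass

    doc.status = "indexed"
    return doc
```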
Retrieval
Retrieval is in server/app/lib/rag/retrieve.py.
It combines:
- Dense vector search through pgvector.
- PostgreSQL full-text search.
- Reciprocal Rank Fusion.
- Reranking.
- Query alias expansion for common D&D vocabulary gaps.
Scope is enforced in SQL. A campaign can retrieve its own campaign documents and allowed core rules: required books, plus optional books opted into by the campaign. Tool-only books are mined for structured data but are not retrieved as chat sources.
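Reciprocal Rank Fusion merges the dense and full-text result lists using ranks only, so the two scoring scales never need to be comparable. A minimal sketch follows. The function name and the constant k = 60 (a commonly used default) are assumptions, and the fusion in retrieve.py may well be done in SQL instead.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def reciprocal_rank_fusion(rankings: Iterable[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of chunk ids; each hit contributes 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


dense = ["a", "b", "c"]      # hypothetical vector search order
lexical = ["b", "d", "a"]    # hypothetical full-text search order
fused = reciprocal_rank_fusion([dense, lexical])
```

Here "b" ends up first because it ranks near the top of both lists, while "c" and "d" each appear in only one.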
Rules Chat
The non-streaming rules path in server/app/lib/rag/chat.py follows:
load history -> retrieve sources -> inject played-session recaps -> generate
-> guardrail citations -> persist user and assistant turns
The guardrail keeps an answer only if it cites valid sources; otherwise the Sage abstains.
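To make the guardrail step concrete, here is a minimal sketch under stated assumptions: citations are written as bracketed 1-based indices such as [2], and an answer is kept only when it has at least one citation and every citation points at a retrieved source. The marker format, the abstention message, and the function name are all hypothetical.

```python
import re

ABSTAIN = "I could not find that in the available sources."  # hypothetical wording
_CITATION = re.compile(r"\[(\d+)\]")


def guard_citations(answer: str, source_count: int) -> str:
    """Return the answer if its citations are valid, else an abstention."""
    cited = {int(n) for n in _CITATION.findall(answer)}
    if not cited:
        return ABSTAIN  # uncited answers are never kept
    if any(n < 1 or n > source_count for n in cited):
        return ABSTAIN  # a citation points at a source that was not retrieved
    return answer
```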
The streaming path wraps the LangGraph graph and forwards custom stream events to the SSE endpoint.
LangGraph Chat Modes
The graph lives in server/app/lib/rag/graph.py and is exposed to LangGraph
Studio through server/langgraph.json.
router -> rules | prep | recap
The router resolves an explicit mode if provided; otherwise it lets the generator classify the turn. The streaming adapter resolves mode before opening the trace so Langfuse tags cover the whole operation.
Current event types include source lists, answer deltas, done/error markers, dice rolls, stat blocks, encounter cards, NPC cards, loot cards, and rollable tables.
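The router's decision can be sketched without any LangGraph machinery. The function below is illustrative only: resolve_mode and the classify callable are hypothetical names, and falling back to rules when the classifier returns something unexpected is an assumption rather than documented behavior.

```python
from typing import Callable, Optional

MODES = ("rules", "prep", "recap")


def resolve_mode(
    explicit: Optional[str],
    message: str,
    classify: Callable[[str], str],
) -> str:
    """Prefer an explicit mode; otherwise ask the generator to classify."""
    if explicit in MODES:
        return explicit
    guessed = classify(message)
    # Assumption: unknown classifier output falls back to rules chat.
    return guessed if guessed in MODES else "rules"
```

Resolving the mode in one small function like this is also what would let a streaming adapter compute it before opening a trace.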
Tools and Prep Outputs
Rules mode can short-circuit into tools:
- roll_dice
- get_stat_block
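As an illustration of why a dice tool is worth short-circuiting to, the sketch below rolls standard NdM+K notation in plain code instead of asking the model for a number. It is not the project's roll_dice implementation: the function name roll, the supported notation, and the injectable random generator are assumptions.

```python
import random
import re
from typing import Optional

_DICE = re.compile(r"^(\d+)d(\d+)([+-]\d+)?$")


def roll(notation: str, rng: Optional[random.Random] = None) -> int:
    """Roll dice written like '2d6+3' and return the total."""
    match = _DICE.match(notation.replace(" ", "").lower())
    if match is None:
        raise ValueError(f"unsupported dice notation: {notation!r}")
    count, sides = int(match.group(1)), int(match.group(2))
    modifier = int(match.group(3) or 0)
    if count < 1 or sides < 1:
        raise ValueError("dice count and sides must be at least 1")
    rng = rng or random.Random()
    return sum(rng.randint(1, sides) for _ in range(count)) + modifier
```

Passing a seeded random.Random makes the roll reproducible in tests.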
Prep mode classifies planning requests and dispatches to specialized builders:
- Encounter suggestions: deterministic difficulty math plus LLM selection.
- NPC generation: optionally grounded in retrieved campaign context.
- Loot generation: deterministic treasure budget plus LLM contents.
- Rollable tables: LLM table contents normalized into roll ranges.
The pattern is deliberate: deterministic math and validation stay in code; judgment, naming, and flavor can come from the model.
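The rollable-table case shows the split well: the model can supply entries, and code turns them into contiguous roll ranges so that no face of the die is skipped or covered twice. The sketch below assumes each entry arrives with an integer weight, which is a hypothetical input shape; to_roll_ranges is not a name from the codebase.

```python
from typing import List, Tuple


def to_roll_ranges(entries: List[Tuple[str, int]]) -> List[Tuple[int, int, str]]:
    """Map (label, weight) pairs onto contiguous 1-based roll ranges."""
    ranges: List[Tuple[int, int, str]] = []
    low = 1
    for label, weight in entries:
        if weight < 1:
            raise ValueError(f"weight for {label!r} must be at least 1")
        high = low + weight - 1
        ranges.append((low, high, label))
        low = high + 1
    return ranges


table = to_roll_ranges([("Goblins", 3), ("Wolves", 2), ("Nothing", 1)])
# table covers 1-3, 4-5, and 6, so it can be rolled on a d6.
```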
Recap Extraction
Played-session recaps can produce reviewed Codex changesets. The extractor proposes operations for story entities, NPCs, and relationships. DMs review and apply or dismiss the changes before campaign state changes.
Long-running extraction work must not block the event loop. Blocking model calls
are offloaded with asyncio.to_thread.
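A minimal sketch of that offloading pattern follows. The function blocking_extract is a hypothetical stand-in for a synchronous model call and returns placeholder data; only asyncio.to_thread itself (Python 3.9+) is taken from the text.

```python
import asyncio
import time


def blocking_extract(recap: str) -> dict:
    """Stand-in for a synchronous model call (hypothetical)."""
    time.sleep(0.05)
    return {"operations": [], "recap_length": len(recap)}


async def extract_changeset(recap: str) -> dict:
    # Runs the blocking call in a worker thread so the event loop
    # can keep serving other requests while it waits.
    return await asyncio.to_thread(blocking_extract, recap)


result = asyncio.run(extract_changeset("The party reached the keep."))
```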
AI Access
AI features are gated server-side. Effective access is:
user.aiAccess OR user.isAdmin
Do not rely only on hidden frontend controls. Add backend guards to every new AI endpoint.
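A backend guard for that rule can be very small. The sketch below is framework-agnostic and hypothetical: the User dataclass, AIAccessDenied, and require_ai_access are illustrative names, and a real endpoint would translate the exception into whatever forbidden response the server framework uses.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical user record with the two flags named above."""
    aiAccess: bool = False
    isAdmin: bool = False


class AIAccessDenied(Exception):
    """Raised when a user without AI access calls an AI endpoint."""


def has_ai_access(user: User) -> bool:
    return user.aiAccess or user.isAdmin


def require_ai_access(user: User) -> None:
    # Call this at the top of every AI endpoint handler.
    if not has_ai_access(user):
        raise AIAccessDenied("AI features are not enabled for this account")
```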