AI and Sage

The Sage is the app's AI layer for rules chat, campaign prep, document understanding, recap extraction, NPC generation, loot, tables, and encounter suggestions. Most of it lives under server/app/lib/rag/.

Configuration

Important settings are defined in server/app/config.py:

Setting              Purpose
ANTHROPIC_API_KEY    Enables Anthropic-backed generation and parsing.
GENERATION_PROVIDER  auto, anthropic, or fake.
GENERATION_MODEL     Model used for rules/chat generation.
PARSING_MODEL        Model used for Anthropic vision PDF parsing.
VOYAGE_API_KEY       Enables hosted Voyage embeddings.
EMBEDDINGS_PROVIDER  auto, voyage, local, or fake.
LANGFUSE_*           Optional production tracing.
LANGSMITH_*          Optional LangGraph tracing and Studio support.

Tests use deterministic fake providers where possible.
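
One plausible reading of the auto value for GENERATION_PROVIDER is sketched below: use Anthropic when an API key is configured, and fall back to the deterministic fake provider otherwise. The helper name and the fallback rule are assumptions for illustration; the actual resolution logic in server/app/config.py may differ.

```python
from typing import Optional

ALLOWED_PROVIDERS = {"auto", "anthropic", "fake"}


def resolve_generation_provider(setting: str, api_key: Optional[str]) -> str:
    """Hypothetical helper: turn GENERATION_PROVIDER into a concrete provider."""
    if setting not in ALLOWED_PROVIDERS:
        raise ValueError(f"unknown GENERATION_PROVIDER: {setting!r}")
    if setting == "auto":
        # Assumed rule: prefer Anthropic when a key is present, else fake.
        return "anthropic" if api_key else "fake"
    if setting == "anthropic" and not api_key:
        raise ValueError("GENERATION_PROVIDER=anthropic requires ANTHROPIC_API_KEY")
    return setting
```

Failing fast on an explicit anthropic setting without a key keeps misconfiguration visible instead of silently switching to fake output.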

Document Ingestion

Document ingestion is implemented in server/app/lib/rag/ingest.py:

  1. Extract Markdown from an uploaded PDF or Markdown file.
  2. Chunk Markdown with chunk_markdown.
  3. Optionally generate an overview chunk for campaign documents.
  4. Embed chunks.
  5. Replace old chunks for the document.
  6. Extract monster stat blocks on a best-effort basis.
  7. Mark the document indexed or failed.
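
The control flow of the steps above can be sketched roughly as follows. This is an illustration only: the Document class, the ingest signature, and the injected callables are hypothetical stand-ins, not the real API of ingest.py. It also assumes that old chunks are replaced only after chunking and embedding succeed, and that a failure in stat block extraction does not fail the document.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Document:
    """Hypothetical stand-in for the stored document record."""
    markdown: str
    status: str = "pending"
    chunks: List[Tuple[str, List[float]]] = field(default_factory=list)
    monsters: List[str] = field(default_factory=list)


def ingest(
    doc: Document,
    chunk: Callable[[str], List[str]],
    embed: Callable[[List[str]], List[List[float]]],
    extract_monsters: Callable[[str], List[str]],
) -> None:
    try:
        pieces = chunk(doc.markdown)
        vectors = embed(pieces)
        # Assumed: swap in new chunks only once both steps have succeeded.
        doc.chunks = list(zip(pieces, vectors))
    except Exception:
        doc.status = "failed"
        return
    try:
        doc.monsters = extract_monsters(doc.markdown)
    except Exception:
        # Best-effort: a stat block failure does not fail the document.
        pass
    doc.status = "indexed"
```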

Core-rule files are stored privately under server/storage/. Campaign uploads are scoped to their campaign. The Process action reuses the saved extracted Markdown and skips source parsing.

Retrieval

Retrieval is implemented in server/app/lib/rag/retrieve.py.

It combines:

  • Dense vector search through pgvector.
  • PostgreSQL full-text search.
  • Reciprocal Rank Fusion.
  • Reranking.
  • Query alias expansion for common D&D vocabulary gaps.
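
Reciprocal Rank Fusion merges the dense and full-text rankings by summing 1 / (k + rank) for each chunk across the lists it appears in. The sketch below shows the general technique; the function name, the string chunk IDs, and k = 60 (a commonly used default) are assumptions, not values taken from retrieve.py.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of chunk IDs into a single ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


dense = ["a", "b", "c"]      # hypothetical vector search results
lexical = ["b", "d", "a"]    # hypothetical full-text results
fused = reciprocal_rank_fusion([dense, lexical])
```

Because the formula uses only ranks, it needs no score normalization between vector distances and full-text relevance, which live on different scales.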

Scope is enforced in SQL. A campaign can retrieve its own campaign documents and allowed core rules: required books, plus optional books opted into by the campaign. Tool-only books are mined for structured data but are not retrieved as chat sources.

Rules Chat

The non-streaming rules path in server/app/lib/rag/chat.py follows:

load history -> retrieve sources -> inject played-session recaps -> generate
-> guardrail citations -> persist user and assistant turns

The guardrail keeps only answers that cite valid sources; otherwise the Sage abstains.
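
A minimal sketch of such a citation guardrail follows. The bracketed [n] citation format, the abstention message, and the rule that every citation must be valid are all assumptions made for illustration; chat.py may use a different format and policy.

```python
import re
from typing import Set

CITATION = re.compile(r"\[(\d+)\]")  # assumed citation syntax, e.g. "[2]"
ABSTAIN = "I could not find a supported answer in the available sources."


def guard_citations(answer: str, valid_ids: Set[int]) -> str:
    """Return the answer if it cites only retrieved sources, else abstain."""
    cited = {int(number) for number in CITATION.findall(answer)}
    # Require at least one citation, and no citation outside the retrieved set.
    if cited and cited <= valid_ids:
        return answer
    return ABSTAIN
```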

The streaming path wraps the LangGraph graph and forwards custom stream events to the SSE endpoint.

LangGraph Chat Modes

The graph lives in server/app/lib/rag/graph.py and is exposed to LangGraph Studio through server/langgraph.json.

router -> rules | prep | recap

The router resolves an explicit mode if provided; otherwise it lets the generator classify the turn. The streaming adapter resolves mode before opening the trace so Langfuse tags cover the whole operation.

Current event types include source lists, answer deltas, done/error markers, dice rolls, stat blocks, encounter cards, NPC cards, loot cards, and rollable tables.
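
For reference, a Server-Sent Events frame is an event line, one or more data lines, and a blank line. The helper below is a sketch of how a custom stream event could be serialized; the function name, the event name, and the payload shape are hypothetical, not the app's actual schema.

```python
import json
from typing import Any, Dict


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Serialize one event as an SSE frame with a compact JSON payload."""
    # json.dumps without indent emits a single line, so one data field suffices.
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event}\ndata: {payload}\n\n"


frame = format_sse("delta", {"text": "Hi"})
```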

Tools and Prep Outputs

Rules mode can short-circuit into tools:

  • roll_dice
  • get_stat_block

Prep mode classifies planning requests and dispatches to specialized builders:

  • Encounter suggestions: deterministic difficulty math plus LLM selection.
  • NPC generation: optionally grounded in retrieved campaign context.
  • Loot generation: deterministic treasure budget plus LLM contents.
  • Rollable tables: LLM table contents normalized into roll ranges.

The pattern is deliberate: deterministic math and validation stay in code; judgment, naming, and flavor can come from the model.
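
As an illustration of that split, the sketch below normalizes table entries into contiguous roll ranges. It assumes the model returns each entry with an integer weight; that input shape and the function name are hypothetical, and the real builder may normalize differently.

```python
from typing import List, Tuple


def to_roll_ranges(entries: List[Tuple[str, int]]) -> List[Tuple[int, int, str]]:
    """Assign each (text, weight) entry a contiguous (low, high, text) range."""
    ranges: List[Tuple[int, int, str]] = []
    low = 1
    for text, weight in entries:
        if weight < 1:
            raise ValueError(f"weight must be at least 1, got {weight}")
        high = low + weight - 1
        ranges.append((low, high, text))
        low = high + 1  # next entry starts right after this one ends
    return ranges


table = to_roll_ranges([("Goblins", 3), ("Wolves", 2), ("Nothing", 1)])
```

Here the weights sum to 6, so the result maps cleanly onto a d6 with no gaps or overlaps, a property the code guarantees regardless of what the model proposes.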

Recap Extraction

Played-session recaps can produce reviewed Codex changesets. The extractor proposes operations for story entities, NPCs, and relationships. A DM reviews each changeset and either applies or dismisses it; campaign state does not change until a changeset is applied.

Long-running extraction work must not block the event loop. Blocking model calls are offloaded with asyncio.to_thread.
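
The offloading pattern looks roughly like this. blocking_extract is a hypothetical stand-in for a synchronous model call, and the returned shape is invented for the example; asyncio.to_thread itself is standard library (Python 3.9+).

```python
import asyncio
import time
from typing import Dict, List


def blocking_extract(recap: str) -> Dict[str, List[str]]:
    """Hypothetical stand-in for a synchronous model call."""
    time.sleep(0.05)  # simulates network and model latency
    return {"operations": [f"review: {recap}"]}


async def extract_changeset(recap: str) -> Dict[str, List[str]]:
    # Run the blocking call in a worker thread so the event loop stays free
    # to serve other requests while the model call is in flight.
    return await asyncio.to_thread(blocking_extract, recap)


result = asyncio.run(extract_changeset("The party met Ysolde"))
```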

AI Access

AI features are gated server-side. Effective access is:

user.aiAccess OR user.isAdmin

Do not rely only on hidden frontend controls. Add backend guards to every new AI endpoint.
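
A framework-agnostic sketch of such a guard is shown below. The User class, AIAccessDenied, and require_ai_access are hypothetical names; only the aiAccess OR isAdmin rule comes from the text above. In practice the check would be wired into whatever dependency or middleware mechanism the server already uses.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical user record with the two flags named above."""
    aiAccess: bool = False
    isAdmin: bool = False


class AIAccessDenied(Exception):
    """Raised when a user without AI access calls an AI endpoint."""


def has_ai_access(user: User) -> bool:
    # Effective access: an explicit grant or admin status.
    return user.aiAccess or user.isAdmin


def require_ai_access(user: User) -> None:
    """Call at the top of every AI endpoint handler."""
    if not has_ai_access(user):
        raise AIAccessDenied("AI features are not enabled for this account")
```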