# AI Agent & Preservation Research

Welcome to the Dark Pawns AI Research Laboratory.

While many retro-gaming modernizations focus solely on nostalgia, the resurrection of Dark Pawns has a dual purpose: preserving digital interactive heritage and providing a high-fidelity, persistent sandbox for state-of-the-art autonomous AI agent research.

Traditional artificial intelligence benchmarks for text-based games (such as TextWorld or Jericho) operate in isolated, single-player, synthetic environments with predefined paths. Dark Pawns breaks this mold by running a persistent, real-time, multi-user world featuring complex social structures, asynchronous combat ticks, and a human-authored codebase with thirty years of developmental history. Here, autonomous LLM agents and human players connect via the same network transport, exploring, interacting, and cooperating in real time.

Active Research Areas

Explore our core engineering findings, architectural designs, and telemetry reports:

1. Stateless Agents, Stateful Protocols

Our research into bridging stateless Large Language Models with real-time, stateful game servers. It highlights the WebSocket-native API, the essential type:vars -> type:memory_bootstrap connection handshake, action queue buffering, and the architecture of the BRENDA client daemon (dp-goatd).

2. Narrative Memory & Dreaming

A deep dive into our persistent cognitive memory system. Documents how transaction-level session events (movement, combat rounds, chat logs) are recorded into a structured SQLite database, and details our asynchronous LLM dreaming loops that synthesize raw action histories into cohesive long-term memories.

3. C-to-Go Port & Code Fidelity

An engineering retrospective on porting 73,000 lines of legacy C into concurrent, safe Go. Covers automated security audits, concurrency mutex structures, and the discovery and mitigation of “silent port drift”—minor logic discrepancies that compile perfectly but silently disrupt game balance.

“We are building the systems that allow humans and persistent autonomous models to share complex, literary interactive spaces—bridging the gap between software preservation and agentic cognitive science.”

## Pages

### Narrative Memory & Dreaming

1. The Core Memory Architecture

To build autonomous agents that exhibit genuine personality, persistent learning, and long-term relationships, a MUD engine must provide more than raw state information; it must support a durable cognitive memory system.

Without persistence, an agent suffers a total “memory wipe” every time a network drop occurs, a server restarts, or the session is compacted. In Dark Pawns, agent memory is treated as a first-class citizen, backed by a hybrid database architecture: SQLite narrative graphs running alongside a JSONL transaction logging feed.

[MUD Server Engine] 
       ↓
  Event Stream (Movement, Combat, Chats)
       ↓
  JSONL Session Log (Raw transaction-level feeds)
       ↓
  SQLite Narrative Memory Graph
       ↓  (Asynchronous LLM Dreaming Loop)
  Narrative Prose Summaries (Short-term & Long-term Context)

2. SQLite Transaction-Level Logging

Every action an agent performs—and every event it observes—is streamed into the server’s database at a transactional level. The schema separates logs into three high-fidelity fields:

- actions: Every command dispatched by the agent ("north", "cast heal self", "cast armor") along with success/failure metadata.
- observations: Sensory reports returned by the parser ("A giant rat bites you for 4 damage.", "Bannor tells the group: 'Heads up, trolls!'").
- state_transitions: Variable drifts recorded out-of-band, such as level-ups, health changes, or inventory updates.

This ensures a complete, sequential chronicle of the agent’s gameplay session is preserved. However, feeding this raw, uncompressed event feed back into the agent’s prompt during its next session would immediately saturate its context window.
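As a rough illustration of the three record kinds described above, the following Go sketch models one log entry and serializes it as a single JSONL line. The `LogEntry` type, its field names, and the `EncodeLine`/`DecodeLine` helpers are assumptions made for this example; the project's real schema is not shown in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// LogEntry is a hypothetical record shape. The field names are assumptions,
// not the project's actual schema.
type LogEntry struct {
	Time    time.Time `json:"time"`
	Agent   string    `json:"agent"`
	Kind    string    `json:"kind"` // "action", "observation", or "state_transition"
	Text    string    `json:"text"`
	Success *bool     `json:"success,omitempty"` // only meaningful for actions
}

// EncodeLine serializes one entry as a single JSONL line (no trailing newline).
func EncodeLine(e LogEntry) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// DecodeLine parses one JSONL line back into an entry.
func DecodeLine(line string) (LogEntry, error) {
	var e LogEntry
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

func main() {
	ok := true
	now := time.Now().UTC()
	entries := []LogEntry{
		{Time: now, Agent: "BRENDA", Kind: "action", Text: "cast heal self", Success: &ok},
		{Time: now, Agent: "BRENDA", Kind: "observation", Text: "A giant rat bites you for 4 damage."},
		{Time: now, Agent: "BRENDA", Kind: "state_transition", Text: "health 24 -> 20"},
	}
	for _, e := range entries {
		line, err := EncodeLine(e)
		if err != nil {
			panic(err)
		}
		fmt.Println(line)
	}
}
```

Keeping one self-describing record per line means the log can be appended to cheaply and replayed in order, which is what the later compression step relies on.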

3. Asynchronous Memory Dreaming

To convert raw transaction logs into useful cognitive context, Dark Pawns implements a background process known as the Dreaming Engine (pkg/dreaming/).

When an agent logs out, or when its active transaction log reaches a size threshold, the server triggers an asynchronous “dreaming cycle.” This process offloads context compression to an external, lightweight LLM process running in the background, shielding human players on the main MUD server from CPU spikes.
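The trigger condition above can be sketched in a few lines of Go. This is a minimal sketch, not the code in pkg/dreaming/: `ShouldDream`, `Dreamer`, and the `maxLogBytes` threshold are hypothetical names and values, and the `cycle` callback stands in for whatever actually launches the external LLM process.

```go
package main

import (
	"fmt"
	"sync"
)

// maxLogBytes is an assumed threshold; the document does not state the real value.
const maxLogBytes = 64 * 1024

// ShouldDream reports whether a dreaming cycle is due: on logout, or once the
// active log has grown to the size threshold.
func ShouldDream(loggedOut bool, logBytes int) bool {
	return loggedOut || logBytes >= maxLogBytes
}

// Dreamer runs at most one background cycle per agent at a time.
type Dreamer struct {
	mu      sync.Mutex
	running map[string]bool
	wg      sync.WaitGroup
}

func NewDreamer() *Dreamer { return &Dreamer{running: make(map[string]bool)} }

// Trigger starts cycle in its own goroutine so the caller never blocks.
// It returns false if a cycle for this agent is already in flight.
func (d *Dreamer) Trigger(agent string, cycle func(agent string)) bool {
	d.mu.Lock()
	if d.running[agent] {
		d.mu.Unlock()
		return false
	}
	d.running[agent] = true
	d.wg.Add(1)
	d.mu.Unlock()

	go func() {
		defer d.wg.Done()
		cycle(agent)
		d.mu.Lock()
		delete(d.running, agent)
		d.mu.Unlock()
	}()
	return true
}

// Wait blocks until all in-flight cycles finish (useful at shutdown).
func (d *Dreamer) Wait() { d.wg.Wait() }

func main() {
	d := NewDreamer()
	if ShouldDream(true, 0) { // the agent just logged out
		d.Trigger("BRENDA", func(a string) { fmt.Println("dreaming for", a) })
	}
	d.Wait()
}
```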

The Dreaming Pipeline:

- Sweep: The dreaming engine extracts the latest chronological block of raw JSONL logs from SQLite.
- Synthesis: A specialized LLM prompt processes the action/observation logs to synthesize a cohesive, third-person narrative prose chronicle.
- Consolidation: The generated narrative is linked to the agent’s existing long-term memory graph. It updates three specific fields:
  - Self-Identity: The agent’s current goals, injuries, and combat readiness.
  - World-Map Mental State: What rooms, exits, and zone keys the agent believes it has discovered.
  - Social Relations Ledger: A map tracking interactions with human players (e.g., “Bannor helped me kill the trolls; he is an ally. Aidan stole gold; be cautious.”).
- Pruning: The raw, verbose JSONL transaction logs are compacted and archived, maintaining database health.
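The four stages above can be sketched as one function over an in-memory store. Everything here is an assumption for illustration: the `Summarizer` interface stands in for the LLM call, `Digest`, `Memory`, and `Store` are hypothetical types, and slices replace the real SQLite tables.

```go
package main

import (
	"fmt"
	"strings"
)

// Digest is what the (hypothetical) LLM step returns for one block of events.
type Digest struct {
	Narrative    string
	SelfIdentity string
	Rooms        []string
	Relations    map[string]string // player name -> stance
}

// Summarizer abstracts the LLM call so the pipeline can be exercised offline.
type Summarizer interface {
	Summarize(events []string) (Digest, error)
}

// Memory holds the three long-term fields plus the prose chronicle.
type Memory struct {
	SelfIdentity string
	WorldMap     map[string]bool
	Relations    map[string]string
	Chronicle    []string
}

type Store struct {
	Raw     []string // pending raw log lines
	Archive []string // lines already summarized
	Memory  Memory
}

func NewStore() *Store {
	return &Store{Memory: Memory{WorldMap: map[string]bool{}, Relations: map[string]string{}}}
}

// RunCycle performs sweep, synthesis, consolidation, and pruning for up to
// batch raw lines. On a summarizer error the raw lines are left untouched.
func RunCycle(s *Store, llm Summarizer, batch int) error {
	n := batch
	if n > len(s.Raw) {
		n = len(s.Raw)
	}
	if n == 0 {
		return nil
	}
	block := s.Raw[:n] // Sweep

	d, err := llm.Summarize(block) // Synthesis
	if err != nil {
		return err
	}

	// Consolidation
	s.Memory.Chronicle = append(s.Memory.Chronicle, d.Narrative)
	if d.SelfIdentity != "" {
		s.Memory.SelfIdentity = d.SelfIdentity
	}
	for _, r := range d.Rooms {
		s.Memory.WorldMap[r] = true
	}
	for who, stance := range d.Relations {
		s.Memory.Relations[who] = stance
	}

	// Pruning
	s.Archive = append(s.Archive, block...)
	s.Raw = s.Raw[n:]
	return nil
}

// stubLLM is a deterministic stand-in for a real model.
type stubLLM struct{}

func (stubLLM) Summarize(events []string) (Digest, error) {
	return Digest{
		Narrative:    fmt.Sprintf("BRENDA did %d things: %s.", len(events), strings.Join(events, "; ")),
		SelfIdentity: "resting",
		Rooms:        []string{"Temple of Alaozar"},
		Relations:    map[string]string{"Bannor": "ally"},
	}, nil
}

func main() {
	s := NewStore()
	s.Raw = []string{"north", "kill rat", "rest"}
	if err := RunCycle(s, stubLLM{}, 2); err != nil {
		panic(err)
	}
	fmt.Println(s.Memory.Chronicle[0])
	fmt.Println("still pending:", s.Raw)
}
```

Pruning only after a successful synthesis means a failed LLM call loses nothing: the same block is simply swept again on the next cycle.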

4. The Narrative Memory Graph in Action

When the agent reconnects, the freshly generated prose summaries are injected directly into its prompt via the type:memory_summary connection handshake.

This enables the agent to start its session with a clear, literary understanding of its history:

“You are BRENDA, an autonomous Cleric exploring the Wyldlands. In your last session, you traveled east with Bannor and defeated three orcs in the Sea Cave Lagoon, though you suffered a moderate wound to your leg. You are currently resting in the Temple of Alaozar. Your primary goal is to purchase a steel mace, and you are currently friendly with Bannor.”

By summarizing transaction logs into a narrative prose graph, we achieve a 90% reduction in agent context window usage while dramatically improving the agent’s planning stability, conversational consistency, and long-term survival rates.

### Port Fidelity & Engine Modernization

1. Archiving a Legacy: The 73K Lines of C

Resurrecting Dark Pawns was not simply a matter of loading a backup copy of DikuMUD onto a modern Linux server. The original game engine, composed of 73,000 lines of legacy C code written in the late-90s, was highly fragile, memory-unsafe, and bound to architectural limits that made integration with modern WebSocket APIs, databases, and AI frameworks practically impossible.

To secure the game’s future and enable advanced agent research, the entire engine was ported from scratch to Go.

Go was selected for its:

- High-performance native concurrency (goroutines and channels).
- Garbage-collected memory safety, eliminating classic DikuMUD memory leaks and buffer overflows.
- Clean compilation into static binaries, removing legacy makefile and library dependency issues.

2. Transitioning to Concurrent Execution

The primary architectural difference between legacy MUD engines and modern servers is concurrency.

Original MUDs ran in a strict, single-threaded execution loop. They updated rooms, ticked combat, parsed player inputs, and processed telnet messages sequentially, one character at a time. If an action blocked (e.g., waiting for file I/O or a database write), the entire game froze.

In our Go modernization, we transitioned the engine to a highly parallel model, spawning concurrent goroutines for player sessions, combat queues, and out-of-band AI hooks. However, moving to concurrency introduced a major class of bugs: race conditions and deadlocks.
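To make the contrast with the single-threaded loop concrete, here is a small sketch in which each session runs in its own goroutine and feeds commands into a shared channel, so one slow connection does not stall the others. The `Session`, `Command`, and `Collect` names and the `slowIO` delay are hypothetical and not taken from the Dark Pawns codebase.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// slowIO is an illustrative delay standing in for blocking I/O on one connection.
const slowIO = 100 * time.Millisecond

type Command struct {
	Player string
	Text   string
}

type Session struct {
	Player string
	Inputs []string
	Delay  time.Duration // simulated blocking time before each input arrives
}

// run emits the session's inputs; only this goroutine waits on the delay.
func (s Session) run(out chan<- Command, wg *sync.WaitGroup) {
	defer wg.Done()
	for _, in := range s.Inputs {
		time.Sleep(s.Delay)
		out <- Command{Player: s.Player, Text: in}
	}
}

// Collect runs every session concurrently and returns commands in arrival order.
func Collect(sessions []Session) []Command {
	out := make(chan Command)
	var wg sync.WaitGroup
	for _, s := range sessions {
		wg.Add(1)
		go s.run(out, &wg)
	}
	// Close the channel once every session goroutine has finished.
	go func() {
		wg.Wait()
		close(out)
	}()
	var got []Command
	for c := range out {
		got = append(got, c)
	}
	return got
}

func main() {
	got := Collect([]Session{
		{Player: "slow", Inputs: []string{"look"}, Delay: slowIO},
		{Player: "fast", Inputs: []string{"north", "east"}},
	})
	for _, c := range got {
		fmt.Printf("%s: %s\n", c.Player, c.Text)
	}
}
```

In a sequential loop the fast player's commands would wait behind the slow one; here they arrive first because only the slow session's goroutine is blocked.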

Resolving the Character Creation Deadlock

During our integration testing, we hit a complex deadlock in character creation. When multiple new player and agent sessions connected concurrently, the server occasionally hung.

Static analysis and unit tests missed the issue because it only surfaced under concurrent load. While triaging with builds instrumented by the Go race detector (the -race flag, which reports data races rather than deadlocks as such), we traced the hang to a classic lock-ordering inversion:

- Goroutine A (Login Thread): Acquired World.RWMutex to read coordinates, then attempted to lock the player’s private Session.Mutex.
- Goroutine B (Active Session Tick): Held Session.Mutex and attempted to acquire World.RWMutex to broadcast a room entrance event.

Standardizing a strict lock hierarchy—always locking session parameters before attempting to lock global world state—fully neutralized the deadlock.
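A minimal sketch of that hierarchy follows: every code path takes the session lock first and the world lock second, so no two goroutines can each hold the lock the other is waiting for. The `World` and `Session` types, their fields, and the `Login`, `Occupants`, and `Stress` functions are simplified stand-ins invented for this example.

```go
package main

import (
	"fmt"
	"sync"
)

type World struct {
	mu    sync.RWMutex
	rooms map[string][]string // room name -> occupant names
}

type Session struct {
	mu   sync.Mutex
	name string
	room string
}

// Lock order used by every function below: Session.mu first, then World.mu.

// Login places the session's character in a room (needs the world write lock).
func Login(w *World, s *Session, room string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	w.mu.Lock()
	defer w.mu.Unlock()
	s.room = room
	w.rooms[room] = append(w.rooms[room], s.name)
}

// Occupants is what a session tick might do: read world state while holding
// its own session lock. It follows the same order, so it cannot invert.
func Occupants(w *World, s *Session) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	w.mu.RLock()
	defer w.mu.RUnlock()
	return len(w.rooms[s.room])
}

// Stress logs in n sessions while ticking them concurrently and returns the
// final occupant count. It completes only if no deadlock occurs.
func Stress(n int) int {
	w := &World{rooms: make(map[string][]string)}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		s := &Session{name: fmt.Sprintf("p%d", i)}
		wg.Add(2)
		go func() { defer wg.Done(); Login(w, s, "temple") }()
		go func() { defer wg.Done(); Occupants(w, s) }()
	}
	wg.Wait()
	w.mu.RLock()
	defer w.mu.RUnlock()
	return len(w.rooms["temple"])
}

func main() {
	fmt.Println("occupants after concurrent logins:", Stress(100))
}
```

The rule is easiest to enforce when it is written down next to the types, as in the comment above, and checked in review: any function that needs both locks must take them in that order.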

3. Neutralizing Silent Port Drift

The most insidious class of bugs during a codebase port is silent port drift. These are semantic discrepancies where the ported code compiles flawlessly and runs without throwing errors, yet behaves differently from the authoritative legacy engine.

In a complex multiplayer environment, even a 1% deviation in math formulas can completely break game balance over time.

Spell Fidelity Audit

During our May 2026 audits, we ran a comprehensive cross-codebase fidelity analysis between the original C spell engine (spells.c, spell_parser.c) and our Go port (pkg/spells/).

We discovered several major divergences:

- Inverted Hellfire Behavior: The ported Go version of the Hellfire spell duration formula was mathematically inverted, dealing negligible damage to high-level targets and catastrophic damage to low-level players.
- Missing DOT Logic: The Flamestrike damage-over-time tick state was initialized but never registered in the global game event queue.
- Class Spell Discrepancies: The Go magic tables had accumulated 50 Mage spells due to silent copy-paste duplication, compared to the C engine’s authoritative 27.

To resolve these, we established regular automated fidelity crawls. These scripts scan and compare structural definitions (spell attributes, damage dice ranges, weapon modifiers, and XP formulas) directly against the authoritative C source files, generating alerts for any numerical or logical drift.
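The comparison step of such a crawl might look like the sketch below. It assumes the C definitions have already been extracted into a reference table (the parsing is out of scope here), and the `SpellDef` type, the `Diff` function, and the dice values in `main` are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// SpellDef is a hypothetical, flattened view of one spell definition.
type SpellDef struct {
	Name     string
	Class    string
	DiceNum  int
	DiceSize int
}

// Diff compares the ported table against the reference table and returns a
// sorted list of findings: duplicates, mismatches, and missing or extra spells.
func Diff(ref, port []SpellDef) []string {
	refBy := make(map[string]SpellDef, len(ref))
	for _, s := range ref {
		refBy[s.Name] = s
	}
	seen := make(map[string]int)
	var out []string
	for _, s := range port {
		seen[s.Name]++
		if seen[s.Name] == 2 {
			out = append(out, "duplicate in port: "+s.Name)
		}
		if seen[s.Name] > 1 {
			continue // only compare the first occurrence
		}
		r, ok := refBy[s.Name]
		switch {
		case !ok:
			out = append(out, "not in reference: "+s.Name)
		case r != s:
			out = append(out, fmt.Sprintf("mismatch: %s ref=%+v port=%+v", s.Name, r, s))
		}
	}
	for _, s := range ref {
		if seen[s.Name] == 0 {
			out = append(out, "missing from port: "+s.Name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Sample values only; these are not the game's real numbers.
	ref := []SpellDef{
		{Name: "hellfire", Class: "mage", DiceNum: 6, DiceSize: 8},
		{Name: "armor", Class: "mage"},
	}
	port := []SpellDef{
		{Name: "hellfire", Class: "mage", DiceNum: 8, DiceSize: 6},
		{Name: "armor", Class: "mage"},
		{Name: "armor", Class: "mage"},
	}
	for _, finding := range Diff(ref, port) {
		fmt.Println(finding)
	}
}
```

Run on a schedule or in CI, a non-empty result from a check like this is what would raise the drift alert.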

4. The Modern Quality Standard

To ensure that Dark Pawns remains stable, safe, and balanced, we run a rigorous four-step build verification pipeline before any commit or server deployment is allowed:

    go build ./...          # 1. Full Go Compilation
    go vet ./...            # 2. Static Code Analysis
    go test ./...           # 3. Comprehensive Unit & Integration Tests
    golangci-lint run ./... # 4. Full Linter and Concurrency Checks

By enforcing strict codebase hygiene, concurrency lock ordering, and automated fidelity testing, we have created an engine that preserves the exact feel of a 1997 vintage MUD, with the stability and security of a modern 2026 enterprise system.

### Stateless Agents, Stateful Protocols

1. The Challenge of Stateful AI Onboarding

Standard Large Language Models (LLMs) operate statelessly: they receive a prompt and output a completion. However, persistent online environments like MUDs are highly stateful. Game variables, room layout changes, combat ticks (occurring at 2-second intervals), and messaging feeds flow continuously.

When connecting an autonomous AI agent to Dark Pawns, two immediate engineering bottlenecks emerge:

- The Latency Gap: High-quality LLM inference takes between 1.0 and 3.0 seconds, while combat ticks arrive every 2 seconds and other game events stream in between them. A pure LLM-per-action loop will quickly fall behind and miss critical server ticks.
- Context Saturation: Saturating a model’s context window with thousands of characters of raw text output (such as room exits, stats, and speech logs) leads to attention dilution and rapid cost inflation.

To resolve these, Dark Pawns employs a dual-interface architecture: human-friendly text streams running alongside structured out-of-band JSON protocols over a high-performance WebSocket connection.

2. The Connection Handshake Protocol

During player login, the Dark Pawns server distinguishes between a human using a standard terminal and an AI agent running inside an integration framework. For agents, a strict three-stage semantic handshake is required to initialize cognitive operations:

[Agent Client]                                       [Dark Pawns Server]
      |                                                     |
      | ------------- 1. type: login ---------------------> |
      | <------------ 2. type: vars (Full Dump) ----------- |
      | <------------ 3. type: memory_bootstrap ----------- |
      | <------------ 4. type: memory_summary ------------- |
      |                                                     |
      * -- Transition to Active State (Ready for Commands) - *

- type:vars (Full Variable Dump): The server sends a complete, structured JSON serialization of the agent’s current attributes (health, mana, level, experience, coordinates, inventory slots, and equipment states).
- type:memory_bootstrap: The server retrieves and packs the agent’s immediate, short-term conversational context from its SQLite persistence layer.
- type:memory_summary: The server transmits a consolidated, high-level narrative summary of the agent’s long-term history and prior player relations.
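On the agent side, the ordering of these three messages can be enforced with a small state machine. The sketch below assumes each server message is a JSON object with a `type` field and an optional `data` payload; that envelope shape, the `Envelope` and `Handshake` types, and the `Feed` method are assumptions made for this example, not the documented wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is an assumed wire shape: {"type": "...", "data": ...}.
type Envelope struct {
	Type string          `json:"type"`
	Data json.RawMessage `json:"data,omitempty"`
}

// handshakeOrder lists the server messages in the order shown in the diagram.
var handshakeOrder = []string{"vars", "memory_bootstrap", "memory_summary"}

// Handshake tracks how far the connection has progressed.
type Handshake struct {
	step int
}

// Ready reports whether all three stages have been received.
func (h *Handshake) Ready() bool { return h.step == len(handshakeOrder) }

// Feed consumes one raw server message. Before the handshake completes it
// rejects anything that arrives out of order; afterwards it accepts any
// well-formed message.
func (h *Handshake) Feed(raw []byte) error {
	var env Envelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return fmt.Errorf("bad message: %w", err)
	}
	if h.Ready() {
		return nil
	}
	if want := handshakeOrder[h.step]; env.Type != want {
		return fmt.Errorf("handshake: got %q, want %q", env.Type, want)
	}
	h.step++
	return nil
}

func main() {
	h := &Handshake{}
	msgs := []string{
		`{"type":"vars","data":{"health":20,"mana":35}}`,
		`{"type":"memory_bootstrap","data":{}}`,
		`{"type":"memory_summary","data":{"text":"You are BRENDA..."}}`,
	}
	for _, m := range msgs {
		if err := h.Feed([]byte(m)); err != nil {
			panic(err)
		}
	}
	fmt.Println("ready for commands:", h.Ready())
}
```

A client built this way fails loudly when a stage is missing, rather than waiting forever for a message that never arrives.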

The New-Player Bug Fix

During our May 2026 integration sprint, we discovered a critical bug: returning players successfully completed the handshake, but new characters created during registration hung indefinitely. The server’s completeCharCreation() function was missing the agent handshake trigger, causing the agent harness to discard all subsequent combat and movement responses.

Adding the handshake directly to the character-creation lifecycle resolved the hang:

    // New agent characters need the same handshake that returning ones receive.
    if s.isAgent {
        s.sendFullVarDump()
        s.SendMemoryBootstrap()
        s.SendMemorySummary()
    }

3. The P1 Daemon Core (`dp-goatd`)

Rather than forcing the LLM to manage raw TCP or telnet sockets directly, we designed a client-side proxy daemon named dp-goatd (the Dark Pawns Go Agent Daemon).

Built as a high-performance Go application, dp-goatd runs locally on the host machine, opening a secure Unix domain socket for LLM framework bindings (e.g., Python scripts running Claude Code or Gemini API clients) and bridging them to the MUD server over a persistent WebSocket connection.

+------------------+                   +--------------------+                   +--------------------+
|  LLM Framework   |  -- Unix Socket - |  Local dp-goatd    |  - WebSockets --  |  Dark Pawns Server |
|  (Python/Agent)  |                   |  (P1 Daemon Core)  |                   |  (MUD Engine)      |
+------------------+                   +--------------------+                   +--------------------+

Key Daemon Capabilities:

- Asynchronous Action Buffering: The daemon maintains a client-side action queue. The LLM can submit a sequence of plans in advance (e.g., ["west", "kill goblin", "loot corpse"]), and dp-goatd dispatches them in synchronization with the server’s tick rate.
- Intent Translation Layer: The daemon translates high-level semantic intent from the agent ("inspect the rusty dagger in the chest") into precise, index-disambiguated MUD commands ("look 1.dagger 1.chest"), preventing common parser mistargeting.
- Session Auto-Compaction: If the connection drops, dp-goatd automatically caches game state, performs link re-negotiation, and requests a full-state variable dump to warm-start the agent’s context without resetting its running behavior tree.
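The action buffering described in the first capability can be sketched as a mutex-guarded FIFO that releases one command per server tick. This is an illustrative outline only: `ActionQueue` and its methods are hypothetical names, and the tick source is a plain channel rather than whatever signal dp-goatd actually uses.

```go
package main

import (
	"fmt"
	"sync"
)

// ActionQueue buffers commands planned by the LLM until the server is ready.
type ActionQueue struct {
	mu      sync.Mutex
	pending []string
}

// Submit appends a planned sequence of commands.
func (q *ActionQueue) Submit(cmds ...string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.pending = append(q.pending, cmds...)
}

// Clear drops the remaining plan, e.g. when the situation has changed.
func (q *ActionQueue) Clear() {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.pending = nil
}

// OnTick pops the next command, if any, for this tick.
func (q *ActionQueue) OnTick() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.pending) == 0 {
		return "", false
	}
	cmd := q.pending[0]
	q.pending = q.pending[1:]
	return cmd, true
}

// Run dispatches at most one queued command per tick until ticks is closed.
func (q *ActionQueue) Run(ticks <-chan struct{}, send func(string)) {
	for range ticks {
		if cmd, ok := q.OnTick(); ok {
			send(cmd)
		}
	}
}

func main() {
	q := &ActionQueue{}
	q.Submit("west", "kill goblin", "loot corpse")

	// Simulate four server ticks; a real daemon would receive these over time.
	ticks := make(chan struct{}, 4)
	for i := 0; i < 4; i++ {
		ticks <- struct{}{}
	}
	close(ticks)

	q.Run(ticks, func(cmd string) { fmt.Println("send:", cmd) })
}
```

Because the queue is drained by ticks rather than by LLM completions, a slow inference call delays only the next plan, not the commands already queued.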