Multi-fidelity agent simulation scales human behavior modeling by matching compute to decision importance. Routine actions run through rules, mid-level actions run through heuristics, and high-impact moments invoke an LLM for full reasoning.
Key takeaways
- Do not use an LLM for every tick; route cognition by importance.
- Use event-driven execution to wake agents only when context changes.
- Separate simulated time from wall-clock latency with clear fidelity levels.
The scaling problem in LLM agent simulations
LLM-driven simulations break down when every agent reasons on every time step. A small world can feel coherent in a demo, but long timelines create latency, cost, and state drift.
The core issue is that LLM latency is wall-clock cost, not simulated time. A simulated hour may contain hundreds of small actions, but only a few of them require language reasoning, social judgment, or strategic planning.
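The event-driven alternative can be sketched with a priority queue keyed on simulated time, so agents sleep until an event concerns them instead of being ticked every step. This is a minimal illustration; the class and event names are hypothetical, not from any specific framework.

```python
import heapq

class EventQueue:
    """Min-heap of (sim_time, seq, agent_id, event) tuples.

    Agents are woken only when context changes; no agent reasons
    on every simulated tick.
    """
    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so heapq never compares event payloads

    def schedule(self, sim_time, agent_id, event):
        heapq.heappush(self._heap, (sim_time, self._seq, agent_id, event))
        self._seq += 1

    def run_until(self, end_time, wake):
        """Pop events in simulated-time order and wake each agent."""
        while self._heap and self._heap[0][0] <= end_time:
            sim_time, _, agent_id, event = heapq.heappop(self._heap)
            wake(sim_time, agent_id, event)

# Usage: only two wake-ups occur, regardless of how fine the tick grid is.
log = []
q = EventQueue()
q.schedule(5.0, "alice", "customer_arrives")
q.schedule(2.0, "bob", "shift_starts")
q.run_until(10.0, lambda t, a, e: log.append((t, a, e)))
# log == [(2.0, "bob", "shift_starts"), (5.0, "alice", "customer_arrives")]
```

The sequence counter is a small but important detail: without it, two events at the same simulated time would force `heapq` to compare arbitrary payloads.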
What multi-fidelity means
Multi-fidelity simulation assigns different execution modes to different moments. Low fidelity uses deterministic rules. Mid fidelity uses heuristics or probabilistic transitions. High fidelity uses an LLM because the decision is ambiguous, social, or consequential.
This keeps the system cheaper and more controllable. Routine behavior stays fast, while meaningful decisions still receive richer cognition.
- Low fidelity: schedules, decay, routine movement, and simple state changes.
- Mid fidelity: segment heuristics, probability weights, and known behavior priors.
- High fidelity: negotiation, objection handling, persuasion, conflict, and planning.
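The three tiers above can be expressed as a simple dispatch table. This is a sketch under stated assumptions: the action names, the static action-to-tier map, and the `llm_call` hook are all illustrative placeholders; a real system would derive the tier from event scoring rather than a fixed table.

```python
import random
from enum import Enum

class Fidelity(Enum):
    LOW = "rules"       # schedules, decay, routine movement
    MID = "heuristics"  # probability weights, segment priors
    HIGH = "llm"        # negotiation, conflict, planning

# Hypothetical action -> tier table for illustration only.
ACTION_TIER = {
    "walk_to_work": Fidelity.LOW,
    "choose_lunch": Fidelity.MID,
    "negotiate_raise": Fidelity.HIGH,
}

def execute(action, state, llm_call=None):
    """Run one action at the cheapest fidelity that suffices."""
    tier = ACTION_TIER.get(action, Fidelity.MID)
    if tier is Fidelity.LOW:
        state["position"] = "work"                          # deterministic rule
    elif tier is Fidelity.MID:
        state["lunch"] = random.choice(["soup", "salad"])   # prior-weighted pick
    else:
        state["plan"] = llm_call(action, state)             # full reasoning
    return tier
```

Note that the LLM client is injected rather than imported, which keeps the low and mid tiers runnable and testable with no model in the loop.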
Time dilation without losing coherence
Time dilation lets teams accelerate or slow down simulated time with a speed factor. The risk is that agents may skip meaningful context or make decisions that do not match their memories.
A reliable design uses checkpoints. The system compresses quiet periods, summarizes what changed, and only expands moments where decisions, interactions, or surprises occur.
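The checkpoint design can be sketched as a single pass over an interval's events: quiet stretches collapse into one summary item, and high-signal moments are kept at full resolution. The function and its threshold are assumptions for illustration, not a fixed API.

```python
def compress_timeline(events, importance, threshold=0.5):
    """Split a simulated interval into expanded high-signal moments
    and compressed summaries of quiet stretches.

    events: list of (sim_time, description) tuples.
    importance: callable mapping an event to a score in [0, 1].
    Returns a list of ("expand", event) or ("summary", [events]) items.
    """
    out, quiet = [], []
    for ev in events:
        if importance(ev) >= threshold:
            if quiet:
                out.append(("summary", quiet))  # checkpoint: one memory summary
                quiet = []
            out.append(("expand", ev))          # full-fidelity moment
        else:
            quiet.append(ev)                    # accumulate the quiet period
    if quiet:
        out.append(("summary", quiet))
    return out
```

Because summaries preserve the underlying events, an agent's memory can still reference what happened during a compressed stretch, which is what keeps decisions consistent with memories after acceleration.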
A practical routing policy
A useful routing policy starts with importance. Each event receives a score based on novelty, emotional weight, strategic relevance, and whether it changes the agent's goals or relationships.
When the score is low, rules handle it. When the score is medium, heuristics update state. When the score is high, the LLM sees the relevant memories, current world state, and action constraints before planning.
- Score events by novelty, relevance, emotional intensity, and risk.
- Compress low-signal intervals into memory summaries.
- Call the LLM only when a decision changes future behavior.
- Log routing decisions so the simulation remains debuggable.
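The routing policy above reduces to a scored threshold check. A minimal sketch follows, assuming events arrive as dicts of normalized signal features; the weights and cutoffs are hypothetical and would be tuned per simulation.

```python
def importance_score(event):
    """Weighted sum of signal features, each assumed to be in [0, 1].

    The feature names and weights are illustrative, not canonical.
    """
    weights = {"novelty": 0.3, "relevance": 0.3, "emotion": 0.2, "risk": 0.2}
    return sum(w * event.get(k, 0.0) for k, w in weights.items())

def route(event, low=0.3, high=0.7):
    """Map an event to an execution tier and return the score for logging."""
    score = importance_score(event)
    if score < low:
        decision = "rules"       # low fidelity: deterministic update
    elif score < high:
        decision = "heuristics"  # mid fidelity: probabilistic update
    else:
        decision = "llm"         # high fidelity: full reasoning
    return decision, score       # returning the score keeps routing debuggable
```

Returning the score alongside the decision makes the last takeaway cheap to implement: every routing choice can be logged with the evidence that produced it.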