
References

The full, annotated bibliography behind "How the Form Is Presented to the LLM" and "Confirmation, Profiles & Agent Architecture". Entries are grouped by the design decision each source supports, and the rationale pages link to the anchors here.

How to read this

Each entry notes its source type (peer-reviewed, preprint, vendor/engineering blog, or practitioner) so future readers know how much weight to give it. The strongest pillar (positional attention bias) is peer-reviewed; the flat-vs-nested and token-efficiency claims rest on practitioner benchmarks and vendor-reported figures.


Lost in the middle / positional attention bias

Supports decision 2 — splitting static schema from a small, fresh per-turn state.

  • Liu, N. F., Lin, K., Hewitt, J., Paranjape, A., Bevilacqua, M., Petroni, F., & Liang, P. (2023). Lost in the Middle: How Language Models Use Long Contexts. TACL. Peer-reviewed (Transactions of the ACL). The canonical finding: model accuracy is highest when relevant information is at the start or end of the context and degrades significantly in the middle, even for long-context models.
  • Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization (2024). arXiv 2406.16008. Preprint. Attributes the effect to a U-shaped positional-attention bias (a RoPE side-effect) and proposes a calibration fix.
  • "Lost in the Middle LLM: The U-Shaped Attention Problem Explained" (Morph). Engineering blog. Accessible explanation of the U-shaped bias and its RoPE root cause.

Flat vs. nested structure for LLM comprehension

Supports decision 1 — flattening the YAML tree into a flat field list with show_when conditions.
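
As an illustration of the flattening idea named above, the following is a minimal sketch, not the project's actual implementation. The nested shape (`fields`, `branches`), the sample form, and the string format of `show_when` are all assumptions made for the example; only the term `show_when` comes from this page.

```python
from typing import Any, Optional


def flatten(node: dict[str, Any], condition: Optional[str] = None) -> list[dict[str, Any]]:
    """Walk a nested form node depth-first and emit one flat entry per field.

    Assumed (hypothetical) input shape: {"fields": [{"name": ..., "branches": {answer: node}}]}.
    """
    flat: list[dict[str, Any]] = []
    for form_field in node.get("fields", []):
        # Every field becomes a top-level entry carrying its visibility condition.
        flat.append({"name": form_field["name"], "show_when": condition})
        # Each branch is keyed by the answer that makes its children visible.
        for answer, child in form_field.get("branches", {}).items():
            clause = f"{form_field['name']} == {answer!r}"
            combined = clause if condition is None else f"{condition} and {clause}"
            flat.extend(flatten(child, combined))
    return flat


# Hypothetical nested form, standing in for a parsed YAML tree.
nested_form = {
    "fields": [
        {
            "name": "service",
            "branches": {
                "haircut": {"fields": [{"name": "stylist"}]},
                "massage": {
                    "fields": [
                        {"name": "duration", "branches": {"90": {"fields": [{"name": "add_on"}]}}}
                    ]
                },
            },
        },
        {"name": "date"},
    ]
}

flat_fields = flatten(nested_form)
for entry in flat_fields:
    print(entry["name"], "| show_when:", entry["show_when"])
```

In this sketch the nesting depth disappears from the structure and survives only as a condition string on each field, which is the property the flat-list decision relies on.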

BAML — schema-aligned parsing & token efficiency

Supports decisions 1, 4 — the compact DSL line format instead of JSON Schema, and flat structure.

Just-in-time / lazy context loading

Supports decision 3 (load field detail on demand via getFieldInfo) and decision 5 (constrain the decision space).

  • "Introducing advanced tool use on the Claude Developer Platform" (Anthropic). Vendor/engineering (primary). Tool Search Tool and defer_loading: load only the tools/definitions relevant to the current step instead of everything upfront, to save context. Directly parallels our on-demand getFieldInfo.
  • "Tool use with Claude" (Anthropic API docs). Vendor docs. Reference for tool-calling mechanics and best practices (reduce ambiguity, examples for complex structures).
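
The on-demand pattern these sources support can be sketched roughly as follows. This is a local illustration only and makes no API calls: `FIELD_DETAILS`, `compact_schema`, and the Python-style `get_field_info` handler are hypothetical stand-ins for getFieldInfo, and the field data is invented for the example.

```python
from typing import Any

# Hypothetical full field definitions; in a real system these would come from the form schema.
FIELD_DETAILS: dict[str, dict[str, Any]] = {
    "service": {
        "type": "enum",
        "options": ["haircut", "massage"],
        "help": "Which service to book.",
    },
    "date": {
        "type": "date",
        "format": "YYYY-MM-DD",
        "help": "Preferred appointment day.",
    },
}


def compact_schema() -> str:
    """One short line per field; only this summary is placed in the prompt up front."""
    return "\n".join(f"{name}: {detail['type']}" for name, detail in FIELD_DETAILS.items())


def get_field_info(name: str) -> dict[str, Any]:
    """Tool handler: return the full detail for a single field, only when the model asks."""
    if name not in FIELD_DETAILS:
        return {"error": f"unknown field: {name}"}
    return FIELD_DETAILS[name]


print(compact_schema())
print(get_field_info("service"))
```

The design choice illustrated here is that options, formats, and help text stay out of the context until a turn actually needs them.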

Explicit vs. implicit confirmation

Supports the confirmation decision — choosing confirmation style by stakes; verified (explicit) vs. express (implicit) profiles.

Submit/reset metaphor in conversational interfaces

Supports making commit a deterministic, in-turn event rather than an LLM "press submit" decision or a timeout job.

  • "Bridging UI Design and Chatbot Interactions: Applying Form-Based Principles to Conversational Agents" (2025). arXiv 2507.01862. Preprint. Notes that GUIs give the backend explicit Submit/Reset signals, while chat lacks them, so context can shift ambiguously when the user changes subject without a clear prompt.
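
One way to read "deterministic, in-turn commit" is sketched below. It is a hedged illustration, not the design of record: `FormSession`, `apply_turn`, and the rule "commit once every required field has a value" are assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FormSession:
    """Hypothetical per-conversation form state."""
    required: list[str]
    values: dict[str, str] = field(default_factory=dict)
    committed: bool = False


def apply_turn(session: FormSession, extracted: dict[str, str]) -> list[str]:
    """Merge the values extracted this turn, then decide commit in plain code.

    The model only supplies `extracted`; it never chooses when to submit,
    and no background timeout job is involved.
    """
    session.values.update(extracted)
    missing = [name for name in session.required if name not in session.values]
    if not missing:
        # Assumed rule: the form commits in the same turn that completes it.
        session.committed = True
    return missing


session = FormSession(required=["service", "date"])
first = apply_turn(session, {"service": "haircut"})
second = apply_turn(session, {"date": "2025-01-15"})
print(first, second, session.committed)
```

Because the commit check runs as ordinary code at the end of each turn, the same inputs always produce the same commit decision.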

Playbooks, generative vs. deterministic, hybrid agents

Supports the "named playbook/profile as unit of behavior" decision and the declarative/deterministic/generative layering.

Rasa CALM — flows, process-calling, "LLM interprets, logic decides"

Supports the three-layer separation (declarative / deterministic / generative) and reusable repair patterns.

Constraint decay & declarative vs. imperative agent layers

Supports the "named profiles, not boolean flags" decision — avoiding combinatorial config fragility.

  • "Constraint decay: The Fragility of LLM Agents in Backend Code Generation." arXiv 2605.06445. Preprint. As the density of non-functional constraints rises, agent performance declines; this is the named fragility behind config/flag accumulation.
  • "Towards a Declarative Agentic Layer for Intelligent Agents in MCP-Based Server Ecosystems." arXiv 2601.17435. Preprint. Contrasts declarative behavior specs with fragile imperative workflows; reports large dev-time reductions and deployment-velocity gains, and constrains behavior to a verifiable operational space.
  • "A Declarative Language for Building And Orchestrating LLM-Powered Agent Workflows." arXiv 2512.19769. Preprint. DSL for agent workflows expressed in far fewer lines than imperative code.
  • "Formally Specifying the High-Level Behavior of LLM-Based Agents." arXiv 2310.08535. Preprint. High-level declarative specification of agent behavior, decoupled from enforcement.
  • "LLM Agents as Catalysts for Resilient DFT: An Orchestration-Based Framework Beyond Brittle Scripts." Applied Sciences (MDPI). Peer-reviewed. Orchestration-based agent framework as an alternative to brittle imperative scripts.

Internal design records

The first-party discussion that produced these decisions lives in the booker4j repo under .cursor/docs/:

| Document | Covers |
| --- | --- |
| tool-call-for-form-filling.md | Master design doc; flat-schema rationale (line 294), schema vs. state split, getFieldInfo lazy loading, system-prompt layering |
| form-yml-documentation/phase3-traversal-implementation-discussion.md | Stack-based traversal, exclusive branch selection, predictable progress |
| form-yml-documentation/n-level-nesting.md | Memory/perf characteristics of stack-based state |
| implementation-before-form-traversal.md | Redis-serializable state model constraint |
| llm-flow-response-plan.md | LLM-generated replies vs. hardcoded templates |
| rules-plan/prompt-writing-rules.md | Translation-first, localization rules |