Perstack 0.0.1 is released 🎉

Expert Stack

Expert Stack is the architecture that brings order to the chaos of AI app development. It transforms unpredictable prompt chains into declarative, deterministic, composable components, in the same way modern software development composes well-defined modules.

Overview

Expert Stack has three main layers:

  • Expert Definition: Declarative expert configuration, published to the registry
  • Runtime: Orchestrates expert execution, managing LLM calls, MCP servers, and access to the workspace
  • Run State: The persisted state of a run, making it reproducible and observable

Expert Definition

All experts are described in natural language and defined in a serializable schema.

There are two ways to define and retrieve an expert's definition:

  • From perstack.toml
  • From Registry

An expert definition specifies:

  • Version: The static version of the expert.
  • Instruction: Tells the expert what to do and how to do it.
  • Skills: Grants capabilities to the expert, with specific rules and best practices.
  • Delegates: Grants the ability to delegate to other experts from the registry.

Example Expert Definition in perstack.toml

```toml
[experts."market-analyzer"]
version = "1.0.0"
description = "Analyzes market trends and competitive landscapes"
instruction = """
You are a market analysis expert. Your role is to:
1. Gather data from multiple sources
2. Identify trends and patterns
3. Provide actionable insights
4. Support conclusions with data
5. Write a report in the workspace
"""
# Delegate to specialized experts from the registry
delegates = ["@perstack/deep-research"]

# Connect to external APIs via MCP
[experts."market-analyzer".skills."custom-data-api"]
type = "mcpStdioSkill"
command = "npx"
packageName = "custom-data-api-server"
requiredEnv = ["CUSTOM_API_KEY"]
rules = """
- You must retrieve data one-by-one from the custom-data-api skill.
- You must not use the custom-data-api skill more than 3 times.
"""
```

Learn more about the perstack.toml file.
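Because definitions live in a serializable schema, the TOML above maps onto plain data records. A minimal TypeScript sketch of that shape (field names follow the example; the exact Perstack schema may differ):

```typescript
// Illustrative sketch of an expert definition as a serializable record.
// Field names follow the perstack.toml example above, not an official schema.
type SkillDefinition = {
  type: string
  command?: string
  packageName?: string
  requiredEnv?: string[]
  rules?: string
}

type ExpertDefinition = {
  version: string
  description?: string
  instruction: string
  skills?: Record<string, SkillDefinition>
  delegates?: string[]
}

const marketAnalyzer: ExpertDefinition = {
  version: "1.0.0",
  description: "Analyzes market trends and competitive landscapes",
  instruction: "You are a market analysis expert.",
  delegates: ["@perstack/deep-research"],
}

// Because the definition is plain data, it round-trips through JSON,
// which is what makes publishing to a registry straightforward.
const roundTripped = JSON.parse(JSON.stringify(marketAnalyzer)) as ExpertDefinition
console.log(roundTripped.version)
```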

Runtime

The runtime is the core of the Expert Stack. It is responsible for:

  • Managing the reasoning loop.
  • Constructing prompts from the expert definition and the run state.
  • Reasoning with LLMs.
  • Managing skills and calling MCP servers.
  • Delegating to other experts.
  • Accessing the workspace securely.
  • Emitting events for observability.

Reasoning Loop

The runtime executes experts through a series of discrete steps. Each step represents a complete reasoning cycle where the expert:

  1. Analyzes the current context and messages
  2. Decides on an action (tool call, delegation, or completion)
  3. Executes the action and processes the result

The execution state is tracked through Checkpoint.status:

  • init: Initial state before execution begins
  • proceeding: Actively executing steps
  • completed: Successfully finished the task
  • stoppedByInteractiveTool: Paused waiting for user confirmation
  • stoppedByDelegate: Paused while another expert completes a delegated task
  • stoppedByExceededMaxSteps: Halted due to reaching the step limit
  • stoppedByError: Stopped due to an unrecoverable error

Each step follows a deterministic pattern: prompt generation → LLM reasoning → tool execution → result processing. This predictable flow enables reliable execution and easy debugging.
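The step cycle above can be sketched as a loop. This is a simplified model, not the runtime's implementation; the `Action` variants and `maxSteps` parameter are assumptions for illustration:

```typescript
// Simplified model of the reasoning loop: each iteration is one step.
type Action =
  | { kind: "toolCall"; tool: string }
  | { kind: "delegate"; expert: string }
  | { kind: "complete"; answer: string }

type Status = "proceeding" | "completed" | "stoppedByExceededMaxSteps"

function runLoop(decide: (step: number) => Action, maxSteps: number): Status {
  for (let step = 1; step <= maxSteps; step++) {
    const action = decide(step) // 1. analyze context, 2. decide on an action
    if (action.kind === "complete") return "completed"
    // 3. execute the tool call or delegation and feed the result
    //    into the next step's context
  }
  return "stoppedByExceededMaxSteps"
}

// An expert that calls a tool twice and then finishes:
const status = runLoop(
  (step) =>
    step < 3 ? { kind: "toolCall", tool: "search" } : { kind: "complete", answer: "done" },
  10,
)
console.log(status) // "completed"
```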

Skills

Skills provide experts with capabilities to interact with external systems and perform actions. There are three types of skills:

MCP Skills

  • Connect to Model Context Protocol (MCP) servers
  • Access external APIs, databases, and services
  • Executed through standardized tool interfaces

Interactive Skills

  • Require user confirmation before execution
  • Used for sensitive operations like file modifications or deployments
  • Pause execution until user approval is received

Delegate Skills

  • Enable experts to delegate tasks to other specialized experts
  • Support hierarchical task decomposition
  • Allow complex workflows through expert composition

Each skill can define rules that constrain its usage, ensuring safe and predictable behavior. The runtime manages skill lifecycle, connection pooling, and error handling automatically.
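One way to picture the three skill types is as a discriminated union. This is a sketch: only `mcpStdioSkill` appears in the configuration example above, and the other type names here are illustrative, not Perstack's actual identifiers:

```typescript
// Sketch of the three skill categories as a discriminated union.
// Only "mcpStdioSkill" is confirmed by the perstack.toml example above;
// the other two type names are assumed for illustration.
type Skill =
  | { type: "mcpStdioSkill"; command: string; packageName: string; rules?: string }
  | { type: "interactiveSkill"; prompt: string; rules?: string }
  | { type: "delegateSkill"; expertKey: string; rules?: string }

// Interactive skills pause the run until the user approves;
// the other kinds execute directly.
function requiresConfirmation(skill: Skill): boolean {
  return skill.type === "interactiveSkill"
}

const dataApi: Skill = {
  type: "mcpStdioSkill",
  command: "npx",
  packageName: "custom-data-api-server",
}
console.log(requiresConfirmation(dataApi)) // false
```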

Hierarchical Delegation

Perstack experts can delegate tasks to other experts. This is how you build complex workflows through composition.

In perstack.toml, you can define the delegates for an expert.

```toml
[experts."market-analyzer"]
delegates = ["@perstack/deep-research"]
# @perstack/deep-research delegates to @perstack/website-analyzer
```

The expert then delegates tasks to its delegates and aggregates their results.

Workspace

The workspace provides a secure, isolated environment for expert execution with two key properties:

Isolation

  • Experts can only access files within their designated workspace directory
  • File system permissions are strictly enforced
  • Prevents unauthorized access to system files or other workspaces
  • Each run operates in its own sandboxed environment
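Enforcing this kind of isolation typically means resolving every requested path against the workspace root and rejecting anything that escapes it. A sketch of that check (not Perstack's actual implementation):

```typescript
import * as path from "node:path"

// Sketch: allow a file access only if the resolved target stays
// inside the workspace root after normalization.
function isInsideWorkspace(workspaceRoot: string, requested: string): boolean {
  const root = path.resolve(workspaceRoot)
  const target = path.resolve(root, requested)
  return target === root || target.startsWith(root + path.sep)
}

console.log(isInsideWorkspace("/tmp/ws", "perstack/runs/run-1.json")) // true
console.log(isInsideWorkspace("/tmp/ws", "../etc/passwd"))            // false
```

Resolving before comparing is what defeats `..` traversal: the relative path is collapsed to an absolute one before the prefix check.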

Persistence

  • Run state is automatically saved to [workspace]/perstack/runs/
  • Serves as shared knowledge storage between experts
  • Enables delegated experts to operate independently without sharing context windows
  • Supports long-running workflows with durable state

This design allows experts to collaborate through the file system while maintaining security boundaries. Delegated experts can access shared workspace files to understand context without requiring the full conversation history, reducing token usage and improving performance.

Observability

The runtime provides comprehensive observability through an event-driven architecture:

  • Event Emission: Every state transition and action generates detailed events
  • Custom Listeners: Implement custom event handlers for monitoring and integration
  • Default Storage: Events are automatically saved to the workspace for debugging

Event listeners enable real-time monitoring, custom logging, and integration with external observability platforms. The event stream provides a complete audit trail of expert execution.
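A custom listener is just a function that receives each event. For example, collecting events in memory before forwarding them to an external platform (a sketch; the `RunEvent` shape here is simplified):

```typescript
// Sketch of a custom event listener that collects events in memory.
// A real listener might forward each event to a logging or tracing backend.
// This RunEvent shape is simplified for illustration.
type RunEvent = { type: string; timestamp: number }

function makeCollector() {
  const events: RunEvent[] = []
  const listener = (event: RunEvent) => {
    events.push(event)
  }
  return { events, listener }
}

const { events, listener } = makeCollector()
listener({ type: "startRun", timestamp: Date.now() })
listener({ type: "startGeneration", timestamp: Date.now() })
console.log(events.map((e) => e.type)) // ["startRun", "startGeneration"]
```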

Error Recovery

The runtime includes robust error handling mechanisms:

  • Configurable Retry Policies: Set maxRetries to automatically retry failed operations
  • Graceful Degradation: Partial failures (like individual tool errors) don’t stop execution
  • Checkpoint-based Recovery: Resume from any saved checkpoint after failures
  • Error Status Tracking: Failed runs preserve their state with stoppedByError status

When errors occur, the runtime captures the full context in checkpoints, allowing you to debug, fix issues, and resume execution without losing progress. Tool-level errors are handled gracefully, with error messages returned as tool results rather than crashing the entire run.
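A `maxRetries` policy can be modelled as a wrapper around a fallible operation. This is a sketch of the control flow only; Perstack's actual retry behaviour, backoff, and error classification may differ, and real tool calls would be async:

```typescript
// Sketch: retry a fallible operation up to maxRetries extra attempts.
// Real tool calls are async; the control flow is the same with await.
function withRetries<T>(op: () => T, maxRetries: number): T {
  let lastError: unknown
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return op()
    } catch (err) {
      lastError = err // capture and try again
    }
  }
  // Out of retries: this is the kind of unrecoverable failure that
  // leaves a run in the stoppedByError status.
  throw lastError
}

// An operation that fails twice, then succeeds on the third attempt:
let calls = 0
const result = withRetries(() => {
  calls++
  if (calls < 3) throw new Error("transient failure")
  return "ok"
}, 3)
console.log(result, calls) // "ok" 3
```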

Run State

Perstack uses a checkpoint-based approach to manage the state of the run.

The run state consists of:

  • Event: A record of an action taken by the runtime.
  • Step: A logical unit of work within the agent loop.
  • Checkpoint: A snapshot of the run state at the end of a step. It is used to resume or branch the run from a specific point in time.

All events and checkpoints are stored in the workspace ([workspace]/perstack/runs) by default. You can also store them in any other storage, such as a database, by implementing the EventListener interface.

```typescript
import { run, type RunEvent } from "@perstack/runtime"

const result = await run({
  setting,
  eventListener: (event: RunEvent) => {
    console.log(event)
  },
})
```

Events

Events are discrete occurrences within a step that drive execution forward. All transitions between execution phases are driven by events.

Key examples of events include:

  • startRun: Initializes the run.
  • startGeneration: Begins a step.
  • callTool, callDelegate: Triggers external execution.
  • resolveToolResult: Captures external response.
  • finishToolCall: Completes the current step.
  • continueToNextStep: Moves to the next step.

Events are emitted and subscribed to via the runtime’s event system (RunEventEmitter). They provide fine-grained observability into what happened during execution, and when.

While a step is a segment of execution, events are the points plotted along that segment, recording exactly what occurred and why.
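The subscribe/emit pattern behind this can be sketched with a minimal emitter. The `RunEventEmitter` API itself is not shown in this document, so this illustrates only the pattern, using the event names listed above:

```typescript
// Minimal sketch of the subscribe/emit pattern, in the spirit of
// RunEventEmitter. The real emitter's API is not documented here.
type EventName =
  | "startRun"
  | "startGeneration"
  | "callTool"
  | "resolveToolResult"
  | "finishToolCall"
  | "continueToNextStep"

class TinyEmitter {
  private listeners: Array<(name: EventName) => void> = []
  on(fn: (name: EventName) => void) {
    this.listeners.push(fn)
  }
  emit(name: EventName) {
    for (const fn of this.listeners) fn(name)
  }
}

const emitter = new TinyEmitter()
const seen: EventName[] = []
emitter.on((name) => seen.push(name))

// One tool-calling step, phase by phase:
const phases = [
  "startGeneration",
  "callTool",
  "resolveToolResult",
  "finishToolCall",
  "continueToNextStep",
] as const
for (const name of phases) emitter.emit(name)
console.log(seen.length) // 5
```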

Steps

Steps represent a logical unit of action within the agent loop. Each step involves a full reasoning cycle, which may include:

  • Prompting the LLM for a decision or output
  • Calling an external tool via MCP
  • Delegating to another Expert and handling its result

Steps are assigned a stepNumber and record:

```typescript
type Step = {
  stepNumber: number;
  newMessages: Message[];
  toolCall?: ToolCall;
  toolResult?: ToolResult;
  usage: Usage;
  startedAt: number;
  finishedAt: number;
}
```

Steps are sequential and isolated. The runtime executes them one after another, updating state and usage metrics along the way. Conceptually, the expert loop can be visualized as a straight line, with each step being a distinct segment.
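Sequential execution with accumulated usage can be sketched like this (the `Usage` fields are an assumption; the real type is not spelled out above):

```typescript
// Sketch: execute steps one after another, accumulating usage metrics.
// The Usage fields here are assumed for illustration.
type Usage = { inputTokens: number; outputTokens: number }

type StepResult = { stepNumber: number; usage: Usage }

function executeSteps(perStepUsage: Usage[]): { steps: StepResult[]; total: Usage } {
  const total: Usage = { inputTokens: 0, outputTokens: 0 }
  const steps: StepResult[] = []
  perStepUsage.forEach((usage, i) => {
    steps.push({ stepNumber: i + 1, usage }) // steps are strictly sequential
    total.inputTokens += usage.inputTokens
    total.outputTokens += usage.outputTokens
  })
  return { steps, total }
}

const { steps, total } = executeSteps([
  { inputTokens: 120, outputTokens: 40 },
  { inputTokens: 200, outputTokens: 55 },
])
console.log(steps.length, total.inputTokens, total.outputTokens) // 2 320 95
```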

Checkpoints

Checkpoints are durable, structured snapshots of an Expert’s execution state taken at the end of each step. They serve as the sole mechanism for:

  • Resuming a run
  • Forking from a previous state
  • Persisting historical execution context

Checkpoint data includes:

```typescript
type Checkpoint = {
  id: string;
  runId: string;
  expert: { key, name, version };
  stepNumber: number;
  messages: Message[];
  usage: Usage;
  status: "init" | "proceeding" | "completed" | ...;
  delegatedBy?: { ... };
}
```

Because every checkpoint is adjacent to the next step’s beginning, the expert loop can be precisely resumed or branched from any previous point. This enables:

```shell
# Resume from latest
npx perstack run <expertKey> <query> --continue-run <runId>

# Resume from specific checkpoint
npx perstack run <expertKey> <query> --continue-run <runId> --resume-from <checkpointId>
```

Checkpoints are stored as lightweight JSON objects. They are serialized on disk (or in external stores) and can be used to reconstruct full state and context for inspection or reruns.
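Because checkpoints are plain JSON, reconstructing state is a matter of parsing the stored snapshot. A sketch with a simplified checkpoint shape (field values are hypothetical):

```typescript
// Sketch: a checkpoint round-trips through JSON, so run state can be
// reconstructed from the serialized snapshot. Shape and values are
// simplified for illustration.
type Checkpoint = {
  id: string
  runId: string
  stepNumber: number
  status: "init" | "proceeding" | "completed"
}

const checkpoint: Checkpoint = {
  id: "ckpt-42",
  runId: "run-7",
  stepNumber: 3,
  status: "proceeding",
}

const serialized = JSON.stringify(checkpoint)
const restored = JSON.parse(serialized) as Checkpoint

// Resuming continues from the step after the snapshot was taken.
const resumeFrom = restored.stepNumber + 1
console.log(restored.id, resumeFrom) // "ckpt-42" 4
```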