Robert Helewka 8a5046fef0 feat(pallas): stream mid-turn assistant chunks over MCP
Add `AssistantChunkEmitter` that hooks into fast-agent's
`ToolRunnerHooks.after_llm_call` to emit one
`notifications/message` per LLM iteration, carrying structured
content blocks as JSON via the existing StreamableHTTP transport.

This exposes intermediate assistant messages (substantive replies
produced before tool calls) that would otherwise be hidden inside
fast-agent's message_history and never cross the MCP boundary,
letting Daedalus update its live bubble during multi-iteration
tool loops instead of only seeing the final wrap-up text.
2026-05-28 06:09:03 -04:00

Pallas — FastAgent MCP Bridge

Pallas is the generic runtime that turns fast-agent agent definitions into StreamableHTTP MCP servers.

It is completely deployment-agnostic: all environment-specific values (agent names, ports, hosts, model) live in the calling project's agents.yaml and fastagent.config.yaml.


Installation

pip install git+ssh://git@git.helu.ca:22022/r/pallas.git

Or as a project dependency in pyproject.toml:

dependencies = [
    "pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git",
]

Usage

Pallas reads configuration from the working directory at runtime.

my-project/
├── agents/
│   ├── __init__.py
│   └── jarvis.py          # FastAgent definitions
├── agents.yaml            # Deployment topology
├── fastagent.config.yaml  # FastAgent + model config
└── fastagent.secrets.yaml # API keys (gitignored)

Run from your project root:

pallas                     # start all agents + registry
pallas --agent jarvis      # start a single agent

Or via python -m:

python -m pallas.server

agents.yaml format

name: my-project           # used in log prefixes and registry names
version: "1.0.0"
host: my-host.example.com  # hostname for registry URLs
namespace: com.example.my-project
registry_port: 8200

agents:
  jarvis:
    module: agents.jarvis  # importable Python module path
    port: 8201
    title: Jarvis
    description: "My assistant agent"
    depends_on: [research]  # optional: start these first

  research:
    module: agents.research
    port: 8250
    title: Research Agent
    description: "Web search and knowledge graph"
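
The depends_on field implies a dependency-ordered startup. The sketch below shows one way such an order could be derived with the standard library's graphlib. It is an illustration only, not Pallas's actual scheduler, and the start_order helper is a hypothetical name.

```python
from graphlib import TopologicalSorter


def start_order(agents: dict) -> list:
    """Return agent names so that every dependency precedes its dependents.

    `agents` mirrors the `agents:` mapping from agents.yaml.
    Hypothetical helper; Pallas's real startup logic may differ.
    """
    # Map each agent to the set of agents it must wait for.
    graph = {name: set(cfg.get("depends_on", [])) for name, cfg in agents.items()}

    # Reject references to agents that are not defined at all.
    unknown = {dep for deps in graph.values() for dep in deps} - set(graph)
    if unknown:
        raise ValueError(f"depends_on references undefined agents: {sorted(unknown)}")

    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


agents = {
    "jarvis": {"module": "agents.jarvis", "port": 8201, "depends_on": ["research"]},
    "research": {"module": "agents.research", "port": 8250},
}
print(start_order(agents))  # ['research', 'jarvis']
```

Using a topological sort rather than a hand-rolled loop gives cycle detection for free, which is useful when several agents depend on one another.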

Loop safeguards

Three optional fields bound how long an agent's tool-call loop can run:

Field              Type   Default  Purpose
max_iterations     int    15       Maximum tool calls in a single agent turn
streaming_timeout  float  120      Max idle seconds between streaming events
turn_timeout       float  300      Hard wall-clock limit for a full turn (seconds)

Agents that omit a field use the default shown above.

agents:
  research:
    module: agents.research
    port: 8250
    max_iterations: 10      # this agent only needs a few search calls
    streaming_timeout: 60   # fail fast on a slow search MCP
    turn_timeout: 120       # research turns should not take more than 2 min
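
To make the interaction of the three limits concrete, here is a minimal asyncio sketch. It assumes streaming events arrive as an async iterator and that a tool call shows up as the string "tool_call". Both are simplifications for illustration, and LoopLimits, IterationLimitExceeded, and consume_turn are hypothetical names rather than Pallas APIs.

```python
import asyncio
from dataclasses import dataclass


class IterationLimitExceeded(RuntimeError):
    """Raised when a turn makes more tool calls than max_iterations allows."""


@dataclass(frozen=True)
class LoopLimits:
    # Defaults taken from the table above.
    max_iterations: int = 15
    streaming_timeout: float = 120.0
    turn_timeout: float = 300.0

    @classmethod
    def from_agent_config(cls, cfg: dict) -> "LoopLimits":
        """Pick the three optional fields out of one agents.yaml entry."""
        keys = ("max_iterations", "streaming_timeout", "turn_timeout")
        return cls(**{k: cfg[k] for k in keys if k in cfg})


async def consume_turn(events, limits: LoopLimits) -> list:
    """Drain one turn's event stream while enforcing all three limits."""

    async def _drain() -> list:
        seen, tool_calls = [], 0
        iterator = events.__aiter__()
        while True:
            try:
                # streaming_timeout bounds the idle gap before each event.
                event = await asyncio.wait_for(
                    iterator.__anext__(), limits.streaming_timeout
                )
            except StopAsyncIteration:
                return seen
            seen.append(event)
            if event == "tool_call":
                tool_calls += 1
                if tool_calls > limits.max_iterations:
                    raise IterationLimitExceeded(
                        f"more than {limits.max_iterations} tool calls in one turn"
                    )

    # turn_timeout bounds the whole turn regardless of per-event progress.
    return await asyncio.wait_for(_drain(), limits.turn_timeout)
```

In this sketch a stalled stream surfaces as asyncio.TimeoutError from the inner wait, while a stream that keeps producing events but never finishes is cut off by the outer wait.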

fastagent.config.yaml extensions

Pallas reads two extra keys beyond the standard fast-agent config:

default_model: openai.my-custom-model-name

# Explicit capability declarations — avoids brittle name-regex heuristics
model_capabilities:
  vision: false
  context_window: 200000
  max_output_tokens: 32000

Capabilities are published in the registry and used to register unknown models with fast-agent's ModelDatabase.
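
As an illustration of what an explicit declaration buys over name-based guessing, the sketch below parses the model_capabilities block into a typed record and rejects malformed values early. ModelCapabilities and parse_capabilities are hypothetical names. Whether Pallas treats every field as required is an assumption, and the fast-agent ModelDatabase registration step is deliberately left out because its API is not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCapabilities:
    vision: bool
    context_window: int
    max_output_tokens: int


def parse_capabilities(config: dict) -> ModelCapabilities | None:
    """Read `model_capabilities` from a loaded fastagent.config.yaml mapping.

    Returns None when the block is absent. Assumption: all three fields
    are required once the block is present.
    """
    raw = config.get("model_capabilities")
    if raw is None:
        return None

    vision = raw["vision"]
    if not isinstance(vision, bool):
        raise ValueError("model_capabilities.vision must be true or false")

    sizes = {}
    for key in ("context_window", "max_output_tokens"):
        value = raw[key]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"model_capabilities.{key} must be a positive integer")
        sizes[key] = value

    return ModelCapabilities(vision=vision, **sizes)


config = {
    "default_model": "openai.my-custom-model-name",
    "model_capabilities": {
        "vision": False,
        "context_window": 200000,
        "max_output_tokens": 32000,
    },
}
print(parse_capabilities(config))
```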

AWS Bedrock Mantle — automatic shims

When anthropic.base_url points at a Bedrock Mantle endpoint (https://bedrock-mantle.{region}.api.aws/anthropic), Pallas auto-detects it at startup and installs two compatibility shims via pallas.mantle_shims. No config flag is required.
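
Detection of this kind can be done purely from the URL. The following sketch matches the hostname pattern quoted above and extracts the region. It is an assumption about how such a check might look, not the code in pallas.mantle_shims, and mantle_region is a hypothetical helper.

```python
from __future__ import annotations

import re
from urllib.parse import urlparse

# Hostname shape taken from the endpoint pattern above:
# bedrock-mantle.{region}.api.aws
_MANTLE_HOST = re.compile(r"^bedrock-mantle\.(?P<region>[a-z0-9-]+)\.api\.aws$")


def mantle_region(base_url: str | None) -> str | None:
    """Return the AWS region if base_url targets Bedrock Mantle, else None.

    Only the hostname is inspected; the path is ignored in this sketch.
    """
    if not base_url:
        return None
    host = urlparse(base_url).hostname or ""
    match = _MANTLE_HOST.match(host)
    return match["region"] if match else None


print(mantle_region("https://bedrock-mantle.us-east-1.api.aws/anthropic"))  # us-east-1
print(mantle_region("https://api.anthropic.com"))  # None
```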

Shim 1 — wire-name prefix. Mantle requires the full anthropic.<name> wire id (e.g. anthropic.claude-opus-4-7). Fast-agent's model-spec parser would otherwise strip the anthropic. prefix, causing a misleading 404 "The model '...' does not exist". The shim registers the prefixed forms in ModelDatabase._PROVIDER_WIRE_MODEL_NAMES.
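
The naming rule itself is small enough to sketch. The helper below only shows the idempotent prefixing that Mantle expects. It does not touch fast-agent's ModelDatabase, and mantle_wire_name is a hypothetical name.

```python
MANTLE_PREFIX = "anthropic."


def mantle_wire_name(model: str) -> str:
    """Return the wire id Mantle expects, adding the prefix only if missing."""
    return model if model.startswith(MANTLE_PREFIX) else MANTLE_PREFIX + model


# A lookup table of short name -> wire id, similar in spirit to what a
# wire-name registry would hold (illustrative data only).
wire_names = {name: mantle_wire_name(name) for name in ["claude-opus-4-7"]}
print(wire_names)  # {'claude-opus-4-7': 'anthropic.claude-opus-4-7'}
```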

Shim 2 — strip caller: null from replayed tool_use blocks. Anthropic SDK 0.100.x leaks caller: null onto serialised BetaToolUseBlock params (upstream issue #1454). api.anthropic.com silently tolerates the extra field; Mantle rejects it with tool_use.caller: Input should be a valid dictionary or object, which breaks the MCP tool-use loop on the second turn. The shim monkeypatches AnthropicConverter._deserialize_assistant_raw_blocks and _append_server_tool_channel_blocks to pop the field before history is re-sent.
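
The core of that clean-up can be illustrated on plain dictionaries. This sketch assumes history blocks have already been serialised to dicts. The real shim patches fast-agent converter methods instead, and strip_null_caller is a hypothetical helper.

```python
def strip_null_caller(blocks: list) -> list:
    """Return a copy of `blocks` with `caller: null` removed from tool_use blocks.

    Blocks are plain dicts here (an assumption for illustration). A caller
    field that carries a real value is left untouched.
    """
    cleaned = []
    for block in blocks:
        if (
            block.get("type") == "tool_use"
            and "caller" in block
            and block["caller"] is None
        ):
            # Build a new dict so the caller's history is not mutated.
            block = {k: v for k, v in block.items() if k != "caller"}
        cleaned.append(block)
    return cleaned


history = [
    {"type": "text", "text": "Let me look that up."},
    {"type": "tool_use", "id": "toolu_1", "name": "search", "input": {}, "caller": None},
]
print(strip_null_caller(history)[1])
# {'type': 'tool_use', 'id': 'toolu_1', 'name': 'search', 'input': {}}
```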

See docs/bedrock.md for the full configuration walkthrough.


Environment variable

Variable              Default      Purpose
PALLAS_AGENTS_CONFIG  agents.yaml  Override path to deployment config
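
A lookup with this behaviour could be written as follows. This is a sketch, not Pallas's own loader: agents_config_path is a hypothetical name, and treating an empty value as unset is an assumption.

```python
import os
from pathlib import Path


def agents_config_path(environ=None) -> Path:
    """Resolve the deployment config path.

    Uses PALLAS_AGENTS_CONFIG when set and non-empty, otherwise falls back
    to agents.yaml. A relative path is interpreted against the current
    working directory when the file is opened.
    """
    env = os.environ if environ is None else environ
    return Path(env.get("PALLAS_AGENTS_CONFIG") or "agents.yaml")


print(agents_config_path({}))  # agents.yaml
print(agents_config_path({"PALLAS_AGENTS_CONFIG": "deploy/staging.yaml"}))
```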

What Pallas provides

Module                    Purpose
pallas.server             CLI entry point and agent orchestration
pallas.registry           GET /.well-known/mcp/server.json registry server
pallas.multimodal_server  MultimodalAgentMCPServer: AgentMCPServer subclass with image + history support
pallas.health             LLM preflight validation + get_health MCP tool
pallas._fastagent_patch   Traceback-capture wrappers around three opaque fast-agent catch-sites (debug-only)
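
For a quick check that the registry is up, a client can fetch the well-known document. The sketch below builds the URL from the host and registry_port values in agents.yaml. The plain-http scheme and the helper names are assumptions, and the shape of the returned JSON is not specified here.

```python
import json
from urllib.request import urlopen


def registry_url(host: str, port: int, scheme: str = "http") -> str:
    """Build the registry URL from agents.yaml's host and registry_port."""
    return f"{scheme}://{host}:{port}/.well-known/mcp/server.json"


def fetch_registry(host: str, port: int, timeout: float = 5.0):
    """Fetch and decode the registry document (requires a running registry)."""
    with urlopen(registry_url(host, port), timeout=timeout) as response:
        return json.load(response)


print(registry_url("my-host.example.com", 8200))
# http://my-host.example.com:8200/.well-known/mcp/server.json
```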

Authentication

Pallas is transparent to downstream authentication. Whatever the operator places under each downstream MCP server's headers: block in fastagent.config.yaml (typically loaded from fastagent.secrets.yaml) is what fast-agent sends — Pallas does not intercept, rewrite, or forward the inbound Authorization header of the MCP request that triggered the agent turn.

For agents that talk to Mnemosyne, the convention is a long-lived team JWT minted from Mnemosyne's admin UI and pasted into the agent project's fastagent.secrets.yaml:

mcp:
  servers:
    mnemosyne:
      transport: http
      url: https://mnemosyne.example.com/mcp/
      headers:
        Authorization: "Bearer eyJ…team-jwt…"

See mnemosyne/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md for the three credential types Mnemosyne recognises, how team JWTs are minted and rotated, and the data model that ties a team to a set of libraries.

Earlier versions of Pallas shipped a forward_inbound_auth: true mechanism that captured the per-turn Authorization header and propagated it to opted-in downstream servers. That mechanism has been retired — opt-in flags in old fastagent.config.yaml files are now silently ignored and can be removed at your convenience.
