r/pallas

Robert Helewka 8a5046fef0 feat(pallas): stream mid-turn assistant chunks over MCP

Add `AssistantChunkEmitter` that hooks into fast-agent's
`ToolRunnerHooks.after_llm_call` to emit one
`notifications/message` per LLM iteration, carrying structured
content blocks as JSON via the existing StreamableHTTP transport.

This exposes intermediate assistant messages (substantive replies
produced before tool calls) that would otherwise be hidden inside
fast-agent's message_history and never cross the MCP boundary,
letting Daedalus update its live bubble during multi-iteration
tool loops instead of only seeing the final wrap-up text.

2026-05-28 06:09:03 -04:00

docs

docs(pallas): document sampling parameters and Prometheus metrics

2026-05-23 07:49:21 -04:00

pallas

feat(pallas): stream mid-turn assistant chunks over MCP

2026-05-28 06:09:03 -04:00

tests

docs(pallas): expand LLM preflight docs and refactor health probes

2026-05-12 15:04:57 -04:00

.gitignore

Initial commit: pallas package extracted from mentor

2026-04-02 12:41:53 +00:00

pyproject.toml

feat: add per-agent loop safeguards for tool-call turns

2026-05-27 05:41:08 -04:00

README.md

feat: add per-agent loop safeguards for tool-call turns

2026-05-27 05:41:08 -04:00

uv.lock

docs(pallas): expand LLM preflight docs and refactor health probes

2026-05-12 15:04:57 -04:00

README.md

Pallas — FastAgent MCP Bridge

Pallas is the generic runtime that turns fast-agent agent definitions into StreamableHTTP MCP servers.

It is completely deployment-agnostic: all environment-specific values (agent names, ports, hosts, model) live in the calling project's agents.yaml and fastagent.config.yaml.

Installation

pip install git+ssh://git@git.helu.ca:22022/r/pallas.git

Or as a project dependency in pyproject.toml:

dependencies = [
    "pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git",
]

Usage

Pallas reads configuration from the working directory at runtime.

my-project/
├── agents/
│   ├── __init__.py
│   └── jarvis.py          # FastAgent definitions
├── agents.yaml            # Deployment topology
├── fastagent.config.yaml  # FastAgent + model config
└── fastagent.secrets.yaml # API keys (gitignored)

Run from your project root:

pallas                     # start all agents + registry
pallas --agent jarvis      # start a single agent

Or via python -m:

python -m pallas.server

`agents.yaml` format

name: my-project           # used in log prefixes and registry names
version: "1.0.0"
host: my-host.example.com  # hostname for registry URLs
namespace: com.example.my-project
registry_port: 8200

agents:
  jarvis:
    module: agents.jarvis  # importable Python module path
    port: 8201
    title: Jarvis
    description: "My assistant agent"
    depends_on: [research]  # optional: start these first

  research:
    module: agents.research
    port: 8250
    title: Research Agent
    description: "Web search and knowledge graph"
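
The `depends_on` field implies a start order. The following is a minimal sketch of how such an order could be derived, assuming the `agents` mapping has already been parsed into a plain dict. The `start_order` helper and the `AGENTS` literal are hypothetical illustrations, not part of Pallas, which may resolve dependencies differently.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical stand-in for the parsed `agents` mapping from agents.yaml.
AGENTS = {
    "jarvis": {"module": "agents.jarvis", "port": 8201, "depends_on": ["research"]},
    "research": {"module": "agents.research", "port": 8250},
}


def start_order(agents: dict) -> list[str]:
    """Return agent names so that every dependency precedes its dependents."""
    graph = {}
    for name, spec in agents.items():
        deps = spec.get("depends_on", [])
        unknown = [d for d in deps if d not in agents]
        if unknown:
            raise ValueError(f"{name} depends on undefined agents: {unknown}")
        # TopologicalSorter expects a mapping of node -> predecessors.
        graph[name] = set(deps)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"circular depends_on: {exc.args[1]}") from exc


print(start_order(AGENTS))
```

With the example above, `research` comes before `jarvis`. A dependency on an undefined agent or a dependency cycle raises `ValueError` instead of producing a partial order.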

Loop safeguards

Three optional fields bound how long an agent's tool-call loop can run:

Field	Type	Default	Purpose
`max_iterations`	int	15	Maximum tool calls in a single agent turn
`streaming_timeout`	float	120	Max idle seconds between streaming events
`turn_timeout`	float	300	Hard wall-clock limit for a full turn (seconds)

Any field an agent omits falls back to the default shown above.

agents:
  research:
    module: agents.research
    port: 8250
    max_iterations: 10      # this agent only needs a few search calls
    streaming_timeout: 60   # fail fast on a slow search MCP
    turn_timeout: 120       # research turns should not take more than 2 min
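
To make the interaction between the defaults and per-agent overrides concrete, here is a rough sketch under stated assumptions. `LoopLimits`, `from_agent`, and `run_turn` are hypothetical names, and `step` stands in for one tool-call iteration. The sketch enforces only `max_iterations` and `turn_timeout`; it leaves `streaming_timeout` unenforced and does not reflect how Pallas or fast-agent implement these limits internally.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopLimits:
    # Defaults taken from the table above.
    max_iterations: int = 15
    streaming_timeout: float = 120.0
    turn_timeout: float = 300.0

    @classmethod
    def from_agent(cls, spec: dict) -> "LoopLimits":
        """Pick up only the safeguard fields an agent entry actually sets."""
        fields = ("max_iterations", "streaming_timeout", "turn_timeout")
        return cls(**{k: spec[k] for k in fields if k in spec})


async def run_turn(step, limits: LoopLimits) -> int:
    """Await `step()` until it returns True; return the number of iterations used.

    `step` is a hypothetical coroutine function representing one iteration.
    """

    async def loop() -> int:
        for i in range(1, limits.max_iterations + 1):
            if await step():
                return i
        raise RuntimeError(f"exceeded max_iterations={limits.max_iterations}")

    # wait_for raises asyncio.TimeoutError once the wall-clock budget is spent.
    return await asyncio.wait_for(loop(), timeout=limits.turn_timeout)
```

Calling `LoopLimits.from_agent(...)` on the `research` entry above yields limits of 10, 60, and 120, while an entry with none of the fields yields the defaults.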

`fastagent.config.yaml` extensions

Pallas reads two extra keys beyond the standard fast-agent config:

default_model: openai.my-custom-model-name

# Explicit capability declarations — avoids brittle name-regex heuristics
model_capabilities:
  vision: false
  context_window: 200000
  max_output_tokens: 32000

Capabilities are published in the registry and used to register unknown models with fast-agent's ModelDatabase.

AWS Bedrock Mantle — automatic shims

When anthropic.base_url points at a Bedrock Mantle endpoint (https://bedrock-mantle.{region}.api.aws/anthropic), Pallas auto-detects it at startup and installs two compatibility shims via pallas.mantle_shims. No config flag is required.
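
For illustration, a URL check along these lines could drive that auto-detection. This is only a sketch based on the endpoint shape quoted above: `is_mantle_endpoint` is a hypothetical helper, the region character class is an assumption, and the actual logic in `pallas.mantle_shims` may differ.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Host pattern derived from https://bedrock-mantle.{region}.api.aws/anthropic;
# the characters allowed in {region} are an assumption.
_MANTLE_HOST = re.compile(r"^bedrock-mantle\.[a-z0-9-]+\.api\.aws$")


def is_mantle_endpoint(base_url: Optional[str]) -> bool:
    """Return True when base_url looks like a Bedrock Mantle Anthropic endpoint."""
    if not base_url:
        return False
    parsed = urlparse(base_url)
    host = parsed.hostname or ""
    return (
        parsed.scheme == "https"
        and _MANTLE_HOST.match(host) is not None
        and parsed.path.rstrip("/") == "/anthropic"
    )
```

Under these assumptions, `https://bedrock-mantle.us-east-1.api.aws/anthropic` matches, while `https://api.anthropic.com` and an unset `base_url` do not.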

Shim 1 — wire-name prefix. Mantle requires the full anthropic.<name> wire id (e.g. anthropic.claude-opus-4-7). Fast-agent's model-spec parser would otherwise strip the anthropic. prefix, causing a misleading 404 "The model '...' does not exist". The shim registers the prefixed forms in ModelDatabase._PROVIDER_WIRE_MODEL_NAMES.

Shim 2 — strip caller: null from replayed tool_use blocks. Anthropic SDK 0.100.x leaks caller: null onto serialised BetaToolUseBlock params (upstream issue #1454). api.anthropic.com silently tolerates the extra field; Mantle rejects it with tool_use.caller: Input should be a valid dictionary or object, which breaks the MCP tool-use loop on the second turn. The shim monkeypatches AnthropicConverter._deserialize_assistant_raw_blocks and _append_server_tool_channel_blocks to pop the field before history is re-sent.
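
The effect of Shim 2 can be sketched on plain dicts. The real shim monkeypatches the `AnthropicConverter` methods named above; `strip_null_caller` below is a hypothetical stand-alone helper, and the block shape (a `type` key plus an optional `caller` key) is an assumption made for the example.

```python
def strip_null_caller(blocks: list[dict]) -> list[dict]:
    """Return copies of `blocks` with `caller: None` dropped from tool_use blocks."""
    cleaned = []
    for block in blocks:
        is_tool_use = block.get("type") == "tool_use"
        has_null_caller = "caller" in block and block["caller"] is None
        if is_tool_use and has_null_caller:
            # Copy instead of mutating so the caller's history stays untouched.
            block = {k: v for k, v in block.items() if k != "caller"}
        cleaned.append(block)
    return cleaned


history = [
    {"type": "text", "text": "Looking that up."},
    {"type": "tool_use", "id": "toolu_1", "name": "search", "input": {}, "caller": None},
]
print(strip_null_caller(history))
```

Only a `caller` that is present and null is removed; a populated `caller` and all non-`tool_use` blocks pass through unchanged.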

See docs/bedrock.md for the full configuration walkthrough.

Environment variable

Variable	Default	Purpose
`PALLAS_AGENTS_CONFIG`	`agents.yaml`	Override path to deployment config
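
A minimal sketch of the lookup this variable implies, assuming an unset variable falls back to `agents.yaml` in the working directory. `agents_config_path` is a hypothetical helper, and how Pallas treats an empty value is not covered here.

```python
import os
from pathlib import Path


def agents_config_path(environ=os.environ) -> Path:
    """Resolve the deployment config path, defaulting to agents.yaml."""
    return Path(environ.get("PALLAS_AGENTS_CONFIG", "agents.yaml"))


print(agents_config_path({"PALLAS_AGENTS_CONFIG": "deploy/staging.yaml"}))
```

A relative path stays relative to the working directory, which matches the note above that Pallas reads its configuration from the working directory at runtime.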

What Pallas provides

Module	Purpose
`pallas.server`	CLI entry point and agent orchestration
`pallas.registry`	`GET /.well-known/mcp/server.json` registry server
`pallas.multimodal_server`	`MultimodalAgentMCPServer` — `AgentMCPServer` subclass with image + history support
`pallas.health`	LLM preflight validation + `get_health` MCP tool
`pallas._fastagent_patch`	Traceback-capture wrappers around three opaque fast-agent catch-sites (debug-only)

Authentication

Pallas is transparent to downstream authentication. Whatever the operator places under each downstream MCP server's headers: block in fastagent.config.yaml (typically loaded from fastagent.secrets.yaml) is what fast-agent sends — Pallas does not intercept, rewrite, or forward the inbound Authorization header of the MCP request that triggered the agent turn.

For agents that talk to Mnemosyne, the convention is a long-lived team JWT minted from Mnemosyne's admin UI and pasted into the agent project's fastagent.secrets.yaml:

mcp:
  servers:
    mnemosyne:
      transport: http
      url: https://mnemosyne.example.com/mcp/
      headers:
        Authorization: "Bearer eyJ…team-jwt…"

See mnemosyne/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md for the three credential types Mnemosyne recognises, how team JWTs are minted and rotated, and the data model that ties a team to a set of libraries.

Earlier versions of Pallas shipped a forward_inbound_auth: true mechanism that captured the per-turn Authorization header and propagated it to opted-in downstream servers. That mechanism has been retired — opt-in flags in old fastagent.config.yaml files are now silently ignored and can be removed at your convenience.
