Introduce three optional per-agent config fields to bound tool-call loop execution: `max_iterations` (default 15), `streaming_timeout` (default 120s), and `turn_timeout` (default 300s wall-clock).

- Plumb limits from agent config through `_build_agents_table` and `_start_agent` into `MultimodalAgentMCPServer` via `request_limits`
- Apply `max_iterations` and `streaming_timeout` to `RequestParams`
- Wrap turn dispatch in `asyncio.wait_for` to enforce `turn_timeout`, logging a warning on timeout
- Document the new fields in README
# Pallas — FastAgent MCP Bridge

Pallas is the generic runtime that turns [fast-agent](https://github.com/evalstate/fast-agent) agent definitions into StreamableHTTP MCP servers.

It is **completely deployment-agnostic**: all environment-specific values (agent names, ports, hosts, model) live in the calling project's `agents.yaml` and `fastagent.config.yaml`.

---

## Installation

```bash
pip install git+ssh://git@git.helu.ca:22022/r/pallas.git
```

Or as a project dependency in `pyproject.toml`:

```toml
dependencies = [
    "pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git",
]
```

---

## Usage

Pallas reads configuration from the **working directory** at runtime.

```
my-project/
├── agents/
│   ├── __init__.py
│   └── jarvis.py            # FastAgent definitions
├── agents.yaml              # Deployment topology
├── fastagent.config.yaml    # FastAgent + model config
└── fastagent.secrets.yaml   # API keys (gitignored)
```

Run from your project root:

```bash
pallas                  # start all agents + registry
pallas --agent jarvis   # start a single agent
```

Or via `python -m`:

```bash
python -m pallas.server
```

---

## `agents.yaml` format

```yaml
name: my-project                  # used in log prefixes and registry names
version: "1.0.0"
host: my-host.example.com         # hostname for registry URLs
namespace: com.example.my-project
registry_port: 8200

agents:
  jarvis:
    module: agents.jarvis         # importable Python module path
    port: 8201
    title: Jarvis
    description: "My assistant agent"
    depends_on: [research]        # optional: start these first

  research:
    module: agents.research
    port: 8250
    title: Research Agent
    description: "Web search and knowledge graph"
```
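
The `depends_on` field implies a start order. The sketch below shows one way such an order could be derived, assuming the YAML has already been parsed into a dict. `start_order` is a hypothetical helper written for illustration, not part of Pallas's API, and Pallas's own orchestration may resolve dependencies differently.

```python
from graphlib import CycleError, TopologicalSorter


def start_order(agents: dict) -> list:
    """Return agent names so that every agent comes after its dependencies."""
    # Map each agent to the set of agents that must start before it.
    graph = {name: set(cfg.get("depends_on", [])) for name, cfg in agents.items()}
    unknown = {dep for deps in graph.values() for dep in deps} - set(graph)
    if unknown:
        raise ValueError(f"depends_on names unknown agents: {sorted(unknown)}")
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


# Same topology as the example above, already parsed into a dict.
agents = {
    "jarvis": {"module": "agents.jarvis", "port": 8201, "depends_on": ["research"]},
    "research": {"module": "agents.research", "port": 8250},
}
print(start_order(agents))  # ['research', 'jarvis']
```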

### Loop safeguards

Three optional fields bound how long an agent's tool-call loop can run:

| Field | Type | Default | Purpose |
|---|---|---|---|
| `max_iterations` | int | 15 | Maximum tool calls in a single agent turn |
| `streaming_timeout` | float | 120 | Maximum idle time between streaming events (seconds) |
| `turn_timeout` | float | 300 | Hard wall-clock limit for a full turn (seconds) |

Agents that omit a field use its default from the table.

```yaml
agents:
  research:
    module: agents.research
    port: 8250
    max_iterations: 10      # this agent only needs a few search calls
    streaming_timeout: 60   # fail fast on a slow search MCP
    turn_timeout: 120       # research turns should not take more than 2 min
```
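
The following sketch shows how the defaults and the wall-clock limit could be applied, assuming each agent entry is a plain dict. `resolve_limits`, `run_turn`, and `slow_turn` are hypothetical names for illustration, and returning `None` after a timeout is an assumption about the fallback behaviour. The only real API relied on is `asyncio.wait_for`.

```python
import asyncio
import logging

log = logging.getLogger("pallas.sketch")

# Defaults copied from the table above.
DEFAULT_LIMITS = {"max_iterations": 15, "streaming_timeout": 120.0, "turn_timeout": 300.0}


def resolve_limits(agent_cfg: dict) -> dict:
    """Overlay limits set on the agent entry onto the defaults, keeping each default's type."""
    return {
        key: type(default)(agent_cfg.get(key, default))
        for key, default in DEFAULT_LIMITS.items()
    }


async def run_turn(dispatch, limits: dict):
    """Await one turn, giving up once turn_timeout seconds have elapsed."""
    try:
        return await asyncio.wait_for(dispatch(), timeout=limits["turn_timeout"])
    except asyncio.TimeoutError:
        # Assumption: log a warning and return None; the real fallback may differ.
        log.warning("turn exceeded turn_timeout=%ss", limits["turn_timeout"])
        return None


async def slow_turn():
    await asyncio.sleep(1)  # stands in for a tool-call loop that runs too long
    return "done"


limits = resolve_limits({"max_iterations": 10, "turn_timeout": 0.05})
print(limits)
print(asyncio.run(run_turn(slow_turn, limits)))  # None, because the turn timed out
```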

---

## `fastagent.config.yaml` extensions

Pallas reads two extra keys beyond the standard fast-agent config:

```yaml
default_model: openai.my-custom-model-name

# Explicit capability declarations: avoids brittle name-regex heuristics
model_capabilities:
  vision: false
  context_window: 200000
  max_output_tokens: 32000
```

Capabilities are published in the registry and used to register unknown models
with fast-agent's `ModelDatabase`.
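
This sketch reads the two keys from an already-parsed config dict. `ModelCapabilities` and `read_model_settings` are hypothetical names, and both leaving missing capabilities as `None` and rejecting unknown keys are assumptions rather than documented Pallas behaviour. The registration call into `ModelDatabase` is deliberately left out.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class ModelCapabilities:
    vision: Optional[bool] = None
    context_window: Optional[int] = None
    max_output_tokens: Optional[int] = None


def read_model_settings(config: dict):
    """Return (default_model, capabilities) from a parsed fastagent.config.yaml."""
    caps = config.get("model_capabilities") or {}
    known = {f.name for f in fields(ModelCapabilities)}
    # Assumption: reject unknown keys so that typos surface early.
    unexpected = set(caps) - known
    if unexpected:
        raise ValueError(f"unknown model_capabilities keys: {sorted(unexpected)}")
    return config.get("default_model"), ModelCapabilities(**caps)


config = {
    "default_model": "openai.my-custom-model-name",
    "model_capabilities": {
        "vision": False,
        "context_window": 200000,
        "max_output_tokens": 32000,
    },
}
model, caps = read_model_settings(config)
print(model, caps.context_window)
```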
### AWS Bedrock Mantle — automatic shims

When `anthropic.base_url` points at a Bedrock Mantle endpoint
(`https://bedrock-mantle.{region}.api.aws/anthropic`), Pallas auto-detects it
at startup and installs two compatibility shims via `pallas.mantle_shims`.
No config flag is required.
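
The detection step could look roughly like the sketch below, which matches a base URL against the endpoint shape quoted above. `mantle_region` is a hypothetical helper, and the exact rule `pallas.mantle_shims` applies (for example, how it treats trailing slashes) is an assumption.

```python
import re
from typing import Optional

# Built from the URL shape quoted above; the real matching rule may be looser or stricter.
_MANTLE_URL = re.compile(
    r"^https://bedrock-mantle\.(?P<region>[a-z0-9-]+)\.api\.aws/anthropic/?$"
)


def mantle_region(base_url: Optional[str]) -> Optional[str]:
    """Return the AWS region if base_url looks like a Mantle endpoint, else None."""
    match = _MANTLE_URL.match(base_url or "")
    return match.group("region") if match else None


print(mantle_region("https://bedrock-mantle.us-east-1.api.aws/anthropic"))  # us-east-1
print(mantle_region("https://api.anthropic.com"))  # None
```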

**Shim 1: wire-name prefix.** Mantle requires the full `anthropic.<name>`
wire id (e.g. `anthropic.claude-opus-4-7`). Fast-agent's model-spec parser
would otherwise strip the `anthropic.` prefix, causing a misleading
`404 "The model '...' does not exist"`. The shim registers the prefixed
forms in `ModelDatabase._PROVIDER_WIRE_MODEL_NAMES`.

**Shim 2: strip `caller: null` from replayed `tool_use` blocks.** Anthropic
SDK 0.100.x leaks `caller: null` onto serialised `BetaToolUseBlock` params
([upstream issue #1454](https://github.com/anthropics/anthropic-sdk-python/issues/1454)).
`api.anthropic.com` silently tolerates the extra field; Mantle rejects it
with `tool_use.caller: Input should be a valid dictionary or object`, which
breaks the MCP tool-use loop on the second turn. The shim monkeypatches
`AnthropicConverter._deserialize_assistant_raw_blocks` and
`_append_server_tool_channel_blocks` to pop the field before history is
re-sent.
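
The core of Shim 2 can be illustrated on plain dicts. The sketch below is not the actual monkeypatch and does not touch `AnthropicConverter`. `strip_null_caller` is a hypothetical function, and the block layout is assumed to follow the usual `{"type": "tool_use", ...}` shape.

```python
from typing import Any


def strip_null_caller(blocks: list) -> list:
    """Return copies of the blocks with `caller: None` removed from tool_use blocks."""
    cleaned = []
    for block in blocks:
        block = dict(block)  # shallow copy so the original history is not mutated
        if block.get("type") == "tool_use" and "caller" in block and block["caller"] is None:
            del block["caller"]
        cleaned.append(block)
    return cleaned


# Illustrative history; the id, tool name, and input are made up.
history: list[dict[str, Any]] = [
    {"type": "text", "text": "Looking that up."},
    {"type": "tool_use", "id": "toolu_01", "name": "search",
     "input": {"q": "pallas"}, "caller": None},
]
print(strip_null_caller(history)[1])
```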

See `docs/bedrock.md` for the full configuration walkthrough.

---

## Environment variable

| Variable | Default | Purpose |
|---|---|---|
| `PALLAS_AGENTS_CONFIG` | `agents.yaml` | Override path to deployment config |
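
The lookup in the table amounts to an environment read with a fallback. In this sketch `agents_config_path` is a hypothetical helper, and treating a relative value as relative to the working directory is an assumption based on the Usage section.

```python
import os
from pathlib import Path


def agents_config_path(environ=os.environ) -> Path:
    """Return the deployment config path, falling back to agents.yaml."""
    # A relative path is interpreted against the current working directory.
    return Path(environ.get("PALLAS_AGENTS_CONFIG", "agents.yaml"))


print(agents_config_path({}))  # agents.yaml
print(agents_config_path({"PALLAS_AGENTS_CONFIG": "deploy/staging.yaml"}))
```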

---

## What Pallas provides

| Module | Purpose |
|---|---|
| `pallas.server` | CLI entry point and agent orchestration |
| `pallas.registry` | `GET /.well-known/mcp/server.json` registry server |
| `pallas.multimodal_server` | `MultimodalAgentMCPServer`, an `AgentMCPServer` subclass with image + history support |
| `pallas.health` | LLM preflight validation + `get_health` MCP tool |
| `pallas._fastagent_patch` | Traceback-capture wrappers around three opaque fast-agent catch-sites (debug-only) |
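
A client could locate the registry from the top-level `host` and `registry_port` keys of `agents.yaml`. The sketch below assumes plain `http` and a JSON response body, neither of which is stated above. `registry_url` and `fetch_registry` are hypothetical helpers, not Pallas functions.

```python
import json
from urllib.request import urlopen

REGISTRY_PATH = "/.well-known/mcp/server.json"


def registry_url(deployment: dict, scheme: str = "http") -> str:
    """Build the registry URL from the top-level agents.yaml keys."""
    return f"{scheme}://{deployment['host']}:{deployment['registry_port']}{REGISTRY_PATH}"


def fetch_registry(deployment: dict, timeout: float = 5.0) -> dict:
    """GET the registry document and decode it as JSON (needs a running registry)."""
    with urlopen(registry_url(deployment), timeout=timeout) as response:
        return json.load(response)


deployment = {"host": "my-host.example.com", "registry_port": 8200}
print(registry_url(deployment))
```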

---

## Authentication

Pallas is **transparent** to downstream authentication. Whatever the operator
places under each downstream MCP server's `headers:` block in
`fastagent.config.yaml` (typically loaded from `fastagent.secrets.yaml`) is what
fast-agent sends. Pallas does not intercept, rewrite, or forward the inbound
`Authorization` header of the MCP request that triggered the agent turn.

For agents that talk to Mnemosyne, the convention is a long-lived team JWT
minted from Mnemosyne's admin UI and pasted into the agent project's
`fastagent.secrets.yaml`:

```yaml
mcp:
  servers:
    mnemosyne:
      transport: http
      url: https://mnemosyne.example.com/mcp/
      headers:
        Authorization: "Bearer eyJ…team-jwt…"
```

See
[`mnemosyne/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md`](https://git.helu.ca/r/mnemosyne/src/branch/main/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md)
for the three credential types Mnemosyne recognises, how team JWTs are
minted and rotated, and the data model that ties a team to a set of
libraries.

> Earlier versions of Pallas shipped a `forward_inbound_auth: true`
> mechanism that captured the per-turn `Authorization` header and
> propagated it to opted-in downstream servers. That mechanism has been
> retired; opt-in flags in old `fastagent.config.yaml` files are now
> silently ignored and can be removed at your convenience.