Add two new sections to the Pallas documentation:
- Sampling parameters: explain that temperature/top_p/top_k are configured via the fast-agent decorator's `request_params`, with a provider support matrix and a note on Claude Opus 4.7 stripping these params in favor of `output_config.effort`.
- Metrics: document the Prometheus `/metrics` endpoint exposed on the registry port, including scrape config, full metrics reference table, and notes on where each metric is captured.
Pallas — FastAgent MCP Bridge
Pallas is the generic runtime that turns fast-agent agent definitions into StreamableHTTP MCP servers.
It is completely deployment-agnostic: all environment-specific values (agent names, ports, hosts, model) live in the calling project's agents.yaml and fastagent.config.yaml.
Installation
pip install git+ssh://git@git.helu.ca:22022/r/pallas.git
Or as a project dependency in pyproject.toml:
dependencies = [
    "pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git",
]
Usage
Pallas reads configuration from the working directory at runtime.
my-project/
├── agents/
│   ├── __init__.py
│   └── jarvis.py            # FastAgent definitions
├── agents.yaml              # Deployment topology
├── fastagent.config.yaml    # FastAgent + model config
└── fastagent.secrets.yaml   # API keys (gitignored)
Run from your project root:
pallas # start all agents + registry
pallas --agent jarvis # start a single agent
Or via python -m:
python -m pallas.server
agents.yaml format
name: my-project # used in log prefixes and registry names
version: "1.0.0"
host: my-host.example.com # hostname for registry URLs
namespace: com.example.my-project
registry_port: 8200
agents:
  jarvis:
    module: agents.jarvis # importable Python module path
    port: 8201
    title: Jarvis
    description: "My assistant agent"
    depends_on: [research] # optional: start these first
  research:
    module: agents.research
    port: 8250
    title: Research Agent
    description: "Web search and knowledge graph"
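The depends_on key implies a start order in which dependencies come up before the agents that need them. How Pallas computes that order is not shown here; the following is a minimal sketch using the standard library's graphlib, with the agents mapping written as a Python literal (mirroring the example above) so it needs no YAML parser. The start_order helper is hypothetical.

```python
from graphlib import TopologicalSorter

# Parsed form of the "agents" mapping from the example above, written as a
# literal so the sketch runs without a YAML dependency.
agents = {
    "jarvis": {"module": "agents.jarvis", "port": 8201, "depends_on": ["research"]},
    "research": {"module": "agents.research", "port": 8250},
}


def start_order(agents: dict) -> list[str]:
    """Return agent names so that every dependency precedes its dependant.

    Hypothetical helper: not Pallas's actual implementation.
    """
    sorter = TopologicalSorter()
    for name, spec in agents.items():
        # add(node, *predecessors): predecessors must start first.
        sorter.add(name, *spec.get("depends_on", []))
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(sorter.static_order())


order = start_order(agents)
print(order)
```

A circular depends_on chain surfaces as graphlib.CycleError, which is a convenient place to fail fast before any port is bound.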
fastagent.config.yaml extensions
Pallas reads two extra keys beyond the standard fast-agent config:
default_model: openai.my-custom-model-name
# Explicit capability declarations — avoids brittle name-regex heuristics
model_capabilities:
  vision: false
  context_window: 200000
  max_output_tokens: 32000
Capabilities are published in the registry and used to register unknown models
with fast-agent's ModelDatabase.
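As an illustration of what an explicit capability declaration looks like once parsed, here is a small sketch that turns the model_capabilities block into a typed value. The ModelCapabilities class and from_config method are hypothetical names, and treating all three keys as required is an assumption of this sketch, not documented Pallas behaviour.

```python
from dataclasses import dataclass

REQUIRED_KEYS = ("vision", "context_window", "max_output_tokens")


@dataclass(frozen=True)
class ModelCapabilities:
    """Typed view of the model_capabilities block (hypothetical class)."""

    vision: bool
    context_window: int
    max_output_tokens: int

    @classmethod
    def from_config(cls, config: dict) -> "ModelCapabilities":
        raw = config.get("model_capabilities") or {}
        # Assumption: every key is required; Pallas may apply defaults instead.
        missing = [key for key in REQUIRED_KEYS if key not in raw]
        if missing:
            raise ValueError(f"model_capabilities is missing: {', '.join(missing)}")
        return cls(
            vision=bool(raw["vision"]),
            context_window=int(raw["context_window"]),
            max_output_tokens=int(raw["max_output_tokens"]),
        )


# Parsed form of the fastagent.config.yaml fragment above.
config = {
    "default_model": "openai.my-custom-model-name",
    "model_capabilities": {
        "vision": False,
        "context_window": 200000,
        "max_output_tokens": 32000,
    },
}
caps = ModelCapabilities.from_config(config)
print(caps)
```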
AWS Bedrock Mantle — automatic shims
When anthropic.base_url points at a Bedrock Mantle endpoint
(https://bedrock-mantle.{region}.api.aws/anthropic), Pallas auto-detects it
at startup and installs two compatibility shims via pallas.mantle_shims.
No config flag is required.
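The detection step can be pictured as a hostname match against the pattern quoted above. This is a sketch only: the mantle_region helper is hypothetical, it matches on the host alone, and it does not reflect how pallas.mantle_shims is actually written.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Host pattern taken from the endpoint form
# https://bedrock-mantle.{region}.api.aws/anthropic
_MANTLE_HOST = re.compile(r"^bedrock-mantle\.(?P<region>[a-z0-9-]+)\.api\.aws$")


def mantle_region(base_url: Optional[str]) -> Optional[str]:
    """Return the region if base_url looks like a Mantle endpoint, else None.

    Hypothetical helper; only the hostname is inspected (an assumption).
    """
    if not base_url:
        return None
    host = urlparse(base_url).hostname or ""
    match = _MANTLE_HOST.match(host)
    return match.group("region") if match else None


print(mantle_region("https://bedrock-mantle.us-east-1.api.aws/anthropic"))
print(mantle_region("https://api.anthropic.com"))
```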
Shim 1 — wire-name prefix. Mantle requires the full anthropic.<name>
wire id (e.g. anthropic.claude-opus-4-7). Fast-agent's model-spec parser
would otherwise strip the anthropic. prefix, causing a misleading
404 "The model '...' does not exist". The shim registers the prefixed
forms in ModelDatabase._PROVIDER_WIRE_MODEL_NAMES.
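The name mapping that Shim 1 preserves is simple to state in code. The sketch below shows only the string transformation; the real shim registers names in ModelDatabase._PROVIDER_WIRE_MODEL_NAMES, which is not reproduced here, and mantle_wire_name is a hypothetical helper.

```python
PREFIX = "anthropic."


def mantle_wire_name(model: str) -> str:
    """Return the full wire id Mantle expects, adding the prefix if absent.

    Hypothetical helper illustrating the mapping, not the shim itself.
    """
    return model if model.startswith(PREFIX) else PREFIX + model


print(mantle_wire_name("claude-opus-4-7"))
print(mantle_wire_name("anthropic.claude-opus-4-7"))
```

The function is idempotent, so applying it to an already-prefixed name leaves it unchanged.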
Shim 2 — strip caller: null from replayed tool_use blocks. Anthropic
SDK 0.100.x leaks caller: null onto serialised BetaToolUseBlock params
(upstream issue #1454).
api.anthropic.com silently tolerates the extra field; Mantle rejects it
with tool_use.caller: Input should be a valid dictionary or object, which
breaks the MCP tool-use loop on the second turn. The shim monkeypatches
AnthropicConverter._deserialize_assistant_raw_blocks and
_append_server_tool_channel_blocks to pop the field before history is
re-sent.
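The effect of Shim 2 on replayed history can be sketched on plain dictionaries. This is not the monkeypatch itself: strip_null_caller is a hypothetical function, it works on dicts rather than SDK block objects, and removing the field only when it is null is an assumption of this sketch.

```python
def strip_null_caller(blocks: list[dict]) -> list[dict]:
    """Return a copy of blocks with `caller: null` removed from tool_use blocks.

    Hypothetical helper; the input list and its dicts are left unmodified.
    """
    cleaned = []
    for block in blocks:
        is_tool_use = block.get("type") == "tool_use"
        has_null_caller = "caller" in block and block["caller"] is None
        if is_tool_use and has_null_caller:
            block = {key: value for key, value in block.items() if key != "caller"}
        cleaned.append(block)
    return cleaned


history = [
    {"type": "text", "text": "Looking that up."},
    {"type": "tool_use", "id": "t1", "name": "search", "input": {}, "caller": None},
]
cleaned = strip_null_caller(history)
print(cleaned)
```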
See docs/bedrock.md for the full configuration walkthrough.
Environment variable
| Variable | Default | Purpose |
|---|---|---|
| PALLAS_AGENTS_CONFIG | agents.yaml | Override path to deployment config |
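The lookup described by this table amounts to an environment read with a default. A minimal sketch, assuming a relative path resolves against the working directory; agents_config_path is a hypothetical helper, not a Pallas function.

```python
import os
from collections.abc import Mapping
from pathlib import Path


def agents_config_path(environ: Mapping = os.environ) -> Path:
    """Return the deployment config path, honouring PALLAS_AGENTS_CONFIG.

    Hypothetical helper; falls back to agents.yaml in the working directory.
    """
    return Path(environ.get("PALLAS_AGENTS_CONFIG", "agents.yaml"))


print(agents_config_path({}))
print(agents_config_path({"PALLAS_AGENTS_CONFIG": "deploy/staging.yaml"}))
```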
What Pallas provides
| Module | Purpose |
|---|---|
| pallas.server | CLI entry point and agent orchestration |
| pallas.registry | GET /.well-known/mcp/server.json registry server |
| pallas.multimodal_server | MultimodalAgentMCPServer: AgentMCPServer subclass with image + history support |
| pallas.health | LLM preflight validation + get_health MCP tool |
| pallas._fastagent_patch | Traceback-capture wrappers around three opaque fast-agent catch-sites (debug-only) |
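A client can locate the registry document from the host and registry_port values in agents.yaml. The sketch below builds that URL and fetches it with the standard library. The plain-http scheme and the JSON response body are assumptions, and both function names are hypothetical.

```python
import json
from urllib.request import urlopen

REGISTRY_PATH = "/.well-known/mcp/server.json"


def registry_url(host: str, registry_port: int, scheme: str = "http") -> str:
    """Build the registry URL (scheme defaults to http, an assumption)."""
    return f"{scheme}://{host}:{registry_port}{REGISTRY_PATH}"


def fetch_registry(host: str, registry_port: int, timeout: float = 5.0) -> dict:
    """Fetch and decode the registry document; assumes a JSON response body."""
    with urlopen(registry_url(host, registry_port), timeout=timeout) as response:
        return json.load(response)


# Values taken from the agents.yaml example; no request is made here.
print(registry_url("my-host.example.com", 8200))
```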
Authentication
Pallas is transparent to downstream authentication. Whatever the operator
places under each downstream MCP server's headers: block in
fastagent.config.yaml (typically loaded from fastagent.secrets.yaml) is what
fast-agent sends — Pallas does not intercept, rewrite, or forward the inbound
Authorization header of the MCP request that triggered the agent turn.
For agents that talk to Mnemosyne, the convention is a long-lived team JWT
minted from Mnemosyne's admin UI and pasted into the agent project's
fastagent.secrets.yaml:
mcp:
  servers:
    mnemosyne:
      transport: http
      url: https://mnemosyne.example.com/mcp/
      headers:
        Authorization: "Bearer eyJ…team-jwt…"
See
mnemosyne/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md
for the three credential types Mnemosyne recognises, how team JWTs are
minted and rotated, and the data model that ties a team to a set of
libraries.
Earlier versions of Pallas shipped a forward_inbound_auth: true mechanism that captured the per-turn Authorization header and propagated it to opted-in downstream servers. That mechanism has been retired; opt-in flags in old fastagent.config.yaml files are now silently ignored and can be removed at your convenience.
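To find leftover flags before removing them, one option is a recursive search over the parsed config. This sketch makes no assumption about where the flag used to live; it reports every path at which a forward_inbound_auth key appears. The find_retired_flags helper is hypothetical, and the placement of the flag in the sample config is purely illustrative.

```python
RETIRED_KEY = "forward_inbound_auth"


def find_retired_flags(node, path: tuple = ()) -> list[str]:
    """Return dotted paths of every forward_inbound_auth key under node."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == RETIRED_KEY:
                hits.append(".".join(path + (str(key),)))
            else:
                hits.extend(find_retired_flags(value, path + (str(key),)))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_retired_flags(item, path + (str(index),)))
    return hits


# Illustrative config only: where the flag sat in old files is an assumption.
old_config = {
    "mcp": {
        "servers": {
            "mnemosyne": {"transport": "http", "forward_inbound_auth": True},
        }
    }
}
print(find_retired_flags(old_config))
```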