Robert Helewka fe94f6a9a8 feat: add Mantle override for AWS Bedrock Anthropic endpoint
Introduce `model_capabilities.mantle` flag that installs a provider-specific
override in fast-agent's `ModelDatabase._PROVIDER_MODEL_OVERRIDES` to strip
features the AWS Bedrock Mantle endpoint rejects (beta headers, extended
thinking, task budgets, web tools, prompt caching).

Without this override, fast-agent sends default beta headers and `thinking`
parameters for modern Claude models that Mantle rejects with a misleading
404 "model does not exist" error.
2026-05-12 07:41:41 -04:00


# Pallas — FastAgent MCP Bridge
Pallas is the generic runtime that turns [fast-agent](https://github.com/evalstate/fast-agent) agent definitions into StreamableHTTP MCP servers.
It is **completely deployment-agnostic**: all environment-specific values (agent names, ports, hosts, model) live in the calling project's `agents.yaml` and `fastagent.config.yaml`.

---
## Installation
```bash
pip install git+ssh://git@git.helu.ca:22022/r/pallas.git
```
Or as a project dependency in `pyproject.toml`:
```toml
dependencies = [
    "pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git",
]
```
---
## Usage
Pallas reads configuration from the **working directory** at runtime.
```
my-project/
├── agents/
│   ├── __init__.py
│   └── jarvis.py           # FastAgent definitions
├── agents.yaml             # Deployment topology
├── fastagent.config.yaml   # FastAgent + model config
└── fastagent.secrets.yaml  # API keys (gitignored)
```
Run from your project root:
```bash
pallas # start all agents + registry
pallas --agent jarvis # start a single agent
```
Or via `python -m`:
```bash
python -m pallas.server
```
---
## `agents.yaml` format
```yaml
name: my-project               # used in log prefixes and registry names
version: "1.0.0"
host: my-host.example.com      # hostname for registry URLs
namespace: com.example.my-project
registry_port: 8200
agents:
  jarvis:
    module: agents.jarvis      # importable Python module path
    port: 8201
    title: Jarvis
    description: "My assistant agent"
    depends_on: [research]     # optional: start these first
  research:
    module: agents.research
    port: 8250
    title: Research Agent
    description: "Web search and knowledge graph"
```
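
The `depends_on` key implies a startup ordering. The sketch below shows one way such an ordering could be derived, using the standard library's `graphlib`. It is an illustration only, not Pallas's actual implementation: the `startup_order` helper is hypothetical, and the `agents` dict is a hand-written stand-in for the parsed `agents:` mapping above.

```python
from graphlib import TopologicalSorter

# Hypothetical in-memory equivalent of the `agents:` mapping above
# (a real project would parse this from agents.yaml).
agents = {
    "jarvis": {"module": "agents.jarvis", "port": 8201, "depends_on": ["research"]},
    "research": {"module": "agents.research", "port": 8250},
}


def startup_order(agents: dict) -> list[str]:
    """Return agent names so that every dependency precedes its dependents."""
    # Map each agent to the set of agents that must start before it.
    graph = {name: set(spec.get("depends_on", [])) for name, spec in agents.items()}
    unknown = {dep for deps in graph.values() for dep in deps} - set(graph)
    if unknown:
        raise ValueError(f"depends_on references undefined agents: {sorted(unknown)}")
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


print(startup_order(agents))  # ['research', 'jarvis']
```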
---
## `fastagent.config.yaml` extensions
Pallas reads two extra keys beyond the standard fast-agent config:
```yaml
default_model: openai.my-custom-model-name
# Explicit capability declarations — avoids brittle name-regex heuristics
model_capabilities:
  vision: false
  context_window: 200000
  max_output_tokens: 32000
  mantle: false   # optional — see "Mantle override" below
```
Capabilities are published in the registry and used to register unknown models
with fast-agent's `ModelDatabase`.
### Mantle override (`model_capabilities.mantle: true`)
Set this when the `anthropic.base_url` points at the AWS Bedrock **Mantle**
endpoint (`https://bedrock-mantle.{region}.api.aws/anthropic`). Pallas then
installs a provider-specific override for `(Provider.ANTHROPIC, model_name)`
in fast-agent's `ModelDatabase._PROVIDER_MODEL_OVERRIDES` that clones the
model's base parameters but strips the features Mantle rejects:
- `anthropic_required_betas` — no `anthropic-beta: ...` header
- `reasoning` / `reasoning_effort_spec` — no extended-thinking request
- `anthropic_task_budget_supported` — no task budget
- `anthropic_web_fetch_version` / `anthropic_web_search_version` — no web tools
- `cache_ttl` — prompt caching disabled

Without this flag, fast-agent sends its default beta headers and `thinking`
parameters for modern Claude models (e.g. Opus 4.7, Sonnet 4.6) which Mantle
rejects with a misleading `404 "The model '...' does not exist"`. See
`docs/bedrock.md` for the full configuration walkthrough.
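
The "clone, then strip" idea can be sketched with a plain dataclass. This is a minimal illustration under stated assumptions: `ModelParams` and `strip_for_mantle` are hypothetical stand-ins, not fast-agent's real types, and the field types and sample values are invented. Only the field names come from the list above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ModelParams:
    """Hypothetical stand-in for fast-agent's per-model parameters."""
    context_window: int
    max_output_tokens: int
    anthropic_required_betas: Optional[tuple[str, ...]] = None
    reasoning: Optional[str] = None
    reasoning_effort_spec: Optional[str] = None
    anthropic_task_budget_supported: bool = False
    anthropic_web_fetch_version: Optional[str] = None
    anthropic_web_search_version: Optional[str] = None
    cache_ttl: Optional[str] = None


def strip_for_mantle(base: ModelParams) -> ModelParams:
    """Clone `base`, clearing only the features Mantle rejects."""
    return replace(
        base,
        anthropic_required_betas=None,          # no anthropic-beta header
        reasoning=None,                         # no extended-thinking request
        reasoning_effort_spec=None,
        anthropic_task_budget_supported=False,  # no task budget
        anthropic_web_fetch_version=None,       # no web tools
        anthropic_web_search_version=None,
        cache_ttl=None,                         # prompt caching disabled
    )


# Placeholder values, for illustration only.
base = ModelParams(
    context_window=200000,
    max_output_tokens=32000,
    anthropic_required_betas=("placeholder-beta",),
    reasoning="placeholder",
    anthropic_task_budget_supported=True,
    cache_ttl="placeholder",
)
mantle = strip_for_mantle(base)
print(mantle.context_window, mantle.anthropic_required_betas)  # 200000 None
```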
---
## Environment variable
| Variable | Default | Purpose |
|---|---|---|
| `PALLAS_AGENTS_CONFIG` | `agents.yaml` | Override path to deployment config |
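
The lookup in the table amounts to an environment read with a fallback. A minimal sketch, assuming the variable is consulted once at startup; the `agents_config_path` helper is hypothetical, and how Pallas treats an empty value is not covered here.

```python
import os
from pathlib import Path


def agents_config_path(environ=os.environ) -> Path:
    """Return the deployment config path, honouring PALLAS_AGENTS_CONFIG."""
    # A relative result is interpreted against the current working directory
    # when the file is opened.
    return Path(environ.get("PALLAS_AGENTS_CONFIG", "agents.yaml"))


print(agents_config_path({}))  # agents.yaml
```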
---
## What Pallas provides
| Module | Purpose |
|---|---|
| `pallas.server` | CLI entry point and agent orchestration |
| `pallas.registry` | `GET /.well-known/mcp/server.json` registry server |
| `pallas.multimodal_server` | `MultimodalAgentMCPServer`, an `AgentMCPServer` subclass with image + history support |
| `pallas.health` | LLM preflight validation + `get_health` MCP tool |
| `pallas._fastagent_patch` | Traceback-capture wrappers around three opaque fast-agent catch-sites (debug-only) |
---
## Authentication
Pallas is **transparent** to downstream authentication. Whatever the operator
places under each downstream MCP server's `headers:` block in
`fastagent.config.yaml` (typically loaded from `fastagent.secrets.yaml`) is what
fast-agent sends — Pallas does not intercept, rewrite, or forward the inbound
`Authorization` header of the MCP request that triggered the agent turn.
For agents that talk to Mnemosyne, the convention is a long-lived team JWT
minted from Mnemosyne's admin UI and pasted into the agent project's
`fastagent.secrets.yaml`:
```yaml
mcp:
  servers:
    mnemosyne:
      transport: http
      url: https://mnemosyne.example.com/mcp/
      headers:
        Authorization: "Bearer eyJ…team-jwt…"
```
See
[`mnemosyne/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md`](https://git.helu.ca/r/mnemosyne/src/branch/main/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md)
for the three credential types Mnemosyne recognises, how team JWTs are
minted and rotated, and the data model that ties a team to a set of
libraries.
> Earlier versions of Pallas shipped a `forward_inbound_auth: true`
> mechanism that captured the per-turn `Authorization` header and
> propagated it to opted-in downstream servers. That mechanism has been
> retired — opt-in flags in old `fastagent.config.yaml` files are now
> silently ignored and can be removed at your convenience.
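
To locate leftover flags before removing them, a recursive search over the parsed config is enough. The sketch below is a hypothetical helper, not part of Pallas. It makes no assumption about where the flag lived, and the sample `config` dict (including the flag's position under a server entry) is invented for illustration.

```python
def find_retired_flags(node, key="forward_inbound_auth", path=()):
    """Yield the key path of every occurrence of `key` in nested dicts/lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield path + (k,)
            yield from find_retired_flags(v, key, path + (k,))
    elif isinstance(node, list):
        for i, v in enumerate(node):
            yield from find_retired_flags(v, key, path + (i,))


# Hypothetical parsed fastagent.config.yaml; the flag's location is an assumption.
config = {
    "mcp": {
        "servers": {
            "mnemosyne": {
                "transport": "http",
                "forward_inbound_auth": True,
            }
        }
    }
}

print(list(find_retired_flags(config)))
# [('mcp', 'servers', 'mnemosyne', 'forward_inbound_auth')]
```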