feat: add Mantle override for AWS Bedrock Anthropic endpoint
Introduce `model_capabilities.mantle` flag that installs a provider-specific override in fast-agent's `ModelDatabase._PROVIDER_MODEL_OVERRIDES` to strip features the AWS Bedrock Mantle endpoint rejects (beta headers, extended thinking, task budgets, web tools, prompt caching). Without this override, fast-agent sends default beta headers and `thinking` parameters for modern Claude models that Mantle rejects with a misleading 404 "model does not exist" error.
392
docs/bedrock.md
Normal file
@@ -0,0 +1,392 @@
# AWS Bedrock Integration

Pallas supports AWS Bedrock through three integration paths, depending on the model and endpoint:

| Path | fast-agent provider | Auth | Use when |
|---|---|---|---|
| [Direct Bedrock](#path-1-direct-bedrock-converse-api) | `bedrock` | AWS IAM / long-term key | Any Bedrock model; required for Sonnet 4.6 |
| [Mantle → Anthropic](#path-2-mantle-anthropic-messages-api) | `anthropic` | Bedrock long-term API key | Claude models with Mantle support (Haiku 4.5, Opus 4.7) |
| [Mantle → OpenAI](#path-3-mantle-openai-chat-completions) | `openai` | Bedrock long-term API key | Non-Anthropic models on Mantle (MiniMax M2.5, etc.) |

**Mantle** is AWS's OpenAI-compatible and Anthropic-compatible gateway for Bedrock. It simplifies authentication (one long-term API key instead of IAM credential management) and is the recommended path when the target model supports it.

---

## Supported Models

| Model | Bedrock model ID | Direct Bedrock | Mantle |
|---|---|---|---|
| Claude Haiku 4.5 | `anthropic.claude-haiku-4-5-20251001-v1:0` | ✓ | ✓ (Anthropic Messages API) |
| Claude Sonnet 4.6 | `anthropic.claude-sonnet-4-6` | ✓ | ✗ |
| Claude Opus 4.7 | `anthropic.claude-opus-4-7` | ✓ | ✓ (Anthropic Messages API) |
| MiniMax M2.5 | `minimax.minimax-m2.5` | ✓ | ✓ (OpenAI Chat Completions) |

Cross-region inference IDs (e.g. `us.anthropic.claude-opus-4-7`, `eu.anthropic.claude-sonnet-4-6`) can be used as the model ID for the `bedrock` provider to route across regions within a geography for higher throughput.

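
The table can be read as a small routing rule. The sketch below is only an illustration of that rule: `MANTLE_PROVIDER`, `GEO_PREFIXES`, and `pick_provider` are hypothetical names, not part of Pallas or fast-agent, and the prefix list covers only the `us.` and `eu.` examples mentioned here.

```python
# Hypothetical lookup derived from the "Supported Models" table.
# Models absent from this map fall back to the Direct Bedrock path.
MANTLE_PROVIDER = {
    "anthropic.claude-haiku-4-5-20251001-v1:0": "anthropic",
    "anthropic.claude-opus-4-7": "anthropic",
    "minimax.minimax-m2.5": "openai",
}

# Assumption: only the geography prefixes shown in this document.
GEO_PREFIXES = ("us.", "eu.")


def pick_provider(model_id: str) -> str:
    """Return the fast-agent provider prefix suggested by the table."""
    if model_id.startswith(GEO_PREFIXES):
        # Cross-region inference IDs are documented for `bedrock` only.
        return "bedrock"
    return MANTLE_PROVIDER.get(model_id, "bedrock")


if __name__ == "__main__":
    for model in ("anthropic.claude-sonnet-4-6", "anthropic.claude-opus-4-7"):
        print(model, "->", pick_provider(model))
```
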
---

## Path 1: Direct Bedrock (Converse API)

Fast-agent's `bedrock` provider calls the AWS Bedrock Converse API via `boto3`. This path works for all Bedrock models and is the only option for models without Mantle support (e.g. Claude Sonnet 4.6).

### Prerequisites

1. **Install `boto3`**, which is not included in fast-agent by default:

   ```toml
   # pyproject.toml
   dependencies = [
       "pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git",
       "boto3",
   ]
   ```

2. **AWS credentials.** The Bedrock provider resolves credentials from the following sources, in priority order:
   - `AWS_BEARER_TOKEN_BEDROCK` environment variable (long-term Bedrock API key, see below)
   - `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` environment variables
   - `~/.aws/credentials` file (named profile or `default`)
   - IAM instance role (EC2, ECS, Lambda)

   The simplest approach for a server deployment is a **long-term Bedrock API key** generated from the [Amazon Bedrock console](https://console.aws.amazon.com/bedrock/home#/api-keys/long-term/create). Set it as `AWS_BEARER_TOKEN_BEDROCK`.

3. **Enable model access** in the [Bedrock console](https://console.aws.amazon.com/bedrock/home#/modelaccess) for your target region.

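
Before starting an agent, it can help to see which of the credential sources listed above is present on the host. The following is a minimal sketch under that assumption: `first_credential_source` is a hypothetical helper, it only inspects environment variables and the credentials file, and it does not reproduce how `boto3` or fast-agent actually resolve credentials.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def first_credential_source(
    env: Mapping[str, str], credentials_file: Path
) -> Optional[str]:
    """Report the first documented credential source that appears to be set."""
    if env.get("AWS_BEARER_TOKEN_BEDROCK"):
        return "bearer-token"
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "access-keys"
    if credentials_file.is_file():
        return "credentials-file"
    # An IAM instance role may still apply; that cannot be checked offline.
    return None


if __name__ == "__main__":
    source = first_credential_source(
        os.environ, Path.home() / ".aws" / "credentials"
    )
    print(source or "none found locally (an instance role may still apply)")
```
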
### `fastagent.config.yaml`

```yaml
default_model: bedrock.us.anthropic.claude-sonnet-4-6

# ── Model Capabilities ──────────────────────────────────────────────────────
# Required: Bedrock model IDs are not in fast-agent's ModelDatabase.
model_capabilities:
  vision: true              # true for Claude models (image input supported)
  context_window: 1000000   # 1M for Sonnet 4.6
  max_output_tokens: 64000

# ── Bedrock provider ─────────────────────────────────────────────────────────
bedrock:
  region: us-east-1         # or set AWS_REGION / AWS_DEFAULT_REGION
  profile: default          # optional; or set AWS_PROFILE
  reasoning: medium         # optional: minimal | low | medium | high
```

The `default_model` format is `bedrock.<model-id>`. Use a cross-region inference ID (e.g. `us.anthropic.claude-sonnet-4-6`) for geo-distributed routing, or the plain model ID (e.g. `anthropic.claude-sonnet-4-6`) for in-region only.

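
Because Bedrock model IDs contain dots themselves, the provider is everything before the first dot and the model ID is the remainder. The sketch below illustrates that reading of the format; `split_model_string` is a hypothetical helper and is not claimed to be how fast-agent parses the value.

```python
def split_model_string(value: str) -> tuple[str, str]:
    """Split 'provider.model-id' at the first dot only."""
    provider, sep, model_id = value.partition(".")
    if not sep or not provider or not model_id:
        raise ValueError(f"expected 'provider.model-id', got {value!r}")
    return provider, model_id


if __name__ == "__main__":
    print(split_model_string("bedrock.us.anthropic.claude-sonnet-4-6"))
    print(split_model_string("bedrock.anthropic.claude-sonnet-4-6"))
```
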
### `fastagent.secrets.yaml`

No API key entry is needed: credentials come from the AWS credential chain. If you are using a long-term Bedrock API key, set it in `.env` or the environment rather than in this file.

```yaml
# fastagent.secrets.yaml: nothing required for Bedrock credentials
# AWS credentials are read from environment variables or ~/.aws/credentials
```

### `.env`

```dotenv
# Long-term Bedrock API key (recommended for server deployments)
AWS_BEARER_TOKEN_BEDROCK=your-bedrock-api-key

# Or use IAM access keys
# AWS_ACCESS_KEY_ID=AKIA...
# AWS_SECRET_ACCESS_KEY=...

AWS_REGION=us-east-1
```

### `agents.yaml`

No Bedrock-specific changes are needed. The `default_model` in `fastagent.config.yaml` is picked up automatically:

```yaml
name: my-project
version: "1.0.0"
host: my-host.example.com
registry_port: 8200

agents:
  jarvis:
    module: agents.jarvis
    port: 8201
    title: Jarvis
    description: "My assistant"
```

To use a different Bedrock model for a specific agent, set `model` on the agent entry:

```yaml
agents:
  jarvis:
    module: agents.jarvis
    port: 8201
    model: bedrock.us.anthropic.claude-haiku-4-5-20251001-v1:0
    model_capabilities:
      vision: true
      context_window: 200000
      max_output_tokens: 64000
```

### Model capability reference

| Model | `vision` | `context_window` | `max_output_tokens` |
|---|---|---|---|
| Claude Haiku 4.5 | `true` | `200000` | `64000` |
| Claude Sonnet 4.6 | `true` | `1000000` | `64000` |
| Claude Opus 4.7 | `true` | `1000000` | `128000` |
| MiniMax M2.5 | `false` | `196000` | `8000` |

### IAM permissions

The IAM principal (user, role, or instance profile) needs:

```json
{
  "Effect": "Allow",
  "Action": [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream"
  ],
  "Resource": "arn:aws:bedrock:*::foundation-model/*"
}
```

For cross-region inference, also allow:

```json
{
  "Effect": "Allow",
  "Action": [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream"
  ],
  "Resource": "arn:aws:bedrock:*:*:inference-profile/*"
}
```

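
The two statements above are fragments rather than complete policy documents. As a rough sketch, they can be assembled into a single policy programmatically; `bedrock_invoke_policy` is a hypothetical helper, and merging both resources into one statement is a convenience chosen here, not a requirement.

```python
import json


def bedrock_invoke_policy(cross_region: bool = False) -> dict:
    """Build a complete IAM policy document from the statements above."""
    resources = ["arn:aws:bedrock:*::foundation-model/*"]
    if cross_region:
        # Geo inference IDs resolve through inference profiles.
        resources.append("arn:aws:bedrock:*:*:inference-profile/*")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(bedrock_invoke_policy(cross_region=True), indent=2))
```
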
### Terraform snippet

```hcl
resource "aws_iam_policy" "bedrock_invoke" {
  name = "bedrock-invoke"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream",
        ]
        Resource = [
          "arn:aws:bedrock:*::foundation-model/*",
          "arn:aws:bedrock:*:*:inference-profile/*",
        ]
      }
    ]
  })
}
```

---

## Path 2: Mantle — Anthropic Messages API

Mantle exposes the Anthropic Messages API for supported Claude models. Fast-agent's `anthropic` provider uses the Anthropic Python SDK (`AsyncAnthropic`), which calls `/v1/messages`. That is the endpoint Mantle serves under `https://bedrock-mantle.{region}.api.aws/anthropic`.

**Supported models:** Claude Haiku 4.5, Claude Opus 4.7. Claude Sonnet 4.6 does **not** have a Mantle endpoint and must use [Path 1](#path-1-direct-bedrock-converse-api).

> **Note on Opus 4.7 and Chat Completions:** The AWS model card notes that Opus 4.7 does not support Chat Completions on Mantle. This does not affect fast-agent, because the `anthropic` provider uses the Anthropic Messages API, not Chat Completions.

### Prerequisites

1. **Generate a long-term Bedrock API key** from the [Amazon Bedrock console](https://console.aws.amazon.com/bedrock/home#/api-keys/long-term/create).

2. **Enable model access** in the Bedrock console for your target region.

3. No additional Python packages are needed: `anthropic` is already a fast-agent dependency.

### `fastagent.config.yaml`

```yaml
default_model: anthropic.claude-opus-4-7

# ── Model Capabilities ──────────────────────────────────────────────────────
# mantle: true is REQUIRED. It installs a Pallas-level provider override that
# strips the features the Mantle endpoint rejects (anthropic-beta headers,
# extended thinking, task budget, web tools, prompt caching). Without this
# flag fast-agent sends those features and Mantle returns a misleading
# 404 "model does not exist" error.
model_capabilities:
  vision: true
  context_window: 1000000
  max_output_tokens: 128000
  mantle: true

# ── Anthropic provider pointing at Mantle ────────────────────────────────────
anthropic:
  base_url: "https://bedrock-mantle.us-east-1.api.aws/anthropic"
```

The Anthropic SDK appends `/v1/messages` to `base_url` automatically.

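
To make the URL layout concrete, the sketch below derives the `base_url` for a region and the final request URL the SDK would end up calling. `mantle_base_url` and `anthropic_messages_url` are hypothetical helpers written for this illustration; the host pattern and path suffixes are the ones given in this document.

```python
def mantle_base_url(region: str, api: str) -> str:
    """Return the Mantle base_url for the given region and API flavour."""
    root = f"https://bedrock-mantle.{region}.api.aws"
    if api == "anthropic":
        return f"{root}/anthropic"   # used by the `anthropic` provider
    if api == "openai":
        return f"{root}/v1"          # used by the `openai` provider
    raise ValueError(f"unknown Mantle API flavour: {api!r}")


def anthropic_messages_url(region: str) -> str:
    """Show the full URL after the SDK appends /v1/messages."""
    return mantle_base_url(region, "anthropic") + "/v1/messages"


if __name__ == "__main__":
    print(mantle_base_url("us-east-1", "anthropic"))
    print(anthropic_messages_url("us-east-1"))
```
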
> **Why `mantle: true` is required.** Fast-agent's built-in `ModelDatabase`
> entries for Claude Opus 4.7 and Haiku 4.5 declare features that the
> Anthropic API supports but the Mantle endpoint rejects:
> `anthropic-beta: code-execution-web-tools-...` headers, extended thinking,
> task budget, web search/fetch tools, and prompt caching in some
> configurations. When Mantle sees a request carrying those features, it
> responds with a confusingly generic `{"type": "not_found_error",
> "message": "The model '...' does not exist"}`. Pallas reads the `mantle`
> flag and writes an entry into fast-agent's `_PROVIDER_MODEL_OVERRIDES`
> dict for `(Provider.ANTHROPIC, <model>)` that strips those fields, so
> fast-agent sends a plain Messages API request that Mantle accepts.

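
The net effect of the override can be pictured as a filter over the outgoing request. The sketch below is only an illustration of that effect and is not how Pallas implements it: Pallas changes the model's declared capabilities so that fast-agent never generates these fields in the first place. `strip_for_mantle` is a hypothetical function, the field names (`thinking`, `cache_control`, `anthropic-beta`, `web_search`/`web_fetch` tool types) are assumptions based on the public Anthropic Messages API, and task budgets are left out because this document does not name the field.

```python
# Assumption: web tools are identified by a versioned `type` with these prefixes.
WEB_TOOL_PREFIXES = ("web_search", "web_fetch")


def _drop_cache_control(node):
    """Return a copy of a JSON-like structure without any cache_control keys."""
    if isinstance(node, dict):
        return {
            key: _drop_cache_control(value)
            for key, value in node.items()
            if key != "cache_control"
        }
    if isinstance(node, list):
        return [_drop_cache_control(item) for item in node]
    return node


def strip_for_mantle(headers: dict, body: dict) -> tuple[dict, dict]:
    """Return copies of headers and body without the features Mantle rejects."""
    clean_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() != "anthropic-beta"
    }
    clean_body = _drop_cache_control(body)   # new structure; input is untouched
    clean_body.pop("thinking", None)         # extended thinking
    tools = [
        tool
        for tool in clean_body.get("tools", [])
        if not str(tool.get("type", "")).startswith(WEB_TOOL_PREFIXES)
    ]
    if tools:
        clean_body["tools"] = tools
    else:
        clean_body.pop("tools", None)
    return clean_headers, clean_body
```
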
### `fastagent.secrets.yaml`

```yaml
anthropic:
  api_key: "${BEDROCK_API_KEY}"
```

### `.env`

```dotenv
BEDROCK_API_KEY=your-bedrock-long-term-api-key
```

### `agents.yaml`

No Bedrock-specific changes are needed. Example:

```yaml
name: my-project
version: "1.0.0"
host: my-host.example.com
registry_port: 8200

agents:
  jarvis:
    module: agents.jarvis
    port: 8201
    title: Jarvis
    description: "My assistant"
```

### IAM permissions

No IAM permissions are required when using a long-term Bedrock API key. The key itself carries the necessary access. If you need to restrict which models the key can invoke, use resource-based policies in the Bedrock console.

---

## Path 3: Mantle — OpenAI Chat Completions

Mantle exposes an OpenAI-compatible Chat Completions endpoint (`/v1`) for non-Anthropic models such as MiniMax M2.5. Fast-agent's `openai` provider (or `generic` provider) can point at this endpoint.

**Supported models:** MiniMax M2.5 (`minimax.minimax-m2.5`), and any other Bedrock model that Mantle exposes via Chat Completions.

### Prerequisites

1. **Generate a long-term Bedrock API key** from the [Amazon Bedrock console](https://console.aws.amazon.com/bedrock/home#/api-keys/long-term/create).

2. **Enable model access** in the Bedrock console for your target region.

### `fastagent.config.yaml`

```yaml
default_model: openai.minimax.minimax-m2.5

# ── Model Capabilities ──────────────────────────────────────────────────────
model_capabilities:
  vision: false
  context_window: 196000
  max_output_tokens: 8000

# ── OpenAI provider pointing at Mantle ───────────────────────────────────────
openai:
  base_url: "https://bedrock-mantle.us-east-1.api.aws/v1"
```

### `fastagent.secrets.yaml`

```yaml
openai:
  api_key: "${BEDROCK_API_KEY}"
```

### `.env`

```dotenv
BEDROCK_API_KEY=your-bedrock-long-term-api-key
```

---

## Health Checks

### Startup preflight

Pallas's `validate_llm_providers()` runs at startup and checks:

| Provider | What is checked |
|---|---|
| `anthropic` | `GET {base_url}/v1/models/{model}`: confirms that the model exists and the key is valid |
| `openai` | `GET {base_url}/models`: lists models and confirms that the configured model is present |
| `bedrock` | **No preflight check**; credential errors surface on the first inference call |

For the `bedrock` provider, startup will succeed even with missing or invalid credentials. The first agent call will raise a `ProviderKeyError` with a message directing you to configure AWS credentials.

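
When a preflight fails, it can be useful to reproduce the probe by hand. The sketch below only builds the URL each provider's check would request, following the table above; `preflight_url` is a hypothetical helper and does not reflect the internals of `validate_llm_providers()`.

```python
from typing import Optional


def preflight_url(provider: str, base_url: str, model: str) -> Optional[str]:
    """Return the URL probed at startup, or None when there is no preflight."""
    base = base_url.rstrip("/")
    if provider == "anthropic":
        return f"{base}/v1/models/{model}"
    if provider == "openai":
        return f"{base}/models"
    if provider == "bedrock":
        return None  # errors only surface on the first inference call
    raise ValueError(f"unknown provider: {provider!r}")


if __name__ == "__main__":
    print(preflight_url(
        "anthropic",
        "https://bedrock-mantle.us-east-1.api.aws/anthropic",
        "claude-opus-4-7",
    ))
```
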
### Runtime `get_health` tool

The `get_health` MCP tool probes downstream MCP servers regardless of which LLM provider is active. LLM provider health (from the startup preflight) is included in the response for `anthropic` and `openai` providers. For `bedrock`, the LLM section of the health response will be absent.

---

## Troubleshooting

### `NoCredentialsError` / `ProviderKeyError: AWS credentials not found`

The `bedrock` provider could not find AWS credentials. Check in order:

1. Is `AWS_BEARER_TOKEN_BEDROCK` set in `.env` or the environment?
2. Is `~/.aws/credentials` present and does it contain the expected profile?
3. Is the IAM role attached to the instance/container?

### Model not found in `ModelDatabase`

```
KeyError: 'anthropic.claude-sonnet-4-6'
```

Pallas requires `model_capabilities` in `fastagent.config.yaml` for any model not in fast-agent's built-in database. All Bedrock model IDs fall into this category. Add:

```yaml
model_capabilities:
  vision: true              # or false
  context_window: 1000000
  max_output_tokens: 64000
```

### `ValidationError` on `default_model`

The `default_model` format must be `provider.model-id`. Examples:

```yaml
default_model: bedrock.us.anthropic.claude-sonnet-4-6   # Direct Bedrock, geo inference
default_model: bedrock.anthropic.claude-sonnet-4-6      # Direct Bedrock, in-region
default_model: anthropic.claude-opus-4-7                # Mantle via Anthropic provider
default_model: openai.minimax.minimax-m2.5              # Mantle via OpenAI provider
```

### Cross-region inference access denied

If you use a geo inference ID (e.g. `us.anthropic.claude-sonnet-4-6`) and receive an access denied error, ensure the IAM policy includes `arn:aws:bedrock:*:*:inference-profile/*` in the `Resource` list. In-region model IDs do not require this.

### Mantle 401 Unauthorized

The Bedrock long-term API key is invalid or expired. Regenerate it from the [Bedrock console](https://console.aws.amazon.com/bedrock/home#/api-keys/long-term/create) and update `BEDROCK_API_KEY` in `.env`.

### Claude Sonnet 4.6 on Mantle returns 404

Claude Sonnet 4.6 does not have a Mantle endpoint. Use the `bedrock` provider (Path 1) with model ID `anthropic.claude-sonnet-4-6` or the geo inference ID `us.anthropic.claude-sonnet-4-6`.