16 KiB
AWS Bedrock Integration
Pallas supports AWS Bedrock through three integration paths, depending on the model and endpoint:
| Path | fast-agent provider | Auth | Use when |
|---|---|---|---|
| Direct Bedrock | bedrock |
AWS IAM / long-term key | Any Bedrock model; required for Sonnet 4.6 |
| Mantle → Anthropic | anthropic |
Bedrock long-term API key | Claude models with Mantle support (Haiku 4.5, Opus 4.7) |
| Mantle → OpenAI | openai |
Bedrock long-term API key | Non-Anthropic models on Mantle (MiniMax M2.5, etc.) |
Mantle is AWS's OpenAI-compatible and Anthropic-compatible gateway for Bedrock. It simplifies authentication (one long-term API key instead of IAM credential management) and is the recommended path when the target model supports it.
Supported Models
| Model | Bedrock model ID | Direct Bedrock | Mantle |
|---|---|---|---|
| Claude Haiku 4.5 | anthropic.claude-haiku-4-5-20251001-v1:0 |
✓ | ✓ (Anthropic Messages API) |
| Claude Sonnet 4.6 | anthropic.claude-sonnet-4-6 |
✓ | ✗ |
| Claude Opus 4.7 | anthropic.claude-opus-4-7 |
✓ | ✓ (Anthropic Messages API) |
| MiniMax M2.5 | minimax.minimax-m2.5 |
✓ | ✓ (OpenAI Chat Completions) |
Cross-region inference IDs (e.g. us.anthropic.claude-opus-4-7, eu.anthropic.claude-sonnet-4-6) can be used as the model ID for the bedrock provider to route across regions within a geography for higher throughput.
Path 1: Direct Bedrock (Converse API)
Fast-agent's bedrock provider calls the AWS Bedrock Converse API via boto3. This path works for all Bedrock models and is the only option for models without Mantle support (e.g. Claude Sonnet 4.6).
Prerequisites
-
Install
boto3— not included in fast-agent by default:# pyproject.toml dependencies = [ "pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git", "boto3", ] -
AWS credentials — the Bedrock provider uses the standard AWS credential chain in priority order:
AWS_BEARER_TOKEN_BEDROCKenvironment variable (long-term Bedrock API key — see below)AWS_ACCESS_KEY_ID+AWS_SECRET_ACCESS_KEYenvironment variables~/.aws/credentialsfile (named profile ordefault)- IAM instance role (EC2, ECS, Lambda)
The simplest approach for a server deployment is a long-term Bedrock API key generated from the Amazon Bedrock console. Set it as
AWS_BEARER_TOKEN_BEDROCK. -
Enable model access in the Bedrock console for your target region.
fastagent.config.yaml
default_model: bedrock.us.anthropic.claude-sonnet-4-6
# ── Model Capabilities ──────────────────────────────────────────────────────
# Required: Bedrock model IDs are not in fast-agent's ModelDatabase.
model_capabilities:
vision: true # true for Claude models (image input supported)
context_window: 1000000 # 1M for Sonnet 4.6
max_output_tokens: 64000
# ── Bedrock provider ─────────────────────────────────────────────────────────
bedrock:
region: us-east-1 # or set AWS_REGION / AWS_DEFAULT_REGION
profile: default # optional; or set AWS_PROFILE
reasoning: medium # optional: minimal | low | medium | high
The default_model format is bedrock.<model-id>. Use a cross-region inference ID (e.g. us.anthropic.claude-sonnet-4-6) for geo-distributed routing, or the plain model ID (e.g. anthropic.claude-sonnet-4-6) for in-region only.
fastagent.secrets.yaml
No API key entry is needed — credentials come from the AWS credential chain. If you are using a long-term Bedrock API key, set it in .env or the environment:
# fastagent.secrets.yaml — nothing required for Bedrock credentials
# AWS credentials are read from environment variables or ~/.aws/credentials
.env
# Long-term Bedrock API key (recommended for server deployments)
AWS_BEARER_TOKEN_BEDROCK=your-bedrock-api-key
# Or use IAM access keys
# AWS_ACCESS_KEY_ID=AKIA...
# AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
agents.yaml
No Bedrock-specific changes are needed. The default_model in fastagent.config.yaml is picked up automatically:
name: my-project
version: "1.0.0"
host: my-host.example.com
registry_port: 8200
agents:
jarvis:
module: agents.jarvis
port: 8201
title: Jarvis
description: "My assistant"
To use a different Bedrock model for a specific agent, set model on the agent entry:
agents:
jarvis:
module: agents.jarvis
port: 8201
model: bedrock.us.anthropic.claude-haiku-4-5-20251001-v1:0
model_capabilities:
vision: true
context_window: 200000
max_output_tokens: 64000
Model capability reference
| Model | vision |
context_window |
max_output_tokens |
|---|---|---|---|
| Claude Haiku 4.5 | true |
200000 |
64000 |
| Claude Sonnet 4.6 | true |
1000000 |
64000 |
| Claude Opus 4.7 | true |
1000000 |
128000 |
| MiniMax M2.5 | false |
196000 |
8000 |
IAM permissions
The IAM principal (user, role, or instance profile) needs:
{
"Effect": "Allow",
"Action": [
"bedrock:InvokeModel",
"bedrock:InvokeModelWithResponseStream"
],
"Resource": "arn:aws:bedrock:*::foundation-model/*"
}
For cross-region inference, also allow:
{
"Effect": "Allow",
"Action": [
"bedrock:InvokeModel",
"bedrock:InvokeModelWithResponseStream"
],
"Resource": "arn:aws:bedrock:*:*:inference-profile/*"
}
Terraform snippet
resource "aws_iam_policy" "bedrock_invoke" {
name = "bedrock-invoke"
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"bedrock:InvokeModel",
"bedrock:InvokeModelWithResponseStream",
]
Resource = [
"arn:aws:bedrock:*::foundation-model/*",
"arn:aws:bedrock:*:*:inference-profile/*",
]
}
]
})
}
Path 2: Mantle — Anthropic Messages API
Mantle exposes the Anthropic Messages API for supported Claude models. Fast-agent's anthropic provider uses the Anthropic Python SDK (AsyncAnthropic), which calls /v1/messages — exactly what Mantle serves at https://bedrock-mantle.{region}.api.aws/anthropic.
Supported models: Claude Haiku 4.5, Claude Opus 4.7. Claude Sonnet 4.6 does not have a Mantle endpoint and must use Path 1.
Note on Opus 4.7 and Chat Completions: The AWS model card notes that Opus 4.7 does not support Chat Completions on Mantle. This does not affect fast-agent — the
anthropicprovider uses the Anthropic Messages API, not Chat Completions.
Prerequisites
-
Generate a long-term Bedrock API key from the Amazon Bedrock console.
-
Enable model access in the Bedrock console for your target region.
-
No additional Python packages needed —
anthropicis already a fast-agent dependency.
fastagent.config.yaml
default_model: anthropic.claude-opus-4-7
# ── Anthropic provider pointing at Mantle ────────────────────────────────────
anthropic:
base_url: "https://bedrock-mantle.us-east-1.api.aws/anthropic"
That's the whole configuration. Pallas auto-detects the
bedrock-mantle hostname in anthropic.base_url at startup and installs
two compatibility shims so fast-agent's default request shape matches
what Mantle expects (see pallas/mantle_shims.py):
-
Wire-name prefix — re-adds the
anthropic.prefix that fast-agent's parser strips off, because Mantle requires the fullanthropic.claude-opus-4-7wire id. Without this shim you get404 "The model '...' does not exist". -
caller: nullstrip — drops the straycallerfield Anthropic SDK 0.100.x leaks onto replayedtool_useblocks (upstream issue anthropics/anthropic-sdk-python#1454). Mantle's validator rejectscaller: nullwith"tool_use.caller: Input should be a valid dictionary or object", which would otherwise break the MCP tool-use loop on the second turn.
The Anthropic SDK appends /v1/messages to base_url automatically.
Feature support. Mantle accepts the same Messages API request shape
as api.anthropic.com once the shims are in place, including full MCP
tool use (tools, tool_use/tool_result content blocks). Extended
thinking, task budget, web_fetch/web_search server tools, and explicit
prompt caching (cache_control) are not available via Mantle and should
be left off in agent code when targeting Mantle — fast-agent's
ModelDatabase entries already disable the ones the Anthropic SDK 0.100.x
would otherwise auto-attach.
fastagent.secrets.yaml
anthropic:
api_key: "${BEDROCK_API_KEY}"
.env
BEDROCK_API_KEY=your-bedrock-long-term-api-key
agents.yaml
No Bedrock-specific changes needed. Example:
name: my-project
version: "1.0.0"
host: my-host.example.com
registry_port: 8200
agents:
jarvis:
module: agents.jarvis
port: 8201
title: Jarvis
description: "My assistant"
IAM permissions
No IAM permissions are required when using a long-term Bedrock API key. The key itself carries the necessary access. If you need to restrict which models the key can invoke, use resource-based policies in the Bedrock console.
Path 3: Mantle — OpenAI Chat Completions
Mantle exposes an OpenAI-compatible Chat Completions endpoint (/v1) for non-Anthropic models such as MiniMax M2.5. Fast-agent's openai provider (or generic provider) can point at this endpoint.
Supported models: MiniMax M2.5 (minimax.minimax-m2.5), and any other Bedrock model that Mantle exposes via Chat Completions.
Prerequisites
-
Generate a long-term Bedrock API key from the Amazon Bedrock console.
-
Enable model access in the Bedrock console for your target region.
fastagent.config.yaml
default_model: openai.minimax.minimax-m2.5
# ── Model Capabilities ──────────────────────────────────────────────────────
model_capabilities:
vision: false
context_window: 196000
max_output_tokens: 8000
# ── OpenAI provider pointing at Mantle ───────────────────────────────────────
openai:
base_url: "https://bedrock-mantle.us-east-1.api.aws/v1"
fastagent.secrets.yaml
openai:
api_key: "${BEDROCK_API_KEY}"
.env
BEDROCK_API_KEY=your-bedrock-long-term-api-key
Health Checks
Startup preflight
Pallas's validate_llm_providers() runs at startup and caches a status for the active provider (the one named by default_model). The cached value is read back by get_health() on every MCP get_health tool call, so Daedalus (or any headless consumer) can see why an agent is degraded when there's no fast-agent TUI to surface it.
Preflight probes are deliberately chosen to be free of inference tokens. Each provider has a dedicated probe:
| Provider | Probe |
|---|---|
anthropic (direct — api.anthropic.com or empty base_url) |
GET {base_url}/models/{model} — confirms model exists and the API key is valid |
anthropic (Mantle — bedrock-mantle.{region}.api.aws/anthropic) |
GET {region_root}/v1/models/{wire_model} — Mantle serves its model catalogue at the region root, not under /anthropic; Pallas strips the /anthropic suffix and applies pallas.mantle_shims.MANTLE_WIRE_NAMES to turn claude-opus-4-7 into anthropic.claude-opus-4-7. The IAM policy for the long-term Bedrock API key must include bedrock-mantle:ListModels / bedrock-mantle:GetModel for this probe to return 200. |
openai |
GET {base_url}/models — lists models, confirms configured model is present |
generic |
GET {base_url}/models — status-code-only probe (body is not inspected). llama.cpp's /v1/models response isn't strictly OpenAI-shaped and users hot-swap models by name, so a 200 is enough |
bedrock |
No HTTP request. ok when any of AWS_BEARER_TOKEN_BEDROCK, AWS_ACCESS_KEY_ID+AWS_SECRET_ACCESS_KEY, AWS_PROFILE, or ~/.aws/credentials is present; error otherwise. Bedrock's Converse API has no cheap health endpoint and the first inference call will surface any real credential problem within seconds |
| Unknown / malformed provider | No HTTP request; error: unknown provider 'X' in default_model. Prevents silent "looks degraded" lies when default_model is mistyped |
API key resolution for every provider goes through fast_agent.llm.provider_key_manager.ProviderKeyManager.get_api_key, so the preflight reads keys from the exact same place the real LLM client does — config file, env var, Codex OAuth, HF hub, etc. Duplicate key-loading logic inside pallas.health has been removed.
Runtime get_health tool
The get_health MCP tool probes downstream MCP servers on every call and includes the cached LLM preflight status in the response. If the active provider's cached status isn't ok, get_health returns status: degraded with an LLM: <provider>: <message> prefix appended to the message field.
Troubleshooting
NoCredentialsError / ProviderKeyError: AWS credentials not found
The bedrock provider could not find AWS credentials. Check in order:
- Is
AWS_BEARER_TOKEN_BEDROCKset in.envor the environment? - Is
~/.aws/credentialspresent and does it contain the expected profile? - Is the IAM role attached to the instance/container?
Model not found in ModelDatabase
KeyError: 'anthropic.claude-sonnet-4-6'
Pallas requires model_capabilities in fastagent.config.yaml for any model not in fast-agent's built-in database. All Bedrock model IDs fall into this category. Add:
model_capabilities:
vision: true # or false
context_window: 1000000
max_output_tokens: 64000
ValidationError on default_model
The default_model format must be provider.model-id. Examples:
default_model: bedrock.us.anthropic.claude-sonnet-4-6 # Direct Bedrock, geo inference
default_model: bedrock.anthropic.claude-sonnet-4-6 # Direct Bedrock, in-region
default_model: anthropic.claude-opus-4-7 # Mantle via Anthropic provider
default_model: openai.minimax.minimax-m2.5 # Mantle via OpenAI provider
Cross-region inference access denied
If you use a geo inference ID (e.g. us.anthropic.claude-sonnet-4-6) and receive an access denied error, ensure the IAM policy includes arn:aws:bedrock:*:*:inference-profile/* in the Resource list. In-region model IDs do not require this.
Mantle 401 Unauthorized
The Bedrock long-term API key is invalid or expired. Regenerate it from the Bedrock console and update BEDROCK_API_KEY in .env.
Claude Sonnet 4.6 on Mantle returns 404
Claude Sonnet 4.6 does not have a Mantle endpoint. Use the bedrock provider (Path 1) with model ID anthropic.claude-sonnet-4-6 or the geo inference ID us.anthropic.claude-sonnet-4-6.