r/pallas

Files

Robert Helewka fe94f6a9a8 feat: add Mantle override for AWS Bedrock Anthropic endpoint

Introduce `model_capabilities.mantle` flag that installs a provider-specific
override in fast-agent's `ModelDatabase._PROVIDER_MODEL_OVERRIDES` to strip
features the AWS Bedrock Mantle endpoint rejects (beta headers, extended
thinking, task budgets, web tools, prompt caching).

Without this override, fast-agent sends default beta headers and `thinking`
parameters for modern Claude models that Mantle rejects with a misleading
404 "model does not exist" error.

2026-05-12 07:41:41 -04:00

14 KiB

Raw Blame History

AWS Bedrock Integration

Pallas supports AWS Bedrock through three integration paths, depending on the model and endpoint:

Path	fast-agent provider	Auth	Use when
Direct Bedrock	`bedrock`	AWS IAM / long-term key	Any Bedrock model; required for Sonnet 4.6
Mantle → Anthropic	`anthropic`	Bedrock long-term API key	Claude models with Mantle support (Haiku 4.5, Opus 4.7)
Mantle → OpenAI	`openai`	Bedrock long-term API key	Non-Anthropic models on Mantle (MiniMax M2.5, etc.)

Mantle is AWS's OpenAI-compatible and Anthropic-compatible gateway for Bedrock. It simplifies authentication (one long-term API key instead of IAM credential management) and is the recommended path when the target model supports it.

Supported Models

Model	Bedrock model ID	Direct Bedrock	Mantle
Claude Haiku 4.5	`anthropic.claude-haiku-4-5-20251001-v1:0`	✓	✓ (Anthropic Messages API)
Claude Sonnet 4.6	`anthropic.claude-sonnet-4-6`	✓	✗
Claude Opus 4.7	`anthropic.claude-opus-4-7`	✓	✓ (Anthropic Messages API)
MiniMax M2.5	`minimax.minimax-m2.5`	✓	✓ (OpenAI Chat Completions)

Cross-region inference IDs (e.g. us.anthropic.claude-opus-4-7, eu.anthropic.claude-sonnet-4-6) can be used as the model ID for the bedrock provider to route across regions within a geography for higher throughput.

Path 1: Direct Bedrock (Converse API)

Fast-agent's bedrock provider calls the AWS Bedrock Converse API via boto3. This path works for all Bedrock models and is the only option for models without Mantle support (e.g. Claude Sonnet 4.6).

Prerequisites

Install boto3 — not included in fast-agent by default:

# pyproject.toml
dependencies = [
    "pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git",
    "boto3",
]

AWS credentials — the Bedrock provider uses the standard AWS credential chain in priority order:
- AWS_BEARER_TOKEN_BEDROCK environment variable (long-term Bedrock API key — see below)
- AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY environment variables
- ~/.aws/credentials file (named profile or default)
- IAM instance role (EC2, ECS, Lambda)
The simplest approach for a server deployment is a long-term Bedrock API key generated from the Amazon Bedrock console. Set it as AWS_BEARER_TOKEN_BEDROCK.
Enable model access in the Bedrock console for your target region.

`fastagent.config.yaml`

default_model: bedrock.us.anthropic.claude-sonnet-4-6

# ── Model Capabilities ──────────────────────────────────────────────────────
# Required: Bedrock model IDs are not in fast-agent's ModelDatabase.
model_capabilities:
  vision: true                  # true for Claude models (image input supported)
  context_window: 1000000       # 1M for Sonnet 4.6
  max_output_tokens: 64000

# ── Bedrock provider ─────────────────────────────────────────────────────────
bedrock:
  region: us-east-1             # or set AWS_REGION / AWS_DEFAULT_REGION
  profile: default              # optional; or set AWS_PROFILE
  reasoning: medium             # optional: minimal | low | medium | high

The default_model format is bedrock.<model-id>. Use a cross-region inference ID (e.g. us.anthropic.claude-sonnet-4-6) for geo-distributed routing, or the plain model ID (e.g. anthropic.claude-sonnet-4-6) for in-region only.

`fastagent.secrets.yaml`

No API key entry is needed — credentials come from the AWS credential chain. If you are using a long-term Bedrock API key, set it in .env or the environment:

# fastagent.secrets.yaml — nothing required for Bedrock credentials
# AWS credentials are read from environment variables or ~/.aws/credentials

`.env`

# Long-term Bedrock API key (recommended for server deployments)
AWS_BEARER_TOKEN_BEDROCK=your-bedrock-api-key

# Or use IAM access keys
# AWS_ACCESS_KEY_ID=AKIA...
# AWS_SECRET_ACCESS_KEY=...

AWS_REGION=us-east-1

`agents.yaml`

No Bedrock-specific changes are needed. The default_model in fastagent.config.yaml is picked up automatically:

name: my-project
version: "1.0.0"
host: my-host.example.com
registry_port: 8200

agents:
  jarvis:
    module: agents.jarvis
    port: 8201
    title: Jarvis
    description: "My assistant"

To use a different Bedrock model for a specific agent, set model on the agent entry:

agents:
  jarvis:
    module: agents.jarvis
    port: 8201
    model: bedrock.us.anthropic.claude-haiku-4-5-20251001-v1:0
    model_capabilities:
      vision: true
      context_window: 200000
      max_output_tokens: 64000

Model capability reference

Model	`vision`	`context_window`	`max_output_tokens`
Claude Haiku 4.5	`true`	`200000`	`64000`
Claude Sonnet 4.6	`true`	`1000000`	`64000`
Claude Opus 4.7	`true`	`1000000`	`128000`
MiniMax M2.5	`false`	`196000`	`8000`

IAM permissions

The IAM principal (user, role, or instance profile) needs:

{
  "Effect": "Allow",
  "Action": [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream"
  ],
  "Resource": "arn:aws:bedrock:*::foundation-model/*"
}

For cross-region inference, also allow:

{
  "Effect": "Allow",
  "Action": [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream"
  ],
  "Resource": "arn:aws:bedrock:*:*:inference-profile/*"
}

Terraform snippet

resource "aws_iam_policy" "bedrock_invoke" {
  name = "bedrock-invoke"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream",
        ]
        Resource = [
          "arn:aws:bedrock:*::foundation-model/*",
          "arn:aws:bedrock:*:*:inference-profile/*",
        ]
      }
    ]
  })
}

Path 2: Mantle — Anthropic Messages API

Mantle exposes the Anthropic Messages API for supported Claude models. Fast-agent's anthropic provider uses the Anthropic Python SDK (AsyncAnthropic), which calls /v1/messages — exactly what Mantle serves at https://bedrock-mantle.{region}.api.aws/anthropic.

Supported models: Claude Haiku 4.5, Claude Opus 4.7. Claude Sonnet 4.6 does not have a Mantle endpoint and must use Path 1.

Note on Opus 4.7 and Chat Completions: The AWS model card notes that Opus 4.7 does not support Chat Completions on Mantle. This does not affect fast-agent — the anthropic provider uses the Anthropic Messages API, not Chat Completions.

Prerequisites

Generate a long-term Bedrock API key from the Amazon Bedrock console.
Enable model access in the Bedrock console for your target region.
No additional Python packages needed — anthropic is already a fast-agent dependency.

`fastagent.config.yaml`

default_model: anthropic.claude-opus-4-7

# ── Model Capabilities ──────────────────────────────────────────────────────
# mantle: true is REQUIRED — it installs a Pallas-level provider override that
# strips the features the Mantle endpoint rejects (anthropic-beta headers,
# extended thinking, task budget, web tools, prompt caching). Without this
# flag fast-agent sends those features and Mantle returns a misleading
# 404 "model does not exist" error.
model_capabilities:
  vision: true
  context_window: 1000000
  max_output_tokens: 128000
  mantle: true

# ── Anthropic provider pointing at Mantle ────────────────────────────────────
anthropic:
  base_url: "https://bedrock-mantle.us-east-1.api.aws/anthropic"

The Anthropic SDK appends /v1/messages to base_url automatically.

Why mantle: true is required. Fast-agent's built-in ModelDatabase entries for Claude Opus 4.7 and Haiku 4.5 declare features that the Anthropic API supports but the Mantle endpoint rejects — anthropic-beta: code-execution-web-tools-... headers, extended thinking, task budget, web search/fetch tools, and prompt caching in some configurations. When Mantle sees a request carrying those features it responds with a confusingly generic {"type": "not_found_error", "message": "The model '...' does not exist"}. Pallas reads the mantle flag and writes an entry into fast-agent's _PROVIDER_MODEL_OVERRIDES dict for (Provider.ANTHROPIC, <model>) that strips those fields, so fast-agent sends a plain Messages API request that Mantle accepts.

`fastagent.secrets.yaml`

anthropic:
  api_key: "${BEDROCK_API_KEY}"

`.env`

BEDROCK_API_KEY=your-bedrock-long-term-api-key

`agents.yaml`

No Bedrock-specific changes needed. Example:

name: my-project
version: "1.0.0"
host: my-host.example.com
registry_port: 8200

agents:
  jarvis:
    module: agents.jarvis
    port: 8201
    title: Jarvis
    description: "My assistant"

IAM permissions

No IAM permissions are required when using a long-term Bedrock API key. The key itself carries the necessary access. If you need to restrict which models the key can invoke, use resource-based policies in the Bedrock console.

Path 3: Mantle — OpenAI Chat Completions

Mantle exposes an OpenAI-compatible Chat Completions endpoint (/v1) for non-Anthropic models such as MiniMax M2.5. Fast-agent's openai provider (or generic provider) can point at this endpoint.

Supported models: MiniMax M2.5 (minimax.minimax-m2.5), and any other Bedrock model that Mantle exposes via Chat Completions.

Prerequisites

Generate a long-term Bedrock API key from the Amazon Bedrock console.
Enable model access in the Bedrock console for your target region.

`fastagent.config.yaml`

default_model: openai.minimax.minimax-m2.5

# ── Model Capabilities ──────────────────────────────────────────────────────
model_capabilities:
  vision: false
  context_window: 196000
  max_output_tokens: 8000

# ── OpenAI provider pointing at Mantle ───────────────────────────────────────
openai:
  base_url: "https://bedrock-mantle.us-east-1.api.aws/v1"

`fastagent.secrets.yaml`

openai:
  api_key: "${BEDROCK_API_KEY}"

`.env`

BEDROCK_API_KEY=your-bedrock-long-term-api-key

Health Checks

Startup preflight

Pallas's validate_llm_providers() runs at startup and checks:

Provider	What is checked
`anthropic`	`GET {base_url}/v1/models/{model}` — confirms model exists and key is valid
`openai`	`GET {base_url}/models` — lists models, confirms configured model is present
`bedrock`	No preflight check — credential errors surface on the first inference call

For the bedrock provider, startup will succeed even with missing or invalid credentials. The first agent call will raise a ProviderKeyError with a message directing you to configure AWS credentials.

Runtime `get_health` tool

The get_health MCP tool probes downstream MCP servers regardless of which LLM provider is active. LLM provider health (from the startup preflight) is included in the response for anthropic and openai providers. For bedrock, the LLM section of the health response will be absent.

Troubleshooting

`NoCredentialsError` / `ProviderKeyError: AWS credentials not found`

The bedrock provider could not find AWS credentials. Check in order:

Is AWS_BEARER_TOKEN_BEDROCK set in .env or the environment?
Is ~/.aws/credentials present and does it contain the expected profile?
Is the IAM role attached to the instance/container?

Model not found in `ModelDatabase`

KeyError: 'anthropic.claude-sonnet-4-6'

Pallas requires model_capabilities in fastagent.config.yaml for any model not in fast-agent's built-in database. All Bedrock model IDs fall into this category. Add:

model_capabilities:
  vision: true          # or false
  context_window: 1000000
  max_output_tokens: 64000

`ValidationError` on `default_model`

The default_model format must be provider.model-id. Examples:

default_model: bedrock.us.anthropic.claude-sonnet-4-6   # Direct Bedrock, geo inference
default_model: bedrock.anthropic.claude-sonnet-4-6       # Direct Bedrock, in-region
default_model: anthropic.claude-opus-4-7                 # Mantle via Anthropic provider
default_model: openai.minimax.minimax-m2.5               # Mantle via OpenAI provider

Cross-region inference access denied

If you use a geo inference ID (e.g. us.anthropic.claude-sonnet-4-6) and receive an access denied error, ensure the IAM policy includes arn:aws:bedrock:*:*:inference-profile/* in the Resource list. In-region model IDs do not require this.

Mantle 401 Unauthorized

The Bedrock long-term API key is invalid or expired. Regenerate it from the Bedrock console and update BEDROCK_API_KEY in .env.

Claude Sonnet 4.6 on Mantle returns 404

Claude Sonnet 4.6 does not have a Mantle endpoint. Use the bedrock provider (Path 1) with model ID anthropic.claude-sonnet-4-6 or the geo inference ID us.anthropic.claude-sonnet-4-6.

14 KiB Raw Blame History

AWS Bedrock Integration

Supported Models

Path 1: Direct Bedrock (Converse API)

Prerequisites

fastagent.config.yaml

fastagent.secrets.yaml

.env

agents.yaml

Model capability reference

IAM permissions

Terraform snippet

Path 2: Mantle — Anthropic Messages API

Prerequisites

fastagent.config.yaml

fastagent.secrets.yaml

.env

agents.yaml

IAM permissions

Path 3: Mantle — OpenAI Chat Completions

Prerequisites

fastagent.config.yaml

fastagent.secrets.yaml

.env

Health Checks

Startup preflight

Runtime get_health tool

Troubleshooting

NoCredentialsError / ProviderKeyError: AWS credentials not found

Model not found in ModelDatabase

ValidationError on default_model

Cross-region inference access denied

Mantle 401 Unauthorized

Claude Sonnet 4.6 on Mantle returns 404

14 KiB

Raw Blame History

`fastagent.config.yaml`

`fastagent.secrets.yaml`

`.env`

`agents.yaml`

`fastagent.config.yaml`

`fastagent.secrets.yaml`

`.env`

`agents.yaml`

`fastagent.config.yaml`

`fastagent.secrets.yaml`

`.env`

Runtime `get_health` tool

`NoCredentialsError` / `ProviderKeyError: AWS credentials not found`

Model not found in `ModelDatabase`

`ValidationError` on `default_model`