9 Commits

Author SHA1 Message Date
fe94f6a9a8 feat: add Mantle override for AWS Bedrock Anthropic endpoint
Introduce `model_capabilities.mantle` flag that installs a provider-specific
override in fast-agent's `ModelDatabase._PROVIDER_MODEL_OVERRIDES` to strip
features the AWS Bedrock Mantle endpoint rejects (beta headers, extended
thinking, task budgets, web tools, prompt caching).

Without this override, fast-agent sends default beta headers and `thinking`
parameters for modern Claude models that Mantle rejects with a misleading
404 "model does not exist" error.
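
As a rough illustration of what stripping rejected features can look like at the request level, here is a minimal sketch. It does not use fast-agent's `ModelDatabase` API; `strip_for_mantle` and the two key lists are hypothetical names, and only two of the listed features are covered (the `anthropic-beta` header and the `thinking` body parameter).

```python
from copy import deepcopy

# Hypothetical key lists: the commit names feature categories, not exact keys.
UNSUPPORTED_HEADERS = ("anthropic-beta",)
UNSUPPORTED_BODY_KEYS = ("thinking",)


def strip_for_mantle(headers: dict, body: dict) -> tuple[dict, dict]:
    """Return copies of headers and body without the fields assumed rejected."""
    # Header names are case-insensitive, so compare in lower case.
    clean_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() not in UNSUPPORTED_HEADERS
    }
    # Deep-copy so the caller's request body is never mutated.
    clean_body = deepcopy(body)
    for key in UNSUPPORTED_BODY_KEYS:
        clean_body.pop(key, None)
    return clean_headers, clean_body
```

Returning copies instead of mutating in place keeps the original request available for providers that do accept these fields.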
2026-05-12 07:41:41 -04:00
4b954ed842 docs: add Claude Haiku 4.5 model card documentation
Add comprehensive model card for Anthropic's Claude Haiku 4.5 on AWS
Bedrock, including model details, capabilities, pricing, programmatic
access examples, and regional availability information.
2026-05-12 06:29:46 -04:00
49da024877 docs: add Incidents & Lessons Learned section for Pallas<->Mnemosyne saga
Capture the five debugging chapters from the bearer-forwarding rollout
so the knowledge doesn't live only in chat history:

  1. Per-request bearer across anyio.TaskGroup boundary (ContextVar
     snapshot semantics, httpx auth-header caching on persistent
     connections, forward_inbound_auth pydantic-drop workaround).
  2. install() idempotency guard shadowing three newly-added
     monkey-patches — each patch now owns its own sentinel.
  3. FastMCP on_call_tool context shape: context.message.name, not
     context.message.params.name.  Extractor returning None silently
     killed the _PUBLIC_TOOLS bypass and downstream dispatch "await
     None(...)" produced the terse 'object NoneType can't be used in
     await expression' string that blocked Harper<->Mnemosyne.
  4. Rich-TUI corruption by DEBUG openai/sse_starlette/mcp via root
     logger inheriting logger.level=debug + our stderr StreamHandler.
     Fixed by PALLAS_LOG_STDERR gate and PALLAS_ROOT_LOG_LEVEL split.
  5. Current state table of PALLAS_LOG_* knobs + jq tail recipe.

Also add pallas.log and pallas._fastagent_patch to the Module Reference
table.
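
Chapter 2 (each patch owning its own sentinel) can be sketched as follows. This is an illustration under assumptions, not the code in pallas._fastagent_patch: `patch_once`, `Transport`, and the sentinel names are hypothetical.

```python
import functools


def patch_once(owner, attr_name, sentinel, wrapper_factory):
    """Wrap owner.attr_name unless this specific patch is already applied."""
    original = getattr(owner, attr_name)
    if getattr(original, sentinel, False):
        return False  # only this patch is skipped; other patches still run
    wrapped = wrapper_factory(original)
    # update_wrapper copies __dict__, so sentinels of earlier patches on the
    # same attribute stay visible on the newest wrapper.
    functools.update_wrapper(wrapped, original)
    setattr(wrapped, sentinel, True)
    setattr(owner, attr_name, wrapped)
    return True


class Transport:  # hypothetical stand-in for a patched class
    def headers(self):
        return {}


def add_auth(original):
    def wrapper(self):
        return {**original(self), "Authorization": "Bearer demo"}
    return wrapper


def add_trace(original):
    def wrapper(self):
        return {**original(self), "X-Trace": "1"}
    return wrapper


def install():
    """Safe to call repeatedly; reports which patches were newly applied."""
    return [
        patch_once(Transport, "headers", "_patched_auth", add_auth),
        patch_once(Transport, "headers", "_patched_trace", add_trace),
    ]
```

With a single global "already installed" flag, a patch added later would never run in a process where the flag was already set; per-patch sentinels avoid that.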
2026-05-07 06:32:24 -04:00
679a809f66 Fix bearer forwarding across anyio TaskGroup boundary
The Mnemosyne Authorization: Bearer token was being dropped on outbound MCP
calls because fast-agent runs downstream transports inside a long-lived
anyio TaskGroup whose context is snapshotted at manager startup —
request_bearer_token.get() inside _prepare_headers_and_auth therefore
always resolved to None even when the request handler had just set it.
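
The snapshot behaviour can be reproduced with the standard library alone. The sketch below uses asyncio in place of anyio (an assumption made to keep it dependency-free): a long-lived task copies the context when it is created, so a value set afterwards by the request handler is not visible inside it. `transport_worker` and the queues are hypothetical stand-ins for the downstream transport.

```python
import asyncio
from contextvars import ContextVar

request_bearer_token: ContextVar = ContextVar("request_bearer_token", default=None)


async def transport_worker(requests: asyncio.Queue, results: asyncio.Queue):
    """Long-lived task; its context was copied when the task was created."""
    while True:
        await requests.get()
        await results.put(request_bearer_token.get())


async def main():
    requests, results = asyncio.Queue(), asyncio.Queue()
    # The worker's context snapshot is taken here, before any token is set.
    worker = asyncio.create_task(transport_worker(requests, results))

    # Later, the "request handler" sets the token in its own context.
    request_bearer_token.set("secret-token")
    await requests.put("call")
    seen_by_worker = await results.get()
    seen_by_handler = request_bearer_token.get()

    worker.cancel()
    return seen_by_worker, seen_by_handler


seen = asyncio.run(main())
# seen == (None, "secret-token"): the worker never observes the later set().
```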

Fix:
* pallas/_fastagent_patch.py
    - add _pending_bearers registry keyed by id(server_config) with a
      threading.Lock; publish_bearer / revoke_bearer helpers.
    - patched _prepare_headers_and_auth reads the registry first, falls
      back to the ContextVar for non-persistent probe paths.
    - emit INFO log on install() so the journal shows the patch ran;
      verbose flow logs at DEBUG on pallas.forward.

* pallas/multimodal_server.py
    - send_message resolves the agent's opted-in downstreams, publishes
      the inbound bearer for each, and revokes them all in the finally.
    - bearer/header diagnostics go to pallas.auth (DEBUG) instead of
      /tmp/pallas-bearer.log which is invisible under systemd PrivateTmp.

* pallas/log.py
    - honour PALLAS_LOG_LEVEL env var (default INFO) so operators can
      flip the forward/auth diagnostics on without a code change.

* docs/pallas.md, docs/mnemosyne_integration.md
    - document the registry-based forwarding and the task-group
      ContextVar constraint that forced it.
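
A minimal sketch of the registry described above, assuming nothing about the real module beyond the names in this message. `lookup_bearer` and `handle_request` are hypothetical helpers added for illustration.

```python
import threading
from typing import Callable, Iterable, Optional

# Plain process-wide state: visible from any task or thread, unlike a ContextVar.
_pending_bearers: dict[int, str] = {}
_lock = threading.Lock()


def publish_bearer(server_config: object, token: str) -> None:
    with _lock:
        _pending_bearers[id(server_config)] = token


def revoke_bearer(server_config: object) -> None:
    with _lock:
        _pending_bearers.pop(id(server_config), None)


def lookup_bearer(server_config: object, fallback: Optional[str] = None) -> Optional[str]:
    """Hypothetical read path: registry first, then the caller's fallback."""
    with _lock:
        return _pending_bearers.get(id(server_config), fallback)


def handle_request(downstreams: Iterable[object], token: str, call: Callable[[], object]):
    """Publish the bearer for each downstream, run the call, always revoke."""
    configs = list(downstreams)
    for config in configs:
        publish_bearer(config, token)
    try:
        return call()
    finally:
        for config in configs:
            revoke_bearer(config)
```

In this sketch the key is `id(server_config)`, so two overlapping requests for the same downstream would overwrite each other's entry; whether that matters depends on how requests are serialised in the real server.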
2026-05-05 12:09:51 -04:00
705b4f8cbe Docs: update
2026-05-04 15:34:51 -04:00
be71709608 feat(pallas): add opt-in bearer token forwarding to downstream MCP servers
Introduce per-server `forward_inbound_auth` flag that controls whether the
inbound MCP bearer token is propagated to outbound MCP transport calls.
Implemented as a fast-agent monkey-patch auto-installed on package import,
preventing accidental credential leakage to unrelated downstream servers.

Update docs to describe the two bearer token consumers (LLM provider
passthrough and opt-in downstream MCP forwarding) with a config example.
2026-05-03 17:17:50 -04:00
95fa6e6fc0 feat!: stateless per-request agents; add history + conversation_id to send_message
Make Pallas truly stateless per the 'Pallas is ephemeral' contract.

BREAKING (behavioural, not API):
  * instance_scope changes from 'shared' to 'request' in pallas.server.
    Each MCP tools/call now acquires a freshly-created fast-agent instance
    via the existing create_instance / dispose_instance factories and
    disposes it immediately after the response.

With 'shared' mode:
  * Every MCP caller saw the same agent.message_history, so different
    Daedalus conversations leaked into each other.
  * Mid-chat context was silently truncated once the model window filled.
  * Restarting the Pallas process wiped all in-flight conversation state,
    even though Daedalus had it persisted in Postgres.

With 'request' mode the Pallas process holds no per-conversation state;
the caller (Daedalus) owns history and reseeds it on every turn.
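
The per-request lifecycle can be pictured as an async context manager around the two factories. This is a sketch only: the real create_instance / dispose_instance signatures are not shown in this message, so the zero-argument and one-argument forms below are assumptions, and `FakeAgent` is a hypothetical stand-in for a fast-agent instance.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def request_scoped(create_instance, dispose_instance):
    """Yield a fresh instance and dispose it even if the call fails."""
    instance = await create_instance()
    try:
        yield instance
    finally:
        await dispose_instance(instance)


class FakeAgent:  # hypothetical stand-in for an agent instance
    def __init__(self):
        self.message_history = []

    async def send(self, text):
        self.message_history.append(text)
        return f"seen {len(self.message_history)} message(s)"


async def handle_call(history, message):
    disposed = []

    async def create():
        return FakeAgent()

    async def dispose(instance):
        disposed.append(instance)

    async with request_scoped(create, dispose) as agent:
        agent.message_history.extend(history)  # caller-owned history is reseeded
        reply = await agent.send(message)
    return reply, len(disposed)
```

Because every call starts from an empty instance, the reply depends only on the history the caller passes in.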

send_message gains two optional arguments:
  * history: list[{role, content, images?}] in chronological order,
    converted to PromptMessageExtended and seeded onto the fresh
    instance's message_history before agent.send().
  * conversation_id: opaque string, logged for trace correlation only —
    Pallas never interprets or persists it.

Malformed history entries (bad role, missing image data/mime_type, etc.)
are skipped with a warning rather than raising, so a single bad row
cannot wipe a whole conversation.
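
The skip-with-warning policy might look like the sketch below. The accepted roles, the requirement that content be a string, and the names `clean_history` and `_find_problem` are assumptions; the conversion to PromptMessageExtended is left out.

```python
import logging

logger = logging.getLogger("history_sketch")

VALID_ROLES = {"user", "assistant"}  # assumption: the real set is not stated


def _find_problem(entry):
    """Return a short reason string for a bad entry, or None if it is usable."""
    if not isinstance(entry, dict):
        return "not a mapping"
    role = entry.get("role")
    if not isinstance(role, str) or role not in VALID_ROLES:
        return f"bad role {role!r}"
    if not isinstance(entry.get("content"), str):
        return "missing or non-string content"
    for image in entry.get("images") or []:
        if not isinstance(image, dict) or not image.get("data") or not image.get("mime_type"):
            return "image missing data or mime_type"
    return None


def clean_history(history):
    """Keep valid entries in order; log and skip the rest instead of raising."""
    kept = []
    for index, entry in enumerate(history):
        problem = _find_problem(entry)
        if problem:
            logger.warning("skipping history entry %d: %s", index, problem)
            continue
        kept.append(entry)
    return kept
```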

The {agent}_history MCP prompt is still registered under 'request'
scope for backward compatibility but always returns []; history lives
on the client.

Version bumped to 0.2.0.
2026-04-27 08:16:59 -04:00
35cc2143b1 Update Red Panda Standards Doc
2026-04-10 14:01:25 +00:00
0cea5ece3a feat: add /healthz and /metrics endpoints, replace print with logging
- Add /healthz endpoint returning LLM provider validation status
- Add /metrics endpoint serving Prometheus metrics via prometheus_client
- Replace all print() calls in health.py with proper logging module
- Remove _PREFIX variable in favor of structured logger context
2026-04-10 11:22:26 +00:00