Commit Graph

6 Commits

Author SHA1 Message Date
ea37ab38c1 feat: add loop guard to halt repeated-identical tool call loops
Introduces `pallas.loop_guard` module that detects and halts agentic loops
where the same `(tool, args) → result` repeats consecutively, preventing
wasted LLM turns when upstream MCP servers return contradictory data.

- Add per-request `ToolRunnerHooks` tracking rolling tool-call signatures
- Halt loop after `loop_repeat_threshold` consecutive repeats (default 3)
- Collapse `max_iterations` on halt to terminate without a further LLM call
- Append user-facing explanation to the turn with `stop_reason=endTurn`
- Expose `pallas_agent_loop_aborted_total{agent,reason}` counter
- Add per-agent `max_iterations` and `loop_repeat_threshold` config
- Document guard behavior, metric, and alerting query
2026-06-16 08:27:07 -04:00
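
The repeat detection described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual `pallas.loop_guard` code: the `LoopGuard` class, its `observe` method, and the JSON-based signature are all assumptions made for the example. Only the default threshold of 3 comes from the commit message.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class LoopGuard:
    """Hypothetical per-request guard; not the real pallas.loop_guard API."""

    repeat_threshold: int = 3  # mirrors the documented default
    _last_signature: Optional[str] = None
    _repeats: int = 0

    def observe(self, tool: str, args: dict[str, Any], result: Any) -> bool:
        """Record one tool call; return True once the loop should halt."""
        # Canonical JSON so that argument key order does not change the signature.
        signature = json.dumps(
            {"tool": tool, "args": args, "result": result},
            sort_keys=True,
            default=str,
        )
        if signature == self._last_signature:
            self._repeats += 1
        else:
            # Any different call or result restarts the consecutive count.
            self._last_signature = signature
            self._repeats = 1
        return self._repeats >= self.repeat_threshold


guard = LoopGuard()
outcomes = [guard.observe("search", {"q": "x"}, "no rows") for _ in range(3)]
after_change = guard.observe("search", {"q": "y"}, "no rows")
```

Here `outcomes` ends with a halt signal on the third identical call, and a call with different arguments resets the count. Comparing the result as well as the arguments means a tool that is polled repeatedly but returns fresh data is not flagged.
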
ca7d714a31 docs(pallas): document sampling parameters and Prometheus metrics
Add two new sections to the Pallas documentation:

- Sampling parameters: explain that temperature/top_p/top_k are
  configured via the fast-agent decorator's `request_params`, with a
  provider support matrix and a note on Claude Opus 4.7 stripping these
  params in favor of `output_config.effort`.
- Metrics: document the Prometheus `/metrics` endpoint exposed on the
  registry port, including scrape config, full metrics reference table,
  and notes on where each metric is captured.
2026-05-23 07:49:21 -04:00
49da024877 docs: add Incidents & Lessons Learned section for Pallas<->Mnemosyne saga
Capture the five debugging chapters from the bearer-forwarding rollout
so the knowledge doesn't live only in chat history:

  1. Per-request bearer across anyio.TaskGroup boundary (ContextVar
     snapshot semantics, httpx auth-header caching on persistent
     connections, forward_inbound_auth pydantic-drop workaround).
  2. install() idempotency guard shadowing three newly-added
     monkey-patches — each patch now owns its own sentinel.
  3. FastMCP on_call_tool context shape: context.message.name, not
     context.message.params.name. The extractor returning None silently
     killed the _PUBLIC_TOOLS bypass, and the downstream dispatch "await
     None(...)" produced the terse 'object NoneType can't be used in
     await expression' string that blocked Harper<->Mnemosyne.
  4. Rich-TUI corruption by DEBUG openai/sse_starlette/mcp via root
     logger inheriting logger.level=debug + our stderr StreamHandler.
     Fixed by PALLAS_LOG_STDERR gate and PALLAS_ROOT_LOG_LEVEL split.
  5. Current state table of PALLAS_LOG_* knobs + jq tail recipe.

Also add pallas.log and pallas._fastagent_patch to the Module Reference
table.
2026-05-07 06:32:24 -04:00
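
Chapter 2 above (one idempotency guard shadowing later patches) is easier to see in code. The sketch below is an illustration only: `apply_once`, the sentinel attribute names, and the toy `lib` namespace are hypothetical and do not reflect the real `pallas._fastagent_patch` layout. It shows the per-patch sentinel idea, where each patch checks its own marker instead of a single install-wide flag.

```python
import types


def apply_once(target, attr, sentinel, make_wrapper) -> bool:
    """Wrap target.attr unless this particular patch was already applied."""
    if getattr(target, sentinel, False):
        return False  # this patch ran before; do not double-wrap
    original = getattr(target, attr)
    setattr(target, attr, make_wrapper(original))
    setattr(target, sentinel, True)
    return True


def shout(original):
    def wrapper(text):
        return original(text).upper()
    return wrapper


def exclaim(original):
    def wrapper(text):
        return original(text) + "!"
    return wrapper


# Toy stand-in for the library being monkey-patched.
lib = types.SimpleNamespace(
    greet=lambda text: f"hi {text}",
    part=lambda text: f"bye {text}",
)


def install():
    # Each patch owns its own sentinel, so adding a new patch later
    # is not skipped just because an older one has already run.
    return [
        apply_once(lib, "greet", "_patched_greet", shout),
        apply_once(lib, "part", "_patched_part", exclaim),
    ]


first = install()
second = install()
```

Calling `install()` twice applies each wrapper exactly once. With a single shared flag, a patch added to `install()` after the first run would never be applied in a process where the flag was already set.
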
679a809f66 Fix bearer forwarding across anyio TaskGroup boundary
The Mnemosyne Authorization: Bearer token was being dropped on outbound MCP
calls because fast-agent runs downstream transports inside a long-lived
anyio TaskGroup whose context is snapshotted at manager startup —
request_bearer_token.get() inside _prepare_headers_and_auth therefore
always resolved to None even when the request handler had just set it.

Fix:
* pallas/_fastagent_patch.py
    - add _pending_bearers registry keyed by id(server_config) with a
      threading.Lock; publish_bearer / revoke_bearer helpers.
    - patched _prepare_headers_and_auth reads the registry first, falls
      back to the ContextVar for non-persistent probe paths.
    - emit INFO log on install() so the journal shows the patch ran;
      verbose flow logs at DEBUG on pallas.forward.

* pallas/multimodal_server.py
    - send_message resolves the agent's opted-in downstreams, publishes
      the inbound bearer for each, and revokes them all in the finally.
    - bearer/header diagnostics go to pallas.auth (DEBUG) instead of
      /tmp/pallas-bearer.log which is invisible under systemd PrivateTmp.

* pallas/log.py
    - honour PALLAS_LOG_LEVEL env var (default INFO) so operators can
      flip the forward/auth diagnostics on without a code change.

* docs/pallas.md, docs/mnemosyne_integration.md
    - document the registry-based forwarding and the task-group
      ContextVar constraint that forced it.
2026-05-05 12:09:51 -04:00
be71709608 feat(pallas): add opt-in bearer token forwarding to downstream MCP servers
Introduce per-server `forward_inbound_auth` flag that controls whether the
inbound MCP bearer token is propagated to outbound MCP transport calls.
Implemented as a fast-agent monkey-patch auto-installed on package import,
preventing accidental credential leakage to unrelated downstream servers.

Update docs to describe the two bearer token consumers (LLM provider
passthrough and opt-in downstream MCP forwarding) with a config example.
2026-05-03 17:17:50 -04:00
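
The opt-in behaviour of `forward_inbound_auth` can be pictured as a simple gate on outbound headers. This is a sketch under assumptions: the `ServerConfig` dataclass and `outbound_headers` function are hypothetical stand-ins, not the fast-agent or Pallas types. Only the flag name is taken from the commit.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ServerConfig:
    """Hypothetical stand-in for a downstream MCP server entry."""

    name: str
    forward_inbound_auth: bool = False  # off by default: forwarding is opt-in
    headers: dict[str, str] = field(default_factory=dict)


def outbound_headers(
    server: ServerConfig, inbound_bearer: Optional[str]
) -> dict[str, str]:
    """Build headers for one outbound call without mutating the config."""
    headers = dict(server.headers)
    # Only servers that opted in ever see the inbound token.
    if server.forward_inbound_auth and inbound_bearer:
        headers["Authorization"] = f"Bearer {inbound_bearer}"
    return headers


trusted = ServerConfig("mnemosyne", forward_inbound_auth=True)
unrelated = ServerConfig("other", headers={"X-Client": "pallas"})

trusted_headers = outbound_headers(trusted, "tok-123")
unrelated_headers = outbound_headers(unrelated, "tok-123")
```

Defaulting the flag to `False` means a newly added downstream server receives no credentials unless its configuration asks for them.
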
0cea5ece3a feat: add /healthz and /metrics endpoints, replace print with logging
- Add /healthz endpoint returning LLM provider validation status
- Add /metrics endpoint serving Prometheus metrics via prometheus_client
- Replace all print() calls in health.py with proper logging module
- Remove _PREFIX variable in favor of structured logger context
2026-04-10 11:22:26 +00:00
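
Replacing `print()` and a string prefix with structured logger context might look like the sketch below. Assumptions: the logger name `pallas.health`, the `report_provider_status` helper, and the payload shape are hypothetical and are not taken from `health.py`. The example uses only the stdlib `logging` module.

```python
import logging

logger = logging.getLogger("pallas.health")  # logger name is an assumption


def report_provider_status(provider: str, ok: bool) -> dict:
    """Log a validation result and return a /healthz-style payload."""
    level = logging.INFO if ok else logging.WARNING
    # Structured context travels in `extra` instead of a "[prefix]" string.
    logger.log(
        level,
        "provider validation finished",
        extra={"provider": provider, "ok": ok},
    )
    return {"status": "ok" if ok else "degraded", "provider": provider}


class ListHandler(logging.Handler):
    """Collects records in memory so the example can inspect them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


capture = ListHandler()
logger.addHandler(capture)
logger.setLevel(logging.INFO)

payload = report_provider_status("example-provider", ok=False)
```

Fields passed through `extra` become attributes on the log record, so a JSON formatter or a filter can use them without parsing the message text.
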