refactor: remove forward_inbound_auth, add traceback capture patches

Retire the per-turn bearer-token forwarding mechanism in favor of transparent authentication via operator-configured headers in fastagent.secrets.yaml. Agents now rely on long-lived team JWTs configured per downstream MCP server. Replace the token-forwarding patches with debug-only traceback-capture wrappers around three opaque fast-agent catch-sites that previously flattened exceptions to bare strings, making downstream transport errors diagnosable. Update README with authentication guidance and deprecation notice for the retired `forward_inbound_auth: true` flag (now silently ignored).
2026-05-10 14:46:39 -04:00
parent 49da024877
commit 2759c8428e
4 changed files with 94 additions and 490 deletions
--- a/README.md
+++ b/README.md
@@ -110,5 +110,43 @@ with fast-agent's `ModelDatabase`.
 |---|---|
 | `pallas.server` | CLI entry point and agent orchestration |
 | `pallas.registry` | `GET /.well-known/mcp/server.json` registry server |
-| `pallas.multimodal_server` | `MultimodalAgentMCPServer` — `AgentMCPServer` subclass with image support |
+| `pallas.multimodal_server` | `MultimodalAgentMCPServer` — `AgentMCPServer` subclass with image + history support |
 | `pallas.health` | LLM preflight validation + `get_health` MCP tool |
+| `pallas._fastagent_patch` | Traceback-capture wrappers around three opaque fast-agent catch-sites (debug-only) |
+
+---
+
+## Authentication
+
+Pallas is **transparent** to downstream authentication. Whatever the operator
+places under each downstream MCP server's `headers:` block in
+`fastagent.config.yaml` (typically loaded from `fastagent.secrets.yaml`) is what
+fast-agent sends — Pallas does not intercept, rewrite, or forward the inbound
+`Authorization` header of the MCP request that triggered the agent turn.
+
+For agents that talk to Mnemosyne, the convention is a long-lived team JWT
+minted from Mnemosyne's admin UI and pasted into the agent project's
+`fastagent.secrets.yaml`:
+
+```yaml
+mcp:
+  servers:
+    mnemosyne:
+      transport: http
+      url: https://mnemosyne.example.com/mcp/
+      headers:
+        Authorization: "Bearer eyJ…team-jwt…"
+```
+
+See
+[`mnemosyne/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md`](https://git.helu.ca/r/mnemosyne/src/branch/main/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md)
+for the three credential types Mnemosyne recognises, how team JWTs are
+minted and rotated, and the data model that ties a team to a set of
+libraries.
+
+> Earlier versions of Pallas shipped a `forward_inbound_auth: true`
+> mechanism that captured the per-turn `Authorization` header and
+> propagated it to opted-in downstream servers. That mechanism has been
+> retired — opt-in flags in old `fastagent.config.yaml` files are now
+> silently ignored and can be removed at your convenience.
+
--- a/pallas/_fastagent_patch.py
+++ b/pallas/_fastagent_patch.py
@@ -1,349 +1,50 @@
-"""Forward the inbound bearer token to opted-in downstream MCP servers.
+"""fast-agent runtime patches — traceback capture on three opaque catch-sites.

-fast-agent (≤0.6.19) captures the inbound ``Authorization: Bearer <X>`` into
-the ``request_bearer_token`` ContextVar, but does NOT propagate that value to
-outgoing MCP transport calls — ``_prepare_headers_and_auth`` only reads
-``server_config.headers``.  This module patches ``_prepare_headers_and_auth``
-so a downstream server marked ``forward_inbound_auth: true`` in
-``fastagent.config.yaml`` receives the same bearer the FastAgent itself was
-called with.
+fast-agent's transport layer catches every downstream-transport exception at
+several nesting levels, logs only ``str(exc)`` (no ``exc_info=True``), and
+re-raises.  By the time the exception surfaces to the MCP tool result, the
+traceback has been flattened to a bare string — the canonical symptom being
+``"object NoneType can't be used in 'await' expression"`` with no stack
+attached.  This module wraps three of those catch-sites so Pallas emits
+``logger.exception(...)`` with the full frame before fast-agent's swallowing
+``except`` runs.  Behaviour is otherwise unchanged: every wrapper re-raises
+the exception it caught.

-Opt-in is per-server because a FastAgent with multiple downstream MCP
-attachments (e.g. Mnemosyne + a public weather server) must not leak its
-credentials to every endpoint.
+The three wrapped entry points are:

-Why the simple "read from ``request_bearer_token``" approach does NOT work
----------------------------------------------------------------------------
-``MCPConnectionManager.launch_server`` spawns the server's transport task in
-``self._tg`` — a long-lived ``anyio.TaskGroup`` created at manager startup.
-``TaskGroup.start_soon`` copies the owning task's ``contextvars.Context`` at
-spawn time, which is the *startup* context, not the per-request context.
-The transport-preparation code therefore sees ``request_bearer_token.get()``
-as ``None`` even when the MCP request handler has just ``set`` it a few
-frames up.  Worse, ``launch_server`` runs once per downstream and the
-persistent connection is reused, so the very first request's (often
-empty) context is cached forever.
+1. ``MCPAgentClientSession.send_request`` — the lowest-level send call;
+2. ``MCPAgentClientSession.call_tool`` — the session-side wrapper around
+   meta merge, permission handling, progress callback factory, and the
+   send_request invocation itself;
+3. ``MCPAggregator._execute_on_server`` — the aggregator's setup around
+   the client call (server lookup, session factory, tracer span,
+   ``try_execute`` harness).

-The fix is to hand the bearer through the only object that *is* shared
-between the two tasks: the ``MCPServerSettings`` instance that both paths
-pass into ``_prepare_headers_and_auth``.  ``pallas.multimodal_server``
-registers the inbound bearer against ``id(server_config)`` in a
-process-wide registry for the duration of each MCP request; this patch
-reads it there and forges an ``Authorization`` header onto the outgoing
-transport.  Cleanup is guaranteed in the request handler's ``finally``.
+Any one wrapper being triggered while the other two stay silent pinpoints
+which frame is swallowing the exception, which is how we debug opaque
+transport failures.

-TODO: drop after the equivalent change lands in fast-agent upstream.
+Historical note: this file used to also carry the bearer-forwarding patch
+that propagated inbound ``Authorization`` headers to opted-in downstream
+MCP servers.  That mechanism was retired once Mnemosyne moved to static
+team JWTs carried in ``fastagent.config.yaml`` ``headers:`` entries — see
+``mnemosyne/docs/DAEDALUS_PALLAS_INTEGRATION_v1.md``.  Pallas is now
+transparent to auth: whatever the operator places in each downstream
+server's ``headers.Authorization`` is what fast-agent sends, full stop.
 """

 from __future__ import annotations

 import logging
-import os
-import threading
-from pathlib import Path
-from typing import Any

-import httpx
-
-from fast_agent.mcp import mcp_connection_manager as _mcm
-from fast_agent.mcp import mcp_agent_client_session as _macs
 from fast_agent.mcp import mcp_aggregator as _magg
-from fast_agent.mcp.auth.context import request_bearer_token
+from fast_agent.mcp import mcp_agent_client_session as _macs

 logger = logging.getLogger("pallas.forward")
 _trace_logger = logging.getLogger("pallas.forward.trace")


-class _DynamicBearerAuth(httpx.Auth):
-    """Per-request ``Authorization`` injection for persistent MCP connections.
-
-    fast-agent's ``create_mcp_http_client(headers=..., auth=...)`` snapshots
-    the ``headers`` dict at client construction time — every subsequent
-    ``tools/call`` reuses the *same* open connection, so a static
-    ``Authorization`` header set at handshake is the only one the downstream
-    server ever sees.  For workspace-scoped forwarding that's fatal: the
-    first request (often a startup probe) has no bearer, and every later
-    request that *does* carry a bearer inherits the probe's empty header.
-
-    httpx's ``auth`` parameter, however, is consulted on **every** outgoing
-    request via ``Auth.sync_auth_flow`` / ``async_auth_flow``.  We use that
-    to look up the current ``_pending_bearers`` entry for ``server_config``
-    and stamp ``Authorization`` onto each request individually — no stale
-    caching, no handshake/tool-call skew.
-    """
-
-    # Per-connection-reuse so ``httpx.AsyncClient`` can share us across
-    # streams; the lookup is keyed by ``id(server_config)`` so different
-    # servers (even same-named clones) stay isolated.
-    requires_request_body = False
-    requires_response_body = False
-
-    def __init__(self, server_config: Any) -> None:
-        self._server_config = server_config
-        self._server_name = getattr(server_config, "name", "?")
-
-    def _current_token(self) -> str | None:
-        return _lookup_bearer(self._server_config)
-
-    def _inject(self, request: httpx.Request) -> None:
-        token = self._current_token()
-        if token:
-            request.headers["Authorization"] = f"Bearer {token}"
-            logger.debug(
-                "forward.applied server=%s token_len=%d prefix=%s via=auth_flow",
-                self._server_name,
-                len(token),
-                token[:8],
-            )
-        else:
-            logger.debug(
-                "forward.skipped server=%s reason=no_inbound_bearer via=auth_flow",
-                self._server_name,
-            )
-
-    def auth_flow(self, request: httpx.Request):
-        # Both ``sync_auth_flow`` and ``async_auth_flow`` on httpx.Auth
-        # delegate to ``auth_flow`` when subclasses override only the
-        # generic path, which is exactly what we want: one implementation
-        # that works for both sync and async clients.  httpx drives this
-        # as a *plain* generator (the async side resolves the yielded
-        # request via its own await machinery), so do NOT mark this
-        # ``async def`` — that triggers
-        # ``object NoneType can't be used in 'await' expression``.
-        self._inject(request)
-        yield request
-
-# ── Opt-in server names discovered from raw YAML ──────────────────────────────
-# Fast-agent's ``Settings(**merged)`` pipeline silently discards unknown keys
-# on nested ``MCPServerSettings`` instances — even with ``extra="allow"`` set
-# on the parent and the model rebuilt — because ``nested_model_default_partial_update``
-# takes a path through ``model_construct`` that drops ``model_extra``.
-#
-# Rather than fight pydantic's nested-model plumbing, we parse the YAML
-# directly ourselves at patch-install time and build a set of server names
-# that carry ``forward_inbound_auth: true``.  The patched
-# ``_prepare_headers_and_auth`` looks up the name (stable and authoritative
-# regardless of Pydantic gymnastics) instead of asking the config object.
-_FORWARD_SERVERS: set[str] = set()
-
-_original_prepare = _mcm._prepare_headers_and_auth
-
-# ── Per-request bearer registry ──────────────────────────────────────────────
-# Keyed by ``id(server_config)`` so a request handler can publish the bearer
-# that applies to each opted-in downstream server.  The registry survives the
-# context-var-loss hop across anyio task groups because ``id()`` is stable and
-# the config object itself is held by fast-agent's ServerRegistry.
-#
-# A threading.Lock (not asyncio) is used because both the publishing side
-# (request handler) and the reading side (``_prepare_headers_and_auth``, run
-# inside the connection manager's task group) may execute on different anyio
-# worker threads under uvicorn's default thread-portal setup.  Access is
-# microsecond-scoped — no contention concerns.
-_pending_bearers: dict[int, str] = {}
-_pending_lock = threading.Lock()
-
-
-def publish_bearer(server_config: Any, token: str) -> None:
-    """Register ``token`` as the inbound bearer to forward to this server.
-
-    Called by ``pallas.multimodal_server.send_message`` for every downstream
-    whose config carries ``forward_inbound_auth: true``.  Must be paired with
-    ``revoke_bearer`` in the same ``try/finally``.
-    """
-    if not token:
-        return
-    with _pending_lock:
-        _pending_bearers[id(server_config)] = token
-    logger.debug(
-        "forward.published server=%s token_len=%d prefix=%s",
-        getattr(server_config, "name", "?"),
-        len(token),
-        token[:8],
-    )
-
-
-def revoke_bearer(server_config: Any) -> None:
-    """Clear any bearer previously published for ``server_config``.
-
-    Always safe to call — a missing key is silently ignored, so request
-    handlers can ``finally: revoke_bearer(cfg)`` without pre-checks.
-    """
-    with _pending_lock:
-        _pending_bearers.pop(id(server_config), None)
-    logger.debug(
-        "forward.revoked server=%s",
-        getattr(server_config, "name", "?"),
-    )
-
-
-def _lookup_bearer(server_config: Any) -> str | None:
-    """Resolve the bearer to forward for ``server_config``.
-
-    Tries the per-request registry first (works across task groups) and
-    falls back to the ContextVar for cases where the caller lives in the
-    same task (e.g. fast-agent's own non-persistent probe path).
-    """
-    with _pending_lock:
-        token = _pending_bearers.get(id(server_config))
-    if token:
-        return token
-    try:
-        return request_bearer_token.get()
-    except LookupError:
-        return None
-
-
-def _prepare_headers_and_auth_with_forward(server_config, **kwargs):
-    headers, oauth_auth, user_auth_keys = _original_prepare(server_config, **kwargs)
-
-    server_name = getattr(server_config, "name", None) or "?"
-    forward_flag = server_name in _FORWARD_SERVERS
-
-    logger.debug(
-        "forward.check server=%s forward_inbound_auth=%s",
-        server_name,
-        forward_flag,
-    )
-
-    if not forward_flag:
-        return headers, oauth_auth, user_auth_keys
-
-    if user_auth_keys:
-        logger.debug(
-            "forward.skipped server=%s reason=user_auth_present keys=%s",
-            server_name,
-            sorted(user_auth_keys),
-        )
-        return headers, oauth_auth, user_auth_keys
-
-    if oauth_auth is not None:
-        logger.debug(
-            "forward.skipped server=%s reason=oauth_active",
-            server_name,
-        )
-        return headers, oauth_auth, user_auth_keys
-
-    # Install a dynamic ``httpx.Auth`` instead of baking a static header into
-    # the returned ``headers`` dict.  Fast-agent passes the auth object to
-    # ``create_mcp_http_client(auth=...)`` which forwards to
-    # ``httpx.AsyncClient(auth=...)``; httpx then consults it on every
-    # outgoing request via ``async_auth_flow``, reading the *current*
-    # ``_pending_bearers`` entry.
-    #
-    # This dodges the fatal "first handshake wins forever" problem:
-    # persistent MCP connections reuse the open socket across hundreds of
-    # tool-call requests, but the auth flow re-runs per request, so we can
-    # stamp the correct per-turn bearer onto each ``tools/call`` even though
-    # the initial ``initialize`` ran with no bearer at startup.
-    #
-    # We also report it through ``user_auth_keys`` so OAuth scrubbing (see
-    # ``_prepare_headers_and_auth`` upstream) treats Authorization as
-    # caller-owned and doesn't try to kick off an OAuth flow.
-    auth = _DynamicBearerAuth(server_config)
-    user_auth_keys = set(user_auth_keys) | {"Authorization"}
-    logger.debug(
-        "forward.bound server=%s auth=%s",
-        server_name,
-        type(auth).__name__,
-    )
-    # Current token may or may not be set — we don't require one at bind
-    # time because the auth flow will resolve it per-request; logging a
-    # preview when available helps trace the startup probe path.
-    inbound = _lookup_bearer(server_config)
-    if inbound:
-        logger.debug(
-            "forward.applied server=%s token_len=%d prefix=%s via=bind",
-            server_name,
-            len(inbound),
-            inbound[:8],
-        )
-    return headers, auth, user_auth_keys
-
-
-def _candidate_config_paths() -> list[Path]:
-    """Paths to scan for ``fastagent.config.yaml``.
-
-    Order matters: the first existing file wins.  We mirror fast-agent's
-    ``find_config`` discovery rule (cwd then ancestors) and additionally
-    honour a ``FASTAGENT_CONFIG_PATH`` override so tests / ansible-managed
-    deployments can point Pallas at a specific file.
-    """
-    override = os.environ.get("FASTAGENT_CONFIG_PATH")
-    if override:
-        return [Path(override).expanduser()]
-
-    paths: list[Path] = []
-    cwd = Path.cwd()
-    for p in (cwd, *cwd.parents):
-        paths.append(p / "fastagent.config.yaml")
-    return paths
-
-
-def _refresh_forward_servers() -> None:
-    """Populate ``_FORWARD_SERVERS`` from the raw YAML config.
-
-    Parses the YAML ourselves (bypassing fast-agent's ``Settings`` pipeline)
-    because nested pydantic validation silently drops unknown keys on
-    ``MCPServerSettings`` — so by the time we'd see the config object,
-    ``forward_inbound_auth`` is gone.
-
-    Called both at ``install()`` time and lazily from
-    ``_prepare_headers_and_auth_with_forward`` so hot-reloaded configs or
-    late-bound working directories still work.  Failure is non-fatal: we
-    simply log and leave ``_FORWARD_SERVERS`` unchanged.
-    """
-    try:
-        import yaml
-    except ImportError:
-        logger.warning("pyyaml not available; cannot scan forward_inbound_auth opt-ins")
-        return
-
-    for path in _candidate_config_paths():
-        if not path.exists():
-            continue
-        try:
-            with open(path) as fh:
-                data = yaml.safe_load(fh) or {}
-        except Exception as exc:
-            logger.warning("forward.config_parse_failed path=%s error=%s", path, exc)
-            continue
-
-        servers = (data.get("mcp") or {}).get("servers") or {}
-        if not isinstance(servers, dict):
-            return
-
-        names: set[str] = set()
-        for server_name, server_cfg in servers.items():
-            if not isinstance(server_cfg, dict):
-                continue
-            if server_cfg.get("forward_inbound_auth"):
-                names.add(server_name)
-
-        if names != _FORWARD_SERVERS:
-            _FORWARD_SERVERS.clear()
-            _FORWARD_SERVERS.update(names)
-            logger.info(
-                "forward.opt_in servers=%s source=%s",
-                sorted(_FORWARD_SERVERS),
-                path,
-            )
-        return
-
-    logger.debug("forward.no_config_found searched=%s", _candidate_config_paths())
-
-
 # ── send_request traceback capture ───────────────────────────────────────────
-# fast-agent's ``MCPAgentClientSession.send_request`` catches every
-# downstream-transport exception, logs ``"send_request failed: <str(e)>"``
-# *without* ``exc_info=True``, and re-raises — which means the exception
-# propagates up to the agent loop where it is serialised as a tool result
-# string (``"object NoneType can't be used in 'await' expression"`` is the
-# canonical symptom) with no traceback anywhere.
-#
-# We wrap ``send_request`` so Pallas can emit ``logger.exception(...)`` with
-# the full stack before re-raising.  The original logger still fires its
-# one-line summary; our wrapper adds the frames next to it in pallas.log.
-# No behavioural change — we re-raise the same exception.
 _original_send_request = _macs.MCPAgentClientSession.send_request


@@ -373,30 +74,7 @@ def _patch_send_request() -> None:
    logger.info("send_request traceback-capture patch installed")


-# ── call_tool / _execute_on_server traceback capture ─────────────────────────
-# The "object NoneType can't be used in 'await' expression" error surfaces
-# via ``EnrichedMCPToolProgressManager.on_tool_complete`` (message=error),
-# which is driven by ``MCPAggregator`` at line 2287 catching a generic
-# ``Exception`` and passing ``str(e)`` downstream.  The ``send_request``
-# wrapper above proved — by its silence — that the exception is NOT raised
-# inside ``send_request``, so the failing ``await X()`` (where X returns
-# ``None``) must live in one of the frames between:
-#   * ``MCPAgentClientSession.call_tool``                 (override, ~985)
-#   * ``MCPAggregator._execute_on_server.try_execute``    (~1612)
-#   * anything between call_tool and send_request (``_merge_experimental_session_meta``,
-#     the permission handler, the progress-callback factory, span creation, …)
-#
-# We install two outer wrappers to triangulate:
-#   1. ``MCPAgentClientSession.call_tool`` — catches anything raised in the
-#      session's override (meta merge, params construction, send_request invocation
-#      itself, ...);
-#   2. ``MCPAggregator._execute_on_server`` — catches everything the aggregator
-#      sets up around the client call (get_server, session factory, permission
-#      check, tracer span, progress callback, ``try_execute`` itself).
-#
-# Both emit ``logger.exception(...)`` (full stack) before re-raising; the
-# original control flow is untouched.  Once the offending frame is identified
-# from the resulting traceback, these wrappers can be removed.
+# ── call_tool traceback capture ──────────────────────────────────────────────
 _original_session_call_tool = _macs.MCPAgentClientSession.call_tool


@@ -425,6 +103,7 @@ def _patch_session_call_tool() -> None:
    logger.info("session.call_tool traceback-capture patch installed")


+# ── _execute_on_server traceback capture ─────────────────────────────────────
 _original_execute_on_server = _magg.MCPAggregator._execute_on_server


@@ -464,25 +143,13 @@ def _patch_execute_on_server() -> None:


 def install() -> None:
-    # NOTE: we do NOT short-circuit on "already patched" at the top of this
-    # function — each individual ``_patch_*`` helper owns its own idempotency
-    # guard, and we want all three trace-capture patches to be applied even
-    # when the bearer-forwarding patch was installed in a previous reload.
-    # Previously a top-level guard on ``_prepare_headers_and_auth`` would
-    # return immediately on a reinstall, leaving the trace wrappers missing
-    # silently — which is exactly the failure we chased.
-    if not getattr(
-        _mcm._prepare_headers_and_auth, "_pallas_forward_patched", False
-    ):
-        _refresh_forward_servers()
-        _prepare_headers_and_auth_with_forward._pallas_forward_patched = True  # type: ignore[attr-defined]
-        _mcm._prepare_headers_and_auth = _prepare_headers_and_auth_with_forward
-        # INFO so it always appears in the journal at boot — greppable proof
-        # that the patch ran before any agent started.
-        logger.info(
-            "bearer-forwarding patch installed "
-            "(forward_inbound_auth-aware _prepare_headers_and_auth)"
-        )
+    """Install all three trace-capture wrappers.
+
+    Each ``_patch_*`` helper is individually idempotent (guarded on a
+    ``_pallas_trace_patched`` attribute), so ``install()`` is safe to call
+    repeatedly — e.g. from ``pallas/__init__.py`` on import + again from
+    a test harness — without stacking wrappers.
+    """
    _patch_send_request()
    _patch_session_call_tool()
    _patch_execute_on_server()
--- a/pallas/log.py
+++ b/pallas/log.py
@@ -34,7 +34,9 @@ class _JSONFormatter(logging.Formatter):
    ``traceback`` field.  Without this, every ``logger.error`` in
    fast-agent / fastmcp / the MCP SDK loses its stack trace and we end
    up guessing from the single-line message — which is exactly the
-    rabbit hole we spent hours in during the bearer-forwarding debug.
+    rabbit hole we spent hours in chasing opaque MCP transport failures
+    (see ``pallas._fastagent_patch`` for the trace-capture wrappers that
+    grew out of the same debugging session).

    Also pulls in ``record.exc_text`` if the formatter was already
    populated upstream (e.g. by another handler), avoiding duplicate
@@ -126,7 +128,7 @@ def _resolve_log_file() -> Path:
    The parent directory is created lazily (``mkdir -p``) so a fresh host
    doesn't need any prep work.  We avoid ``/tmp`` because systemd's
    ``PrivateTmp=yes`` makes it per-unit-invisible — learned the hard way
-    during the bearer-forwarding debug saga.
+    during the MCP-transport debug saga.
    """
    override = os.environ.get("PALLAS_LOG_FILE")
    if override:
@@ -143,13 +145,13 @@ def setup_logging() -> None:

      1. ``PALLAS_LOG_LEVEL`` environment variable — explicit override.
      2. ``logger.level`` in ``fastagent.config.yaml`` — unified control knob
-         so bumping fast-agent's level also flips on Pallas's bearer-forwarding
+         so bumping fast-agent's level also flips on Pallas's own
         diagnostics.
      3. ``INFO``.

-    ``DEBUG`` unlocks diagnostics on the ``pallas.forward`` and
-    ``pallas.auth`` loggers — essential when troubleshooting Mnemosyne /
-    workspace-scoped agent calls.
+    ``DEBUG`` unlocks traceback capture on the ``pallas.forward.trace``
+    logger (see ``pallas._fastagent_patch``) — essential when
+    troubleshooting opaque MCP transport failures.

    Scope of handlers:

--- a/pallas/multimodal_server.py
+++ b/pallas/multimodal_server.py
@@ -19,17 +19,14 @@ fast-agent instance whose ``message_history`` is seeded from the caller's
 memory, no restart amnesia.
 """

-import logging
 import time
 from typing import Any

 import fast_agent.core.prompt
 from fast_agent.core.logging.logger import get_logger
-from fast_agent.mcp.auth.context import request_bearer_token
 from fast_agent.mcp.server import AgentMCPServer
 from fast_agent.types import PromptMessageExtended, RequestParams

-from pallas._fastagent_patch import _FORWARD_SERVERS, publish_bearer, revoke_bearer
 from pallas.progress import EnrichedMCPToolProgressManager
 from fastmcp import Context as MCPContext
 from fastmcp.prompts import Message
@@ -39,79 +36,6 @@ from starlette.responses import JSONResponse, Response

 logger = get_logger(__name__)

-# Separate stdlib logger for bearer-token diagnostics — routed through
-# ``pallas.log`` JSON handler to stdout / systemd journal.  Gated at DEBUG so
-# it is off by default in production but trivially flipped on via
-# ``PALLAS_LOG_LEVEL=DEBUG`` for troubleshooting agent auth issues.
-_auth_log = logging.getLogger("pallas.auth")
-
-
-def _get_request_bearer_token() -> str | None:
-    """Return the raw bearer token from the current MCP request's Authorization header.
-
-    Reads the header directly rather than going through get_access_token() because
-    Pallas runs without FastMCP auth middleware — there is no AuthenticatedUser in
-    the request scope, so get_access_token() always returns None here. The token
-    is an opaque string forwarded to opted-in downstream servers by
-    ``pallas._fastagent_patch``.
-    """
-    try:
-        from fastmcp.server.dependencies import get_http_request
-
-        request = get_http_request()
-        auth = request.headers.get("authorization", "")
-        if auth.lower().startswith("bearer "):
-            token = auth[7:]
-            _auth_log.debug(
-                "bearer.captured len=%d prefix=%s", len(token), token[:8]
-            )
-            return token
-        _auth_log.debug("bearer.absent has_auth=%s", bool(auth))
-    except Exception as exc:
-        _auth_log.debug("bearer.error %s", exc)
-    return None
-
-
-def _forwardable_server_configs(agent) -> list[Any]:
-    """Return the ``MCPServerSettings`` objects the agent is entitled to and
-    which are listed in ``_FORWARD_SERVERS`` (opt-in via
-    ``forward_inbound_auth: true`` in ``fastagent.config.yaml``).
-
-    Restricting to the intersection of (agent-attached-servers ∩
-    opted-in-servers) ensures a request never publishes its bearer against
-    a server the calling agent does not use — e.g. a Harper→Mnemosyne call
-    must not flag Scotty→Mnemosyne's config.
-
-    We read the opt-in set from ``pallas._fastagent_patch._FORWARD_SERVERS``
-    rather than a pydantic attribute on the config object because fast-agent's
-    ``Settings(**merged)`` validation silently drops unknown keys on nested
-    ``MCPServerSettings`` instances (see ``_fastagent_patch._refresh_forward_servers``
-    for the gory details).
-
-    Safe to call before the agent is constructed: returns an empty list if
-    any attribute lookup fails.
-    """
-    try:
-        agent_servers = set(getattr(agent.config, "servers", []) or [])
-        if not agent_servers:
-            return []
-        opt_in = agent_servers & _FORWARD_SERVERS
-        if not opt_in:
-            return []
-        registry = agent.context.server_registry
-        if registry is None:
-            return []
-        configs: list[Any] = []
-        for name in opt_in:
-            cfg = registry.registry.get(name)
-            if cfg is not None:
-                configs.append(cfg)
-        return configs
-    except Exception as exc:
-        _auth_log.debug("bearer.registry_lookup_failed %s", exc)
-        return []
-
-

 def _history_to_fastmcp_messages(
    message_history: list[PromptMessageExtended],
@@ -277,34 +201,15 @@ class MultimodalAgentMCPServer(AgentMCPServer):
                Optional opaque identifier, logged for trace correlation.
                Pallas does not interpret it.
            """
-            inbound_bearer = _get_request_bearer_token()
-            saved_token = request_bearer_token.set(inbound_bearer)
            report_progress = self._build_progress_reporter(ctx)
            request_params = RequestParams(
                tool_execution_handler=EnrichedMCPToolProgressManager(report_progress),
                emit_loop_progress=True,
            )
-            # Track which downstream server configs we publish the bearer
-            # against so the ``finally`` block below can revoke every one of
-            # them even if the agent send raises halfway through.
-            published_configs: list[Any] = []
-            try:
            instance = await self._acquire_instance(ctx)
            agent = instance.app[agent_name]
            agent_context = getattr(agent, "context", None)
-
-                # Register the inbound bearer against each downstream server
-                # config the agent is allowed to reach and which opts-in via
-                # ``forward_inbound_auth: true``.  This is how the bearer
-                # crosses the anyio task-group boundary that ContextVars
-                # cannot hop — see ``pallas._fastagent_patch`` for the
-                # full explanation.
-                if inbound_bearer:
-                    for srv_cfg in _forwardable_server_configs(agent):
-                        publish_bearer(srv_cfg, inbound_bearer)
-                        published_configs.append(srv_cfg)
-
-
+            try:
                # Seed the freshly-created instance's message_history from the
                # caller-supplied history so the agent sees the full
                # conversation the caller is tracking.  Safe no-op when the
@@ -359,7 +264,6 @@ class MultimodalAgentMCPServer(AgentMCPServer):
                    )
                    return response

-                try:
                if agent_context and ctx:
                    return await self.with_bridged_context(
                        agent_context, ctx, execute_send
@@ -367,13 +271,6 @@ class MultimodalAgentMCPServer(AgentMCPServer):
                return await execute_send()
            finally:
                await self._release_instance(ctx, instance)
-            finally:
-                # Always revoke every bearer we published, then restore the
-                # ContextVar — order matters only for tidiness; a revoke that
-                # finds nothing is a no-op.
-                for srv_cfg in published_configs:
-                    revoke_bearer(srv_cfg)
-                request_bearer_token.reset(saved_token)


        if self._instance_scope == "request":