Commit Graph

28 Commits

Author SHA1 Message Date
f92f3fc710 feat(log): add service/project/component labels for Loki
Inject stable JSON fields (`service`, `project`, `component`) on every
log record via a `logging.Filter` so Alloy-forwarded logs share the
same label shape as Mnemosyne/Daedalus. The `project` label is set
once from `agents.yaml`, and `component` is stored in a ContextVar
so each agent's asyncio task carries its own value without leaking
across sibling agents.

Also switch the opt-in console sink from `sys.__stderr__` to
`sys.__stdout__` (`PALLAS_LOG_STDOUT=1`) for cleaner journald capture
in systemd deployments.
2026-05-11 13:52:25 -04:00
2759c8428e refactor: remove forward_inbound_auth, add traceback capture patches
Retire the per-turn bearer-token forwarding mechanism in favor of
transparent authentication via operator-configured headers in
fastagent.secrets.yaml. Agents now rely on long-lived team JWTs configured
per downstream MCP server.

Replace the token-forwarding patches with debug-only traceback-capture
wrappers around three opaque fast-agent catch-sites that previously
flattened exceptions to bare strings, making downstream transport errors
diagnosable.

Update README with authentication guidance and deprecation notice for
the retired `forward_inbound_auth: true` flag (now silently ignored).
2026-05-10 14:46:39 -04:00
49da024877 docs: add Incidents & Lessons Learned section for Pallas<->Mnemosyne saga
Capture the five debugging chapters from the bearer-forwarding rollout
so the knowledge doesn't live only in chat history:

  1. Per-request bearer across anyio.TaskGroup boundary (ContextVar
     snapshot semantics, httpx auth-header caching on persistent
     connections, forward_inbound_auth pydantic-drop workaround).
  2. install() idempotency guard shadowing three newly-added
     monkey-patches — each patch now owns its own sentinel.
  3. FastMCP on_call_tool context shape: context.message.name, not
     context.message.params.name.  Extractor returning None silently
     killed the _PUBLIC_TOOLS bypass and downstream dispatch "await
     None(...)" produced the terse 'object NoneType can't be used in
     await expression' string that blocked Harper<->Mnemosyne.
  4. Rich-TUI corruption by DEBUG openai/sse_starlette/mcp via root
     logger inheriting logger.level=debug + our stderr StreamHandler.
     Fixed by PALLAS_LOG_STDERR gate and PALLAS_ROOT_LOG_LEVEL split.
  5. Current state table of PALLAS_LOG_* knobs + jq tail recipe.

Also add pallas.log and pallas._fastagent_patch to the Module Reference
table.
2026-05-07 06:32:24 -04:00
89870f4bdc log: decouple root level from Pallas level; dial noisy libs to WARNING
When Pallas runs with logger.level=debug (as Kottos does during the
bearer-forwarding shakedown), setting the root logger to DEBUG opens
the floodgates for every third-party library to emit DEBUG records.
openai._base_client, sse_starlette.sse, mcp, and anthropic each log
one line per HTTP request / SSE chunk / JSON-RPC frame; with fast-agent's
logger.type=console handler attached to root those lines splatter into
the Rich TUI and make the chat unusable.

Split the two knobs:
  * PALLAS_LOG_LEVEL (or fastagent.config logger.level) — drives the
    pallas.* loggers + file sink.  Unchanged.
  * Root logger level — defaults to the higher of (level, INFO) so
    third-party DEBUG never bleeds through by default.  PALLAS_ROOT_LOG_LEVEL
    overrides for operators who genuinely want everything at DEBUG.

Also extend the noisy-logger list so openai/anthropic/sse_starlette/mcp
are individually pinned at WARNING regardless of root — belt-and-braces
for the common case.
2026-05-06 20:34:28 -04:00
dde7d4fa30 log: gate stderr handler behind PALLAS_LOG_STDERR so fast-agent TUI is usable
The stderr StreamHandler, even using sys.__stderr__ captured before
Rich installed its Live display, still corrupts the fast-agent TUI in
interactive 'fast-agent go' sessions — Rich redraws on top of our JSON
log lines but leaks through every repaint, making the interface
effectively unusable.

Keep the RotatingFileHandler as the always-on durable capture (that's
what survives fast-agent's progress_display takeover and what we rely
on for diagnostics).  Gate the stderr sink behind PALLAS_LOG_STDERR=1
for operators who explicitly want journal/terminal capture on a
systemd-managed deployment.
2026-05-06 20:06:17 -04:00
082b6111ae install(): apply trace-capture patches even on reinstall
The previous install() short-circuited at the top when
_prepare_headers_and_auth was already wrapped, which left the newly
added _patch_send_request / _patch_session_call_tool /
_patch_execute_on_server helpers unexecuted on any reinstall.  That
explained why the trace-capture INFO lines never appeared in
pallas.log despite the installed _fastagent_patch.py carrying the
new code.

Restructure install() so the bearer-forwarding block owns its own
idempotency guard inline, while the three _patch_* helpers are
always invoked — each already has its own 'already patched' guard
on the target method, so redundant calls are free and harmless.
2026-05-06 19:28:40 -04:00
273b96b370 Add call_tool & _execute_on_server traceback-capture monkeypatches
The send_request wrapper (56a1cd0) never fires in pallas.log for the
NoneType-await failures, proving the offending await lives above
send_request in the call stack. Install two additional wrappers to
triangulate:

  * MCPAgentClientSession.call_tool — catches failures in the session's
    override (meta merge, params, send_request invocation itself, ...).
  * MCPAggregator._execute_on_server — catches the broadest surface:
    get_server, session factory, permission check, tracer span,
    progress callback, try_execute wrapper.

Both emit logger.exception(...) with full stack before re-raising;
control flow is otherwise untouched.  Removable once the offending
frame is identified from the resulting traceback.
2026-05-06 18:54:44 -04:00
56a1cd0a6c forward: capture send_request tracebacks before fast-agent drops them
fast-agent's MCPAgentClientSession.send_request catches every downstream
transport exception, logs the one-line 'send_request failed: <str(e)>'
WITHOUT exc_info=True, then re-raises.  The exception then propagates
up to the agent loop where its message is serialised as the tool result
string ('object NoneType can't be used in an await expression' being
the canonical symptom) and the traceback is lost forever.

Wrap send_request so Pallas emits logger.exception() with the full
stack against the 'pallas.forward.trace' logger before re-raising.
No behavioural change — we re-raise the same exception; we just get
one extra log record with the frames attached, which pallas.log now
preserves thanks to the _JSONFormatter traceback field.

This will surface the real origin of the NoneType-await that's
currently being served as Harper's mnemosyne tool result even though
Mnemosyne itself returns 200 OK.
2026-05-06 06:11:00 -04:00
ac4af942ab log: route third-party + traceback records through the pallas log
The existing setup only attached the file/stderr handlers to the
'pallas' namespace, so every record emitted by fast-agent, fastmcp,
the MCP SDK, Anthropic, uvicorn etc. disappeared into Rich's progress
display and never hit pallas.log.  When one of those libraries raised
and logged 'something failed' via logger.error(..., exc_info=True),
we ended up grepping a Rich-overwritten TTY for a traceback that was
already long gone -- exactly the situation blocking the current
Mnemosyne debug.

This patch:

* Extends _JSONFormatter to serialise exc_info/stack_info as a
  'traceback' field when present, so Loki/grep sees the full stack.
* Attaches the same file+stderr handlers to the *root* logger so
  every library's records (and any uncaught logger.error tracebacks)
  land in pallas.log with the stack attached.
* Keeps the 'pallas' logger's own handlers (propagate=False) so our
  records are unaffected by any later root-handler manipulation.
* Tags our handlers with _pallas_attached so repeated setup_logging()
  calls are idempotent -- important because uvicorn workers and
  fast-agent subagent subprocesses each reinitialise logging.

httpx/httpcore stay at WARNING so we don't flood the log with per-
request body traces on a DEBUG deployment.  Demote third-party
namespaces further in a follow-up if needed.
2026-05-05 22:46:14 -04:00
66b5dd7bdd forward: override auth_flow (generic) instead of async_auth_flow
The async_auth_flow override was being driven via 'await' in httpx's
async dispatcher, which yielded 'NoneType is not awaitable' because a
plain generator yielding a Request doesn't produce an awaitable.

httpx.Auth has three hooks: sync_auth_flow, async_auth_flow, and the
generic auth_flow.  The default sync/async implementations delegate to
auth_flow when subclasses override only that one, which is exactly the
behaviour we want: one plain-generator implementation shared across
sync and async clients.  Override auth_flow, drop sync/async overrides.
2026-05-05 21:04:45 -04:00
f634cc55d8 forward: use httpx.Auth so per-turn bearer survives persistent MCP connections
The previous static-header approach only ran at handshake time, and
persistent MCP connections reuse the open socket for every subsequent
tools/call.  The first startup probe had no bearer, so every later
tool call inherited an empty Authorization header — Mnemosyne saw
no credentials and returned 'Authentication required'.

Fix: swap the static header for a _DynamicBearerAuth(httpx.Auth) that
httpx consults per-request via async_auth_flow.  We look up the current
_pending_bearers entry for this server_config and stamp Authorization
on each outgoing request individually — no stale caching, no
handshake/tool-call skew.

Verified chain now runs:
  bearer.captured  (inbound)
  forward.published (registry key)
  forward.bound     (auth object installed at connect time)
  forward.applied   (stamped per request via async_auth_flow)
2026-05-05 20:57:06 -04:00
711f54395d forward: scan YAML directly for forward_inbound_auth opt-ins
Root-cause:  fast-agent's Settings(**merged_settings) validation pipeline
silently drops unknown keys on nested MCPServerSettings instances — even
after flipping extra='allow' and calling model_rebuild(force=True).  The
culprit is Settings(nested_model_default_partial_update=True) which takes
a model_construct path that discards model_extra on the nested model.

Verified live: MCPServerSettings.model_validate({'forward_inbound_auth': True})
preserves the field (model_extra={'forward_inbound_auth': True}), but
get_settings().mcp.servers['mnemosyne'] returns an instance where the
attribute is MISSING and model_extra is None.

Fix: parse fastagent.config.yaml ourselves at patch-install time and
record the set of opted-in server names in _FORWARD_SERVERS.  The patch
and multimodal_server's forwardable-config resolver both key off the
server name — stable, authoritative, and completely sidesteps Pydantic's
extras handling.
2026-05-05 14:59:01 -04:00
541b59b4e3 log: add RotatingFileHandler so DEBUG output survives Rich's console takeover
fast-agent's progress_display installs a Rich Live renderer on stdout/stderr;
plain StreamHandler records get swallowed mid-render, making the bearer-
forwarding DEBUG logs invisible on the console.

Route every pallas.* record to two sinks:

  1. ~/.local/state/pallas/pallas.log (rotating, 10MiB x5) — durable capture
     regardless of who owns the TTY.  Overridable via PALLAS_LOG_FILE.
  2. sys.__stderr__ — the original stderr FD captured before Rich could grab
     it, so records still reach the TTY / journal when DEBUG is on.

Avoids /tmp deliberately: systemd PrivateTmp=yes made /tmp/pallas-bearer.log
invisible during the original debug saga.
2026-05-05 14:32:27 -04:00
7932c72660 log: honour fastagent.config.yaml logger.level so one knob controls Pallas + fast-agent 2026-05-05 14:25:41 -04:00
679a809f66 Fix bearer forwarding across anyio TaskGroup boundary
The Mnemosyne Authorization: Bearer token was being dropped on outbound MCP
calls because fast-agent runs downstream transports inside a long-lived
anyio TaskGroup whose context is snapshotted at manager startup —
request_bearer_token.get() inside _prepare_headers_and_auth therefore
always resolved to None even when the request handler had just set it.

Fix:
* pallas/_fastagent_patch.py
    - add _pending_bearers registry keyed by id(server_config) with a
      threading.Lock; publish_bearer / revoke_bearer helpers.
    - patched _prepare_headers_and_auth reads the registry first, falls
      back to the ContextVar for non-persistent probe paths.
    - emit INFO log on install() so the journal shows the patch ran;
      verbose flow logs at DEBUG on pallas.forward.

* pallas/multimodal_server.py
    - send_message resolves the agent's opted-in downstreams, publishes
      the inbound bearer for each, and revokes them all in the finally.
    - bearer/header diagnostics go to pallas.auth (DEBUG) instead of
      /tmp/pallas-bearer.log which is invisible under systemd PrivateTmp.

* pallas/log.py
    - honour PALLAS_LOG_LEVEL env var (default INFO) so operators can
      flip the forward/auth diagnostics on without a code change.

* docs/pallas.md, docs/mnemosyne_integration.md
    - document the registry-based forwarding and the task-group
      ContextVar constraint that forced it.
2026-05-05 12:09:51 -04:00
24c7374f3d chore(diagnostics): switch bearer token logging to file-based diag log
Replace stdlib logger calls for inbound bearer token capture and forward
decisions with a `_diag_write` helper that appends to
`/tmp/pallas-bearer.log`. This ensures diagnostic output is reliably
captured regardless of logger configuration, while swallowing any write
errors to avoid impacting request handling.
2026-05-05 06:51:13 -04:00
0435f97706 chore(logging): use stdlib logger with plain format strings for auth forwarding 2026-05-04 21:50:57 -04:00
68b486d62a chore(logging): add diagnostic logs for inbound auth forwarding
Add info-level logging to trace bearer token capture and forwarding
through fastagent, including token length/prefix and reasons for
skipping forward (existing user auth, oauth, or missing inbound token).
Also log warnings on bearer extraction errors instead of silently
swallowing exceptions.
2026-05-04 21:20:49 -04:00
e7f1e044b7 fix(pallas): read bearer token from raw Authorization header
get_access_token() requires FastMCP auth middleware to populate
AuthenticatedUser in the request scope — Pallas runs without auth
middleware so it always returned None. Read the Authorization header
directly from the ASGI request instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 18:18:50 -04:00
705b4f8cbe Docs: update 2026-05-04 15:34:51 -04:00
be71709608 feat(pallas): add opt-in bearer token forwarding to downstream MCP servers
Introduce per-server `forward_inbound_auth` flag that controls whether the
inbound MCP bearer token is propagated to outbound MCP transport calls.
Implemented as a fast-agent monkey-patch auto-installed on package import,
preventing accidental credential leakage to unrelated downstream servers.

Update docs to describe the two bearer token consumers (LLM provider
passthrough and opt-in downstream MCP forwarding) with a config example.
2026-05-03 17:17:50 -04:00
95fa6e6fc0 feat!: stateless per-request agents; add history + conversation_id to send_message
Make Pallas truly stateless per the 'Pallas is ephemeral' contract.

BREAKING (behavioural, not API):
  * instance_scope changes from 'shared' to 'request' in pallas.server.
    Each MCP tools/call now acquires a freshly-created fast-agent instance
    via the existing create_instance / dispose_instance factories and
    disposes it immediately after the response.

With 'shared' mode:
  * Every MCP caller saw the same agent.message_history, so different
    Daedalus conversations leaked into each other.
  * Mid-chat context was silently truncated once the model window filled.
  * Restarting the Pallas process wiped all in-flight conversation state,
    even though Daedalus had it persisted in Postgres.

With 'request' mode the Pallas process holds no per-conversation state;
the caller (Daedalus) owns history and reseeds it on every turn.

send_message gains two optional arguments:
  * history: list[{role, content, images?}] in chronological order,
    converted to PromptMessageExtended and seeded onto the fresh
    instance's message_history before agent.send().
  * conversation_id: opaque string, logged for trace correlation only —
    Pallas never interprets or persists it.

Malformed history entries (bad role, missing image data/mime_type, etc.)
are skipped with a warning rather than raising, so a single bad row
cannot wipe a whole conversation.

The {agent}_history MCP prompt is still registered under 'request'
scope for backward compatibility but always returns []; history lives
on the client.

Version bumped to 0.2.0.
2026-04-27 08:16:59 -04:00
a5b4650dff feat(log): suppress successful MCP access logs in health filter
Extend `_HealthAccessFilter` to also drop uvicorn access log lines for
successful `POST /mcp` requests, in addition to the existing
`/live`, `/ready`, and `/metrics` health probes.

**Why:** Every Daedalus health poll and tool call hits the single `/mcp`
route. Pallas already emits structured `mcp_request_start` /
`mcp_request_complete` logs at the agent layer, making the uvicorn
access line pure duplication and noise in syslog.

**How:**
- Replace the simple substring list `_HEALTH_PATHS` with compiled regex
  patterns (`_HEALTH_PATH_RE`, `_MCP_RE`) for more precise path matching
- Add `_SUCCESS_STATUS_RE` to only suppress 1xx/2xx/3xx responses;
  non-successful responses (4xx, 5xx) still pass through as real signals
- Update docstring to document the new suppression rules clearly
2026-04-18 07:59:23 -04:00
c18a477cda feat: replace MCPToolProgressManager with EnrichedMCPToolProgressManager
Swap out the standard `MCPToolProgressManager` from fast-agent with
the local `EnrichedMCPToolProgressManager` from `pallas.progress` to
provide richer progress reporting during tool execution in the
multimodal MCP server.
2026-04-18 06:02:47 -04:00
065ce0b0dd feat: support per-agent model and capabilities overrides in agents.yaml
Add optional `model` and `model_capabilities` fields to agent definitions
in agents.yaml, allowing each agent to target a different model/provider
with its own capability parameters (vision, context_window, etc.).

- Refactor `_build_agents_table` to return rich dicts instead of tuples
- Extract `_register_one_model` from `_register_unknown_models` for reuse
- Register per-agent models in addition to the global default_model,
  falling back to top-level model_capabilities when agent-specific ones
  are not provided
- Override `AgentConfig.model` at startup when an agent declares a model
- Thread deployment_config through `_preflight` and `_start_agent`
2026-04-15 13:50:20 -04:00
35cc2143b1 Update Red Panda Standards Doc 2026-04-10 14:01:25 +00:00
0cea5ece3a feat: add /healthz and /metrics endpoints, replace print with logging
- Add /healthz endpoint returning LLM provider validation status
- Add /metrics endpoint serving Prometheus metrics via prometheus_client
- Replace all print() calls in health.py with proper logging module
- Remove _PREFIX variable in favor of structured logger context
2026-04-10 11:22:26 +00:00
9092afb532 Initial commit: pallas package extracted from mentor 2026-04-02 12:41:53 +00:00