pallas

Author	SHA1	Message	Date
Robert Helewka	f92f3fc710	feat(log): add service/project/component labels for Loki Inject stable JSON fields (`service`, `project`, `component`) on every log record via a `logging.Filter` so Alloy-forwarded logs share the same label shape as Mnemosyne/Daedalus. The `project` label is set once from `agents.yaml`, and `component` is stored in a ContextVar so each agent's asyncio task carries its own value without leaking across sibling agents. Also switch the opt-in console sink from `sys.__stderr__` to `sys.__stdout__` (`PALLAS_LOG_STDOUT=1`) for cleaner journald capture in systemd deployments.	2026-05-11 13:52:25 -04:00
Robert Helewka	2759c8428e	refactor: remove forward_inbound_auth, add traceback capture patches Retire the per-turn bearer-token forwarding mechanism in favor of transparent authentication via operator-configured headers in fastagent.secrets.yaml. Agents now rely on long-lived team JWTs configured per downstream MCP server. Replace the token-forwarding patches with debug-only traceback-capture wrappers around three opaque fast-agent catch-sites that previously flattened exceptions to bare strings, making downstream transport errors diagnosable. Update README with authentication guidance and deprecation notice for the retired `forward_inbound_auth: true` flag (now silently ignored).	2026-05-10 14:46:39 -04:00
Robert Helewka	49da024877	docs: add Incidents & Lessons Learned section for Pallas<->Mnemosyne saga Capture the five debugging chapters from the bearer-forwarding rollout so the knowledge doesn't live only in chat history: 1. Per-request bearer across anyio.TaskGroup boundary (ContextVar snapshot semantics, httpx auth-header caching on persistent connections, forward_inbound_auth pydantic-drop workaround). 2. install() idempotency guard shadowing three newly-added monkey-patches — each patch now owns its own sentinel. 3. FastMCP on_call_tool context shape: context.message.name, not context.message.params.name. Extractor returning None silently killed the _PUBLIC_TOOLS bypass and downstream dispatch "await None(...)" produced the terse 'object NoneType can't be used in await expression' string that blocked Harper<->Mnemosyne. 4. Rich-TUI corruption by DEBUG openai/sse_starlette/mcp via root logger inheriting logger.level=debug + our stderr StreamHandler. Fixed by PALLAS_LOG_STDERR gate and PALLAS_ROOT_LOG_LEVEL split. 5. Current state table of PALLAS_LOG_* knobs + jq tail recipe. Also add pallas.log and pallas._fastagent_patch to the Module Reference table.	2026-05-07 06:32:24 -04:00
Robert Helewka	89870f4bdc	log: decouple root level from Pallas level; dial noisy libs to WARNING When Pallas runs with logger.level=debug (as Kottos does during the bearer-forwarding shakedown), setting the root logger to DEBUG opens the floodgates for every third-party library to emit DEBUG records. openai._base_client, sse_starlette.sse, mcp, and anthropic each log one line per HTTP request / SSE chunk / JSON-RPC frame; with fast-agent's logger.type=console handler attached to root those lines splatter into the Rich TUI and make the chat unusable. Split the two knobs: * PALLAS_LOG_LEVEL (or fastagent.config logger.level) — drives the pallas.* loggers + file sink. Unchanged. * Root logger level — defaults to the higher of (level, INFO) so third-party DEBUG never bleeds through by default. PALLAS_ROOT_LOG_LEVEL overrides for operators who genuinely want everything at DEBUG. Also extend the noisy-logger list so openai/anthropic/sse_starlette/mcp are individually pinned at WARNING regardless of root — belt-and-braces for the common case.	2026-05-06 20:34:28 -04:00
Robert Helewka	dde7d4fa30	log: gate stderr handler behind PALLAS_LOG_STDERR so fast-agent TUI is usable The stderr StreamHandler, even using sys.__stderr__ captured before Rich installed its Live display, still corrupts the fast-agent TUI in interactive 'fast-agent go' sessions: Rich redraws on top of our JSON log lines, but they leak through on every repaint, making the interface effectively unusable. Keep the RotatingFileHandler as the always-on durable capture (that's what survives fast-agent's progress_display takeover and what we rely on for diagnostics). Gate the stderr sink behind PALLAS_LOG_STDERR=1 for operators who explicitly want journal/terminal capture on a systemd-managed deployment.	2026-05-06 20:06:17 -04:00
Robert Helewka	082b6111ae	install(): apply trace-capture patches even on reinstall The previous install() short-circuited at the top when _prepare_headers_and_auth was already wrapped, which left the newly added _patch_send_request / _patch_session_call_tool / _patch_execute_on_server helpers unexecuted on any reinstall. That explained why the trace-capture INFO lines never appeared in pallas.log despite the installed _fastagent_patch.py carrying the new code. Restructure install() so the bearer-forwarding block owns its own idempotency guard inline, while the three _patch_* helpers are always invoked — each already has its own 'already patched' guard on the target method, so redundant calls are free and harmless.	2026-05-06 19:28:40 -04:00
Robert Helewka	273b96b370	Add call_tool & _execute_on_server traceback-capture monkeypatches The send_request wrapper (`56a1cd0`) never fires in pallas.log for the NoneType-await failures, proving the offending await lives above send_request in the call stack. Install two additional wrappers to triangulate: * MCPAgentClientSession.call_tool — catches failures in the session's override (meta merge, params, send_request invocation itself, ...). * MCPAggregator._execute_on_server — catches the broadest surface: get_server, session factory, permission check, tracer span, progress callback, try_execute wrapper. Both emit logger.exception(...) with full stack before re-raising; control flow is otherwise untouched. Removable once the offending frame is identified from the resulting traceback.	2026-05-06 18:54:44 -04:00
Robert Helewka	56a1cd0a6c	forward: capture send_request tracebacks before fast-agent drops them fast-agent's MCPAgentClientSession.send_request catches every downstream transport exception, logs the one-line 'send_request failed: <str(e)>' WITHOUT exc_info=True, then re-raises. The exception then propagates up to the agent loop where its message is serialised as the tool result string ('object NoneType can't be used in an await expression' being the canonical symptom) and the traceback is lost forever. Wrap send_request so Pallas emits logger.exception() with the full stack against the 'pallas.forward.trace' logger before re-raising. No behavioural change — we re-raise the same exception; we just get one extra log record with the frames attached, which pallas.log now preserves thanks to the _JSONFormatter traceback field. This will surface the real origin of the NoneType-await that's currently being served as Harper's mnemosyne tool result even though Mnemosyne itself returns 200 OK.	2026-05-06 06:11:00 -04:00
Robert Helewka	ac4af942ab	log: route third-party + traceback records through the pallas log The existing setup only attached the file/stderr handlers to the 'pallas' namespace, so every record emitted by fast-agent, fastmcp, the MCP SDK, Anthropic, uvicorn etc. disappeared into Rich's progress display and never hit pallas.log. When one of those libraries raised and logged 'something failed' via logger.error(..., exc_info=True), we ended up grepping a Rich-overwritten TTY for a traceback that was already long gone -- exactly the situation blocking the current Mnemosyne debug. This patch: * Extends _JSONFormatter to serialise exc_info/stack_info as a 'traceback' field when present, so Loki/grep sees the full stack. * Attaches the same file+stderr handlers to the root logger so every library's records (and any uncaught logger.error tracebacks) land in pallas.log with the stack attached. * Keeps the 'pallas' logger's own handlers (propagate=False) so our records are unaffected by any later root-handler manipulation. * Tags our handlers with _pallas_attached so repeated setup_logging() calls are idempotent -- important because uvicorn workers and fast-agent subagent subprocesses each reinitialise logging. httpx/httpcore stay at WARNING so we don't flood the log with per- request body traces on a DEBUG deployment. Demote third-party namespaces further in a follow-up if needed.	2026-05-05 22:46:14 -04:00
Robert Helewka	66b5dd7bdd	forward: override auth_flow (generic) instead of async_auth_flow The async_auth_flow override was being driven via 'await' in httpx's async dispatcher, which yielded 'NoneType is not awaitable' because a plain generator yielding a Request doesn't produce an awaitable. httpx.Auth has three hooks: sync_auth_flow, async_auth_flow, and the generic auth_flow. The default sync/async implementations delegate to auth_flow when subclasses override only that one, which is exactly the behaviour we want: one plain-generator implementation shared across sync and async clients. Override auth_flow, drop sync/async overrides.	2026-05-05 21:04:45 -04:00
Robert Helewka	f634cc55d8	forward: use httpx.Auth so per-turn bearer survives persistent MCP connections The previous static-header approach only ran at handshake time, and persistent MCP connections reuse the open socket for every subsequent tools/call. The first startup probe had no bearer, so every later tool call inherited an empty Authorization header — Mnemosyne saw no credentials and returned 'Authentication required'. Fix: swap the static header for a _DynamicBearerAuth(httpx.Auth) that httpx consults per-request via async_auth_flow. We look up the current _pending_bearers entry for this server_config and stamp Authorization on each outgoing request individually — no stale caching, no handshake/tool-call skew. Verified chain now runs: bearer.captured (inbound) forward.published (registry key) forward.bound (auth object installed at connect time) forward.applied (stamped per request via async_auth_flow)	2026-05-05 20:57:06 -04:00
Robert Helewka	711f54395d	forward: scan YAML directly for forward_inbound_auth opt-ins Root-cause: fast-agent's Settings(**merged_settings) validation pipeline silently drops unknown keys on nested MCPServerSettings instances — even after flipping extra='allow' and calling model_rebuild(force=True). The culprit is Settings(nested_model_default_partial_update=True) which takes a model_construct path that discards model_extra on the nested model. Verified live: MCPServerSettings.model_validate({'forward_inbound_auth': True}) preserves the field (model_extra={'forward_inbound_auth': True}), but get_settings().mcp.servers['mnemosyne'] returns an instance where the attribute is MISSING and model_extra is None. Fix: parse fastagent.config.yaml ourselves at patch-install time and record the set of opted-in server names in _FORWARD_SERVERS. The patch and multimodal_server's forwardable-config resolver both key off the server name — stable, authoritative, and completely sidesteps Pydantic's extras handling.	2026-05-05 14:59:01 -04:00
Robert Helewka	541b59b4e3	log: add RotatingFileHandler so DEBUG output survives Rich's console takeover fast-agent's progress_display installs a Rich Live renderer on stdout/stderr; plain StreamHandler records get swallowed mid-render, making the bearer- forwarding DEBUG logs invisible on the console. Route every pallas.* record to two sinks: 1. ~/.local/state/pallas/pallas.log (rotating, 10MiB x5) — durable capture regardless of who owns the TTY. Overridable via PALLAS_LOG_FILE. 2. sys.__stderr__ — the original stderr FD captured before Rich could grab it, so records still reach the TTY / journal when DEBUG is on. Avoids /tmp deliberately: systemd PrivateTmp=yes made /tmp/pallas-bearer.log invisible during the original debug saga.	2026-05-05 14:32:27 -04:00
Robert Helewka	7932c72660	log: honour fastagent.config.yaml logger.level so one knob controls Pallas + fast-agent	2026-05-05 14:25:41 -04:00
Robert Helewka	679a809f66	Fix bearer forwarding across anyio TaskGroup boundary The Mnemosyne Authorization: Bearer token was being dropped on outbound MCP calls because fast-agent runs downstream transports inside a long-lived anyio TaskGroup whose context is snapshotted at manager startup — request_bearer_token.get() inside _prepare_headers_and_auth therefore always resolved to None even when the request handler had just set it. Fix: * pallas/_fastagent_patch.py - add _pending_bearers registry keyed by id(server_config) with a threading.Lock; publish_bearer / revoke_bearer helpers. - patched _prepare_headers_and_auth reads the registry first, falls back to the ContextVar for non-persistent probe paths. - emit INFO log on install() so the journal shows the patch ran; verbose flow logs at DEBUG on pallas.forward. * pallas/multimodal_server.py - send_message resolves the agent's opted-in downstreams, publishes the inbound bearer for each, and revokes them all in the finally. - bearer/header diagnostics go to pallas.auth (DEBUG) instead of /tmp/pallas-bearer.log which is invisible under systemd PrivateTmp. * pallas/log.py - honour PALLAS_LOG_LEVEL env var (default INFO) so operators can flip the forward/auth diagnostics on without a code change. * docs/pallas.md, docs/mnemosyne_integration.md - document the registry-based forwarding and the task-group ContextVar constraint that forced it.	2026-05-05 12:09:51 -04:00
Robert Helewka	24c7374f3d	chore(diagnostics): switch bearer token logging to file-based diag log Replace stdlib logger calls for inbound bearer token capture and forward decisions with a `_diag_write` helper that appends to `/tmp/pallas-bearer.log`. This ensures diagnostic output is reliably captured regardless of logger configuration, while swallowing any write errors to avoid impacting request handling.	2026-05-05 06:51:13 -04:00
Robert Helewka	0435f97706	chore(logging): use stdlib logger with plain format strings for auth forwarding	2026-05-04 21:50:57 -04:00
Robert Helewka	68b486d62a	chore(logging): add diagnostic logs for inbound auth forwarding Add info-level logging to trace bearer token capture and forwarding through fastagent, including token length/prefix and reasons for skipping forward (existing user auth, oauth, or missing inbound token). Also log warnings on bearer extraction errors instead of silently swallowing exceptions.	2026-05-04 21:20:49 -04:00
Robert Helewka	e7f1e044b7	fix(pallas): read bearer token from raw Authorization header get_access_token() requires FastMCP auth middleware to populate AuthenticatedUser in the request scope — Pallas runs without auth middleware so it always returned None. Read the Authorization header directly from the ASGI request instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 18:18:50 -04:00
Robert Helewka	705b4f8cbe	Docs: update	2026-05-04 15:34:51 -04:00
Robert Helewka	be71709608	feat(pallas): add opt-in bearer token forwarding to downstream MCP servers Introduce per-server `forward_inbound_auth` flag that controls whether the inbound MCP bearer token is propagated to outbound MCP transport calls. Implemented as a fast-agent monkey-patch auto-installed on package import, preventing accidental credential leakage to unrelated downstream servers. Update docs to describe the two bearer token consumers (LLM provider passthrough and opt-in downstream MCP forwarding) with a config example.	2026-05-03 17:17:50 -04:00
Robert Helewka	95fa6e6fc0	feat!: stateless per-request agents; add history + conversation_id to send_message Make Pallas truly stateless per the 'Pallas is ephemeral' contract. BREAKING (behavioural, not API): * instance_scope changes from 'shared' to 'request' in pallas.server. Each MCP tools/call now acquires a freshly-created fast-agent instance via the existing create_instance / dispose_instance factories and disposes it immediately after the response. With 'shared' mode: * Every MCP caller saw the same agent.message_history, so different Daedalus conversations leaked into each other. * Mid-chat context was silently truncated once the model window filled. * Restarting the Pallas process wiped all in-flight conversation state, even though Daedalus had it persisted in Postgres. With 'request' mode the Pallas process holds no per-conversation state; the caller (Daedalus) owns history and reseeds it on every turn. send_message gains two optional arguments: * history: list[{role, content, images?}] in chronological order, converted to PromptMessageExtended and seeded onto the fresh instance's message_history before agent.send(). * conversation_id: opaque string, logged for trace correlation only — Pallas never interprets or persists it. Malformed history entries (bad role, missing image data/mime_type, etc.) are skipped with a warning rather than raising, so a single bad row cannot wipe a whole conversation. The {agent}_history MCP prompt is still registered under 'request' scope for backward compatibility but always returns []; history lives on the client. Version bumped to 0.2.0.	2026-04-27 08:16:59 -04:00
Robert Helewka	a5b4650dff	feat(log): suppress successful MCP access logs in health filter Extend `_HealthAccessFilter` to also drop uvicorn access log lines for successful `POST /mcp` requests, in addition to the existing `/live`, `/ready`, and `/metrics` health probes. Why: Every Daedalus health poll and tool call hits the single `/mcp` route. Pallas already emits structured `mcp_request_start` / `mcp_request_complete` logs at the agent layer, making the uvicorn access line pure duplication and noise in syslog. How: - Replace the simple substring list `_HEALTH_PATHS` with compiled regex patterns (`_HEALTH_PATH_RE`, `_MCP_RE`) for more precise path matching - Add `_SUCCESS_STATUS_RE` to only suppress 1xx/2xx/3xx responses; non-successful responses (4xx, 5xx) still pass through as real signals - Update docstring to document the new suppression rules clearly	2026-04-18 07:59:23 -04:00
Robert Helewka	c18a477cda	feat: replace MCPToolProgressManager with EnrichedMCPToolProgressManager Swap out the standard `MCPToolProgressManager` from fast-agent with the local `EnrichedMCPToolProgressManager` from `pallas.progress` to provide richer progress reporting during tool execution in the multimodal MCP server.	2026-04-18 06:02:47 -04:00
Robert Helewka	065ce0b0dd	feat: support per-agent model and capabilities overrides in agents.yaml Add optional `model` and `model_capabilities` fields to agent definitions in agents.yaml, allowing each agent to target a different model/provider with its own capability parameters (vision, context_window, etc.). - Refactor `_build_agents_table` to return rich dicts instead of tuples - Extract `_register_one_model` from `_register_unknown_models` for reuse - Register per-agent models in addition to the global default_model, falling back to top-level model_capabilities when agent-specific ones are not provided - Override `AgentConfig.model` at startup when an agent declares a model - Thread deployment_config through `_preflight` and `_start_agent`	2026-04-15 13:50:20 -04:00
Robert Helewka	35cc2143b1	Update Red Panda Standards Doc	2026-04-10 14:01:25 +00:00
Robert Helewka	0cea5ece3a	feat: add /healthz and /metrics endpoints, replace print with logging - Add /healthz endpoint returning LLM provider validation status - Add /metrics endpoint serving Prometheus metrics via prometheus_client - Replace all print() calls in health.py with proper logging module - Remove _PREFIX variable in favor of structured logger context	2026-04-10 11:22:26 +00:00
Robert Helewka	9092afb532	Initial commit: pallas package extracted from mentor	2026-04-02 12:41:53 +00:00

28 Commits
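
The level split described in commit 89870f4bdc can be illustrated with a small sketch. The environment variable names, the "higher of (level, INFO)" default for the root logger, and the four noisy logger names come from the commit message. The function names are hypothetical, and letting `PALLAS_LOG_LEVEL` take precedence over the configured `logger.level` is an assumption, since the log does not state the order.

```python
import logging
import os

# Libraries the commit pins at WARNING regardless of the root level.
NOISY_LOGGERS = ("openai", "anthropic", "sse_starlette", "mcp")


def _parse_level(name, default):
    """Translate a level name such as 'debug' into its numeric value."""
    if not name:
        return default
    value = logging.getLevelName(str(name).upper())
    # getLevelName returns a string like 'Level FOO' for unknown names.
    return value if isinstance(value, int) else default


def resolve_levels(config_level, env=None):
    """Return (pallas_level, root_level) as numeric logging levels."""
    env = os.environ if env is None else env
    # Assumption: the environment variable wins over the config file value.
    pallas_level = _parse_level(env.get("PALLAS_LOG_LEVEL") or config_level,
                                logging.INFO)
    # Root defaults to the higher of (pallas_level, INFO), so third-party
    # DEBUG records stay out unless an operator explicitly opts in.
    root_default = max(pallas_level, logging.INFO)
    root_level = _parse_level(env.get("PALLAS_ROOT_LOG_LEVEL"), root_default)
    return pallas_level, root_level


def apply_levels(pallas_level, root_level):
    """Apply the resolved levels to the live logger hierarchy."""
    logging.getLogger("pallas").setLevel(pallas_level)
    logging.getLogger().setLevel(root_level)
    for name in NOISY_LOGGERS:
        logging.getLogger(name).setLevel(logging.WARNING)
```

With `logger.level=debug` and no override, `resolve_levels("debug", env={})` gives DEBUG for the `pallas` namespace and INFO for root. Setting `PALLAS_ROOT_LOG_LEVEL=DEBUG` lowers root to DEBUG as well.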