pallas

r/pallas

Author	SHA1	Message	Date
Robert Helewka	56a1cd0a6c	forward: capture send_request tracebacks before fast-agent drops them fast-agent's MCPAgentClientSession.send_request catches every downstream transport exception, logs the one-line 'send_request failed: <str(e)>' WITHOUT exc_info=True, then re-raises. The exception then propagates up to the agent loop where its message is serialised as the tool result string ('object NoneType can't be used in an await expression' being the canonical symptom) and the traceback is lost forever. Wrap send_request so Pallas emits logger.exception() with the full stack against the 'pallas.forward.trace' logger before re-raising. No behavioural change — we re-raise the same exception; we just get one extra log record with the frames attached, which pallas.log now preserves thanks to the _JSONFormatter traceback field. This will surface the real origin of the NoneType-await that's currently being served as Harper's mnemosyne tool result even though Mnemosyne itself returns 200 OK.	2026-05-06 06:11:00 -04:00
Robert Helewka	ac4af942ab	log: route third-party + traceback records through the pallas log The existing setup only attached the file/stderr handlers to the 'pallas' namespace, so every record emitted by fast-agent, fastmcp, the MCP SDK, Anthropic, uvicorn etc. disappeared into Rich's progress display and never hit pallas.log. When one of those libraries raised and logged 'something failed' via logger.error(..., exc_info=True), we ended up grepping a Rich-overwritten TTY for a traceback that was already long gone -- exactly the situation blocking the current Mnemosyne debug. This patch: * Extends _JSONFormatter to serialise exc_info/stack_info as a 'traceback' field when present, so Loki/grep sees the full stack. * Attaches the same file+stderr handlers to the root logger so every library's records (and any uncaught logger.error tracebacks) land in pallas.log with the stack attached. * Keeps the 'pallas' logger's own handlers (propagate=False) so our records are unaffected by any later root-handler manipulation. * Tags our handlers with _pallas_attached so repeated setup_logging() calls are idempotent -- important because uvicorn workers and fast-agent subagent subprocesses each reinitialise logging. httpx/httpcore stay at WARNING so we don't flood the log with per-request body traces on a DEBUG deployment. Demote third-party namespaces further in a follow-up if needed.	2026-05-05 22:46:14 -04:00
Robert Helewka	66b5dd7bdd	forward: override auth_flow (generic) instead of async_auth_flow The async_auth_flow override was being driven via 'await' in httpx's async dispatcher, which yielded 'NoneType is not awaitable' because a plain generator yielding a Request doesn't produce an awaitable. httpx.Auth has three hooks: sync_auth_flow, async_auth_flow, and the generic auth_flow. The default sync/async implementations delegate to auth_flow when subclasses override only that one, which is exactly the behaviour we want: one plain-generator implementation shared across sync and async clients. Override auth_flow, drop sync/async overrides.	2026-05-05 21:04:45 -04:00
Robert Helewka	f634cc55d8	forward: use httpx.Auth so per-turn bearer survives persistent MCP connections The previous static-header approach only ran at handshake time, and persistent MCP connections reuse the open socket for every subsequent tools/call. The first startup probe had no bearer, so every later tool call inherited an empty Authorization header, and Mnemosyne saw no credentials and returned 'Authentication required'. Fix: swap the static header for a _DynamicBearerAuth(httpx.Auth) that httpx consults per-request via async_auth_flow. We look up the current _pending_bearers entry for this server_config and stamp Authorization on each outgoing request individually: no stale caching, no handshake/tool-call skew. Verified chain now runs: bearer.captured (inbound) -> forward.published (registry key) -> forward.bound (auth object installed at connect time) -> forward.applied (stamped per request via async_auth_flow)	2026-05-05 20:57:06 -04:00
Robert Helewka	711f54395d	forward: scan YAML directly for forward_inbound_auth opt-ins Root-cause: fast-agent's Settings(**merged_settings) validation pipeline silently drops unknown keys on nested MCPServerSettings instances — even after flipping extra='allow' and calling model_rebuild(force=True). The culprit is Settings(nested_model_default_partial_update=True) which takes a model_construct path that discards model_extra on the nested model. Verified live: MCPServerSettings.model_validate({'forward_inbound_auth': True}) preserves the field (model_extra={'forward_inbound_auth': True}), but get_settings().mcp.servers['mnemosyne'] returns an instance where the attribute is MISSING and model_extra is None. Fix: parse fastagent.config.yaml ourselves at patch-install time and record the set of opted-in server names in _FORWARD_SERVERS. The patch and multimodal_server's forwardable-config resolver both key off the server name — stable, authoritative, and completely sidesteps Pydantic's extras handling.	2026-05-05 14:59:01 -04:00
Robert Helewka	541b59b4e3	log: add RotatingFileHandler so DEBUG output survives Rich's console takeover fast-agent's progress_display installs a Rich Live renderer on stdout/stderr; plain StreamHandler records get swallowed mid-render, making the bearer-forwarding DEBUG logs invisible on the console. Route every pallas.* record to two sinks: 1. ~/.local/state/pallas/pallas.log (rotating, 10MiB x5): durable capture regardless of who owns the TTY. Overridable via PALLAS_LOG_FILE. 2. sys.__stderr__: the original stderr FD captured before Rich could grab it, so records still reach the TTY / journal when DEBUG is on. Avoids /tmp deliberately: systemd PrivateTmp=yes made /tmp/pallas-bearer.log invisible during the original debug saga.	2026-05-05 14:32:27 -04:00
Robert Helewka	7932c72660	log: honour fastagent.config.yaml logger.level so one knob controls Pallas + fast-agent	2026-05-05 14:25:41 -04:00
Robert Helewka	679a809f66	Fix bearer forwarding across anyio TaskGroup boundary The Mnemosyne Authorization: Bearer token was being dropped on outbound MCP calls because fast-agent runs downstream transports inside a long-lived anyio TaskGroup whose context is snapshotted at manager startup — request_bearer_token.get() inside _prepare_headers_and_auth therefore always resolved to None even when the request handler had just set it. Fix: * pallas/_fastagent_patch.py - add _pending_bearers registry keyed by id(server_config) with a threading.Lock; publish_bearer / revoke_bearer helpers. - patched _prepare_headers_and_auth reads the registry first, falls back to the ContextVar for non-persistent probe paths. - emit INFO log on install() so the journal shows the patch ran; verbose flow logs at DEBUG on pallas.forward. * pallas/multimodal_server.py - send_message resolves the agent's opted-in downstreams, publishes the inbound bearer for each, and revokes them all in the finally. - bearer/header diagnostics go to pallas.auth (DEBUG) instead of /tmp/pallas-bearer.log which is invisible under systemd PrivateTmp. * pallas/log.py - honour PALLAS_LOG_LEVEL env var (default INFO) so operators can flip the forward/auth diagnostics on without a code change. * docs/pallas.md, docs/mnemosyne_integration.md - document the registry-based forwarding and the task-group ContextVar constraint that forced it.	2026-05-05 12:09:51 -04:00
Robert Helewka	24c7374f3d	chore(diagnostics): switch bearer token logging to file-based diag log Replace stdlib logger calls for inbound bearer token capture and forward decisions with a `_diag_write` helper that appends to `/tmp/pallas-bearer.log`. This ensures diagnostic output is reliably captured regardless of logger configuration, while swallowing any write errors to avoid impacting request handling.	2026-05-05 06:51:13 -04:00
Robert Helewka	0435f97706	chore(logging): use stdlib logger with plain format strings for auth forwarding	2026-05-04 21:50:57 -04:00
Robert Helewka	68b486d62a	chore(logging): add diagnostic logs for inbound auth forwarding Add info-level logging to trace bearer token capture and forwarding through fastagent, including token length/prefix and reasons for skipping forward (existing user auth, oauth, or missing inbound token). Also log warnings on bearer extraction errors instead of silently swallowing exceptions.	2026-05-04 21:20:49 -04:00
Robert Helewka	e7f1e044b7	fix(pallas): read bearer token from raw Authorization header get_access_token() requires FastMCP auth middleware to populate AuthenticatedUser in the request scope — Pallas runs without auth middleware so it always returned None. Read the Authorization header directly from the ASGI request instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 18:18:50 -04:00
Robert Helewka	705b4f8cbe	Docs: update	2026-05-04 15:34:51 -04:00
Robert Helewka	be71709608	feat(pallas): add opt-in bearer token forwarding to downstream MCP servers Introduce per-server `forward_inbound_auth` flag that controls whether the inbound MCP bearer token is propagated to outbound MCP transport calls. Implemented as a fast-agent monkey-patch auto-installed on package import, preventing accidental credential leakage to unrelated downstream servers. Update docs to describe the two bearer token consumers (LLM provider passthrough and opt-in downstream MCP forwarding) with a config example.	2026-05-03 17:17:50 -04:00
Robert Helewka	95fa6e6fc0	feat!: stateless per-request agents; add history + conversation_id to send_message Make Pallas truly stateless per the 'Pallas is ephemeral' contract. BREAKING (behavioural, not API): * instance_scope changes from 'shared' to 'request' in pallas.server. Each MCP tools/call now acquires a freshly-created fast-agent instance via the existing create_instance / dispose_instance factories and disposes it immediately after the response. With 'shared' mode: * Every MCP caller saw the same agent.message_history, so different Daedalus conversations leaked into each other. * Mid-chat context was silently truncated once the model window filled. * Restarting the Pallas process wiped all in-flight conversation state, even though Daedalus had it persisted in Postgres. With 'request' mode the Pallas process holds no per-conversation state; the caller (Daedalus) owns history and reseeds it on every turn. send_message gains two optional arguments: * history: list[{role, content, images?}] in chronological order, converted to PromptMessageExtended and seeded onto the fresh instance's message_history before agent.send(). * conversation_id: opaque string, logged for trace correlation only — Pallas never interprets or persists it. Malformed history entries (bad role, missing image data/mime_type, etc.) are skipped with a warning rather than raising, so a single bad row cannot wipe a whole conversation. The {agent}_history MCP prompt is still registered under 'request' scope for backward compatibility but always returns []; history lives on the client. Version bumped to 0.2.0.	2026-04-27 08:16:59 -04:00
Robert Helewka	a5b4650dff	feat(log): suppress successful MCP access logs in health filter Extend `_HealthAccessFilter` to also drop uvicorn access log lines for successful `POST /mcp` requests, in addition to the existing `/live`, `/ready`, and `/metrics` health probes. Why: Every Daedalus health poll and tool call hits the single `/mcp` route. Pallas already emits structured `mcp_request_start` / `mcp_request_complete` logs at the agent layer, making the uvicorn access line pure duplication and noise in syslog. How: - Replace the simple substring list `_HEALTH_PATHS` with compiled regex patterns (`_HEALTH_PATH_RE`, `_MCP_RE`) for more precise path matching - Add `_SUCCESS_STATUS_RE` to only suppress 1xx/2xx/3xx responses; non-successful responses (4xx, 5xx) still pass through as real signals - Update docstring to document the new suppression rules clearly	2026-04-18 07:59:23 -04:00
Robert Helewka	c18a477cda	feat: replace MCPToolProgressManager with EnrichedMCPToolProgressManager Swap out the standard `MCPToolProgressManager` from fast-agent with the local `EnrichedMCPToolProgressManager` from `pallas.progress` to provide richer progress reporting during tool execution in the multimodal MCP server.	2026-04-18 06:02:47 -04:00
Robert Helewka	065ce0b0dd	feat: support per-agent model and capabilities overrides in agents.yaml Add optional `model` and `model_capabilities` fields to agent definitions in agents.yaml, allowing each agent to target a different model/provider with its own capability parameters (vision, context_window, etc.). - Refactor `_build_agents_table` to return rich dicts instead of tuples - Extract `_register_one_model` from `_register_unknown_models` for reuse - Register per-agent models in addition to the global default_model, falling back to top-level model_capabilities when agent-specific ones are not provided - Override `AgentConfig.model` at startup when an agent declares a model - Thread deployment_config through `_preflight` and `_start_agent`	2026-04-15 13:50:20 -04:00
Robert Helewka	35cc2143b1	Update Red Panda Standards Doc	2026-04-10 14:01:25 +00:00
Robert Helewka	0cea5ece3a	feat: add /healthz and /metrics endpoints, replace print with logging - Add /healthz endpoint returning LLM provider validation status - Add /metrics endpoint serving Prometheus metrics via prometheus_client - Replace all print() calls in health.py with proper logging module - Remove _PREFIX variable in favor of structured logger context	2026-04-10 11:22:26 +00:00
Robert Helewka	9092afb532	Initial commit: pallas package extracted from mentor	2026-04-02 12:41:53 +00:00
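
The logging commits above (ac4af942ab and 56a1cd0a6c) describe two ideas: a JSON formatter that adds a 'traceback' field when a record carries exc_info or stack_info, and handlers tagged with _pallas_attached so that repeated setup calls stay idempotent. The following is a minimal standard-library sketch of those two ideas, not the code in pallas/log.py, which this log does not show. The names JSONFormatter, attach_once, demo and the logger name demo.forward.trace are hypothetical; only the _pallas_attached marker and the 'traceback' field name come from the commit message.

```python
import io
import json
import logging


class JSONFormatter(logging.Formatter):
    """Serialise a record as one JSON object; add 'traceback' when a stack is attached."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # exc_info is a (type, value, traceback) tuple set by logger.exception()
            payload["traceback"] = self.formatException(record.exc_info)
        elif record.stack_info:
            payload["traceback"] = self.formatStack(record.stack_info)
        return json.dumps(payload)


def attach_once(logger: logging.Logger, stream) -> logging.Handler:
    """Attach a tagged handler unless a tagged one is already present."""
    for handler in logger.handlers:
        if getattr(handler, "_pallas_attached", False):
            return handler  # already set up; do not add a duplicate
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JSONFormatter())
    handler._pallas_attached = True  # marker attribute named in the commit message
    logger.addHandler(handler)
    return handler


def demo() -> dict:
    """Log one exception and return what the handler captured."""
    stream = io.StringIO()
    logger = logging.getLogger("demo.forward.trace")  # hypothetical logger name
    logger.handlers.clear()  # start clean so the demo is repeatable
    logger.propagate = False
    attach_once(logger, stream)
    attach_once(logger, stream)  # second call must be a no-op
    try:
        raise TypeError("object NoneType can't be used in an await expression")
    except TypeError:
        # Same pattern as the send_request wrapper: log with frames, keep the exception.
        logger.exception("send_request failed")
    records = [json.loads(line) for line in stream.getvalue().splitlines()]
    return {"handler_count": len(logger.handlers), "records": records}
```

Calling demo() should yield a single handler and a single record whose 'traceback' field contains the TypeError frames. Because json.dumps escapes newlines, the whole stack stays on one log line, which suits grep or a log shipper.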

21 Commits
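
Commit 679a809f66 attributes the dropped bearer to a ContextVar read inside a long-lived task whose context was captured before the request handler set the value, and fixes it with a lock-protected registry keyed by id(server_config). The sketch below reproduces that effect under stated assumptions: it uses a plain asyncio task as a stand-in for the anyio TaskGroup, and lookup_bearer, transport_worker and main are hypothetical names. The names request_bearer_token, _pending_bearers, publish_bearer and revoke_bearer are taken from the commit message, but their real signatures are not shown in this log.

```python
import asyncio
import contextvars
import threading

request_bearer_token: contextvars.ContextVar = contextvars.ContextVar(
    "request_bearer_token", default=None
)

# Registry shared by all tasks and threads, keyed by the identity of the config object.
_pending_bearers: dict[int, str] = {}
_lock = threading.Lock()


def publish_bearer(server_config: object, token: str) -> None:
    with _lock:
        _pending_bearers[id(server_config)] = token


def revoke_bearer(server_config: object) -> None:
    with _lock:
        _pending_bearers.pop(id(server_config), None)


def lookup_bearer(server_config: object):
    """Registry first, then fall back to the ContextVar (hypothetical helper)."""
    with _lock:
        token = _pending_bearers.get(id(server_config))
    return token if token is not None else request_bearer_token.get()


async def transport_worker(server_config: object, go: asyncio.Event, seen: dict) -> None:
    await go.wait()
    # This task's context was copied when the task was created, before set() ran.
    seen["contextvar"] = request_bearer_token.get()
    seen["registry"] = lookup_bearer(server_config)


async def main() -> dict:
    server_config = object()  # stand-in for a server settings instance
    go = asyncio.Event()
    seen: dict = {}
    # The long-lived task starts first, like a transport started at manager startup.
    worker = asyncio.create_task(transport_worker(server_config, go, seen))
    await asyncio.sleep(0)

    # Later, a request handler receives a bearer and records it both ways.
    request_bearer_token.set("tok-123")
    publish_bearer(server_config, "tok-123")
    try:
        go.set()
        await worker
    finally:
        revoke_bearer(server_config)  # never leave a token behind after the turn
    seen["registry_empty_after_revoke"] = not _pending_bearers
    return seen
```

Running asyncio.run(main()) shows the worker reading None from the ContextVar while the registry lookup returns the published token. Keying by id() is only safe while the config object stays alive, since Python may reuse the id of a collected object; revoking in a finally block keeps the window short.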