feat!: stateless per-request agents; add history + conversation_id to send_message

Make Pallas truly stateless per the 'Pallas is ephemeral' contract.

BREAKING (behavioural, not API):

* instance_scope changes from 'shared' to 'request' in pallas.server.
  Each MCP tools/call now acquires a freshly-created fast-agent instance
  via the existing create_instance / dispose_instance factories and
  disposes it immediately after the response.

With 'shared' mode:

* Every MCP caller saw the same agent.message_history, so different
  Daedalus conversations leaked into each other.
* Mid-chat context was silently truncated once the model window filled.
* Restarting the Pallas process wiped all in-flight conversation state,
  even though Daedalus had it persisted in Postgres.

With 'request' mode the Pallas process holds no per-conversation state;
the caller (Daedalus) owns history and reseeds it on every turn.

send_message gains two optional arguments:

* history: list[{role, content, images?}] in chronological order,
  converted to PromptMessageExtended and seeded onto the fresh
  instance's message_history before agent.send().
* conversation_id: opaque string, logged for trace correlation only;
  Pallas never interprets or persists it.

Malformed history entries (bad role, missing image data/mime_type, etc.)
are skipped with a warning rather than raising, so a single bad row
cannot wipe a whole conversation.

The {agent}_history MCP prompt is still registered under 'request'
scope for backward compatibility but always returns []; history lives
on the client.

Version bumped to 0.2.0.
@@ -132,7 +132,56 @@ No authentication. No query parameters.

---

## 2. Health Tool
## 2. Conversation State & History (Daedalus-owned)

**Pallas is stateless.** As of version `0.2.0`, every MCP `tools/call` is
handled by a freshly-created fast-agent instance that is disposed immediately
after the response. The Pallas process holds **no per-conversation memory
between calls**. This is enforced by `instance_scope="request"` in
`pallas.server`; do not override it.

Conversation history is owned by the client (Daedalus). It must be replayed
on every turn through the `history` argument on `send_message`.
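
As an illustration of what replaying history on every turn might look like on the client side, here is a minimal sketch. The helper names `build_send_message_args` and `record_turn`, and the `conv-123` identifier, are hypothetical and not part of Pallas or Daedalus; only the argument names `message`, `history`, and `conversation_id` come from this document.

```python
from typing import Any, Optional


def build_send_message_args(
    message: str,
    history: list[dict[str, Any]],
    conversation_id: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble the arguments for one stateless send_message call (hypothetical helper)."""
    args: dict[str, Any] = {"message": message}
    if history:
        # Copy each turn so later mutation of the caller's list cannot alter the payload.
        args["history"] = [dict(turn) for turn in history]
    if conversation_id is not None:
        args["conversation_id"] = conversation_id
    return args


def record_turn(
    history: list[dict[str, Any]], user_message: str, assistant_reply: str
) -> None:
    """Append a completed exchange so the next call can replay it (hypothetical helper)."""
    history.append({"role": "user", "content": user_message})
    history.append({"role": "assistant", "content": assistant_reply})


# Two turns of one conversation, with all state held on the client.
stored_history: list[dict[str, Any]] = []
first = build_send_message_args("What is MCP?", stored_history, "conv-123")
record_turn(stored_history, "What is MCP?", "MCP is a protocol for tool access.")
second = build_send_message_args("Give an example.", stored_history, "conv-123")
```

The first call carries no `history` key at all, while the second carries both earlier messages in chronological order. Where the client persists `stored_history` is outside the scope of this sketch.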
### `send_message` Arguments

Each agent's MCP tool accepts:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `message` | `str` | yes | The new user turn as plain text. |
| `images` | `list[dict]` | no | Images attached to this turn only: `[{"data": base64, "mime_type": "image/png"}]`. Requires a vision-capable model. |
| `history` | `list[dict]` | no | Prior conversation history in chronological order. Entries have shape `{"role": "user" \| "assistant", "content": str, "images"?: [...]}`. When present, seeds the freshly-created agent's `message_history` *before* the new turn is executed. |
| `conversation_id` | `str` | no | Opaque identifier logged by Pallas for trace correlation. Pallas does not interpret or persist it. |
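
The commit message states that malformed `history` entries (bad role, missing image `data` or `mime_type`) are skipped with a warning rather than raising. The sketch below shows one way such filtering could work. It is not the Pallas implementation: the function name `sanitize_history` and the logger name are hypothetical, and dropping a whole entry when any of its images is malformed is an assumption.

```python
import logging
from typing import Any

logger = logging.getLogger("history_sketch")  # hypothetical logger name
VALID_ROLES = {"user", "assistant"}


def sanitize_history(history: list[Any]) -> list[dict[str, Any]]:
    """Return only the well-formed entries, logging a warning for each one skipped."""
    clean: list[dict[str, Any]] = []
    for index, entry in enumerate(history):
        if not isinstance(entry, dict):
            logger.warning("Skipping history[%d]: not a dict", index)
            continue
        role, content = entry.get("role"), entry.get("content")
        if role not in VALID_ROLES or not isinstance(content, str):
            logger.warning("Skipping history[%d]: bad role or content", index)
            continue
        images = entry.get("images") or []
        # Assumption: one malformed image invalidates the whole entry.
        if not isinstance(images, list) or not all(
            isinstance(img, dict) and img.get("data") and img.get("mime_type")
            for img in images
        ):
            logger.warning("Skipping history[%d]: malformed images", index)
            continue
        turn: dict[str, Any] = {"role": role, "content": content}
        if images:
            turn["images"] = images
        clean.append(turn)
    return clean


raw_history = [
    {"role": "user", "content": "Describe this picture."},
    {"role": "system", "content": "not an accepted role"},
    {"role": "assistant", "content": "Sure.", "images": [{"data": "QUJD"}]},
]
usable_history = sanitize_history(raw_history)
```

Skipping rather than raising means one bad stored row costs a single turn of context instead of failing the whole call.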
### Rationale

| Problem with shared state | Behaviour with `instance_scope="request"` |
|---------------------------|-------------------------------------------|
| Every caller sees the same `agent.message_history`, so different conversations leak into each other. | Each call gets a fresh, isolated instance. No cross-conversation bleed. |
| Process restart wipes all in-flight context. | There is no in-flight context to wipe; Daedalus reseeds history on the next turn. |
| Context-window trimming happens invisibly inside fast-agent. | Daedalus decides what history to send and how much, based on `capabilities.context_window` from the registry. |
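
The last row leaves the trimming policy to Daedalus. As a rough sketch of one possible policy, the code below keeps the newest turns that fit a token budget derived from the context window. The names `estimate_tokens`, `trim_history`, and `reserve_tokens` are hypothetical, the four-characters-per-token estimate is a crude assumption rather than a real tokenizer, and image cost is ignored.

```python
from typing import Any


def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly four characters per token.
    return max(1, len(text) // 4)


def trim_history(
    history: list[dict[str, Any]],
    context_window: int,
    reserve_tokens: int = 1024,
) -> list[dict[str, Any]]:
    """Keep the most recent turns whose estimated size fits the budget.

    reserve_tokens is headroom for the new message and the model's reply.
    """
    budget = context_window - reserve_tokens
    kept: list[dict[str, Any]] = []
    used = 0
    # Walk from newest to oldest so the latest context survives trimming.
    for turn in reversed(history):
        cost = estimate_tokens(turn["content"])
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

Because the cut happens on the client, dropped turns are an explicit decision that can be logged or summarised, which is the visibility the table contrasts with trimming inside fast-agent.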
### `{agent}_history` Prompt

Under `instance_scope="request"` the `{agent}_history` MCP prompt is still
registered for backward compatibility but always returns `[]`: history lives
on the client and there is no authoritative server-side copy. Existing
callers that invoke this prompt will not error, but should migrate to
tracking history client-side.

### Backward Compatibility

All new arguments are optional. A client that calls `send_message(message=...)`
with no `history` and no `conversation_id` gets a *zero-history* turn (the
agent sees only the current message). This is correct stateless behaviour;
the agent never inherits "the last conversation's context". Existing fast-agent
MCP clients that do not know about `history` will produce one-shot responses,
which is the appropriate and visible failure mode.

---

## 3. Health Tool

### MCP tool: `get_health`

@@ -239,7 +288,7 @@ The tool **must not** invoke the LLM. It should complete in under 1 second (3-se

---

## 3. Daedalus Consumption
## 4. Daedalus Consumption

### Registration Flow

@@ -270,7 +319,7 @@ The tool **must not** invoke the LLM. It should complete in under 1 second (3-se

---

## 4. Agent Progress Notifications
## 5. Agent Progress Notifications

Agent tool calls can take tens of seconds to minutes when the agent enters an agentic loop (calling sub-agents, searching the web, querying knowledge graphs, etc.). During this time, the MCP tool call has not yet returned. Without progress feedback, the user sees a dead spinner.

@@ -363,7 +412,7 @@ Progress messages follow predictable patterns:

---

## 5. Why MCP (Not REST)
## 6. Why MCP (Not REST)

Pallas wraps each FastAgent instance in a `MultimodalAgentMCPServer` and serves it over StreamableHTTP. The MCP transport gives Daedalus: