diff --git a/docs/iolaus.md b/docs/iolaus.md
index 61b60f7..91deaa3 100644
--- a/docs/iolaus.md
+++ b/docs/iolaus.md
@@ -95,7 +95,7 @@ Committed to the repo. Contains LLM provider settings and explicit model capabil
 declarations.
 
 ```yaml
-default_model: openai.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
+default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
 
 model_capabilities:
   vision: false
@@ -249,6 +249,7 @@ sudo systemctl status iolaus
 - **Python 3.13** required (`fast-agent-mcp` pins `>=3.13`)
 - **Runtime:** [Pallas](https://git.helu.ca/r/pallas) — `pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git`
 - **Transport:** StreamableHTTP (`/mcp`) throughout — not SSE
-- **LLM:** OpenAI-compatible API at `http://nyx.helu.ca:22079/v1` (personal Qwen model)
+- **LLM:** Local Qwen via fast-agent's Generic (OpenAI-compatible) provider at
+  `http://nyx.helu.ca:22079/v1`
 - **Logging:** Console output — stdout → syslog → Alloy → Loki in production
 - **Port scheme:** registry at 24000, personal agents 24001–24049, sub-agents 24050–24099
diff --git a/docs/kottos.md b/docs/kottos.md
index 957a761..f344b0c 100644
--- a/docs/kottos.md
+++ b/docs/kottos.md
@@ -89,7 +89,7 @@ In Ansible-managed deployments this file is replaced by the
 for model, MCP URLs, etc.
 
 ```yaml
-default_model: openai.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
+default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
 
 model_capabilities:
   vision: false
@@ -199,8 +199,11 @@ kottos_scotty_port: 24102
 kottos_research_port: 24150
 kottos_tech_research_port: 24151
 pallas_log_level: INFO
-kottos_default_model: "openai.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"
-kottos_openai_base_url: "http://nyx.helu.ca:22079/v1"
+# Local Qwen served via fast-agent's Generic (OpenAI-compatible) provider.
+# The openai_base_url slot is reserved for cloud OpenAI endpoints (e.g.
+# Bedrock Mantle Chat Completions).
+kottos_default_model: "generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"
+kottos_generic_base_url: "http://nyx.helu.ca:22079/v1"
 # ...plus one entry per downstream MCP URL so each environment overrides freely
 ```
 
@@ -274,6 +277,7 @@ See [logging.md](logging.md) for the full label schema + level policy + add-a-ne
 - **Python 3.13** required (`fast-agent-mcp` pins `>=3.13`)
 - **Runtime:** [Pallas](https://git.helu.ca/r/pallas) — `pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git`
 - **Transport:** StreamableHTTP (`/mcp`) throughout — not SSE
-- **LLM:** OpenAI-compatible API at `http://nyx.helu.ca:22079/v1` (personal Qwen model)
+- **LLM:** Local Qwen via fast-agent's Generic (OpenAI-compatible) provider at
+  `http://nyx.helu.ca:22079/v1`
 - **Logging:** Console output — stdout → syslog → Alloy → Loki in production
 - **Port scheme:** registry at 24100, agents 24101–24149, sub-agents 24150–24199