Move llama-cpp to generic fastagent slot

2026-05-12 15:07:00 -04:00
parent 8c95173705
commit b2fc398782
2 changed files with 11 additions and 6 deletions

@@ -95,7 +95,7 @@ Committed to the repo. Contains LLM provider settings and explicit model capabil
 declarations.
 ```yaml
-default_model: openai.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
+default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
 model_capabilities:
   vision: false
@@ -249,6 +249,7 @@ sudo systemctl status iolaus
 - **Python 3.13** required (`fast-agent-mcp` pins `>=3.13`)
 - **Runtime:** [Pallas](https://git.helu.ca/r/pallas) — `pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git`
 - **Transport:** StreamableHTTP (`/mcp`) throughout — not SSE
-- **LLM:** OpenAI-compatible API at `http://nyx.helu.ca:22079/v1` (personal Qwen model)
+- **LLM:** Local Qwen via fast-agent's Generic (OpenAI-compatible) provider at
+  `http://nyx.helu.ca:22079/v1`
 - **Logging:** Console output — stdout → syslog → Alloy → Loki in production
 - **Port scheme:** registry at 24000, personal agents 24001–24049, sub-agents 24050–24099

@@ -89,7 +89,7 @@ In Ansible-managed deployments this file is replaced by the
 for model, MCP URLs, etc.
 ```yaml
-default_model: openai.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
+default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
 model_capabilities:
   vision: false
@@ -199,8 +199,11 @@ kottos_scotty_port: 24102
 kottos_research_port: 24150
 kottos_tech_research_port: 24151
 pallas_log_level: INFO
-kottos_default_model: "openai.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"
-kottos_openai_base_url: "http://nyx.helu.ca:22079/v1"
+# Local Qwen served via fast-agent's Generic (OpenAI-compatible) provider.
+# The openai_base_url slot is reserved for cloud OpenAI endpoints (e.g.
+# Bedrock Mantle Chat Completions).
+kottos_default_model: "generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"
+kottos_generic_base_url: "http://nyx.helu.ca:22079/v1"
 # ...plus one entry per downstream MCP URL so each environment overrides freely
 ```
@@ -274,6 +277,7 @@ See [logging.md](logging.md) for the full label schema + level policy + add-a-ne
 - **Python 3.13** required (`fast-agent-mcp` pins `>=3.13`)
 - **Runtime:** [Pallas](https://git.helu.ca/r/pallas) — `pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git`
 - **Transport:** StreamableHTTP (`/mcp`) throughout — not SSE
-- **LLM:** OpenAI-compatible API at `http://nyx.helu.ca:22079/v1` (personal Qwen model)
+- **LLM:** Local Qwen via fast-agent's Generic (OpenAI-compatible) provider at
+  `http://nyx.helu.ca:22079/v1`
 - **Logging:** Console output — stdout → syslog → Alloy → Loki in production
 - **Port scheme:** registry at 24100, agents 24101–24149, sub-agents 24150–24199
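
For readers wiring this up by hand, the renamed variables could map onto a rendered fast-agent config roughly as follows. This is a sketch, not the repository's actual template, which is not shown in this commit. The `generic:` block with `api_key` and `base_url` follows fast-agent's Generic provider settings as I understand them; the placeholder key value, and the assumption that `kottos_generic_base_url` feeds `base_url`, are illustrative guesses.

```yaml
# Hypothetical rendered config fragment (layout assumed, not taken from the repo).
default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf

generic:
  # Assumption: the local llama-cpp server does not validate the key,
  # so any placeholder string is used here.
  api_key: "not-needed"
  # Assumed to be filled from kottos_generic_base_url in Ansible deployments.
  base_url: "http://nyx.helu.ca:22079/v1"

# Left unset here: per the comment in this commit, the openai base URL slot
# is reserved for cloud OpenAI endpoints.
```

Under this layout, the `generic.` prefix on `default_model` is what would route requests to the `generic:` block rather than to any `openai:` settings.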