chore(ansible): update model endpoints and enable Rommie deployment
- Bump Qwen model from 3.5 to 3.6 and update inference endpoints (nyx:22079→22072, pan:22078→22076) for caliban and puck hosts
- Add Rommie MCP server deployment to site.yml
- Update Rommie docs to reflect new port (20361), model versions, and health check accepting 200/406 status codes
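
The health check that accepts 200/406 could look roughly like the Ansible task below. This is only a sketch: the `rommie_host` variable, the `/mcp` path, the task name, and the retry settings are assumptions and do not come from this commit. Only the port (20361) and the accepted status codes come from the message above. A 406 is presumably tolerated because an MCP endpoint may reject a plain GET that lacks the Accept header it expects while still proving the server is up.

```yaml
# Hypothetical health-check task; rommie_host and the /mcp path are assumed names.
- name: Wait for the Rommie MCP server to answer
  ansible.builtin.uri:
    url: "http://{{ rommie_host }}:20361/mcp"
    method: GET
    # Treat both 200 and 406 as "server is alive".
    status_code: [200, 406]
  register: rommie_health
  # Retry a few times so a slow service start does not fail the play.
  retries: 5
  delay: 3
  until: rommie_health is succeeded
```

Listing both codes in `status_code` lets the `uri` module itself decide success, so no extra `failed_when` logic is needed.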
@@ -79,8 +79,8 @@ pallas_log_level: INFO
 kottos_fastagent_log_level: info
 
 # LLM provider — the same OpenAI-compatible Qwen endpoint Kottos uses today.
-kottos_default_model: "openai.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"
-kottos_openai_base_url: "http://nyx.helu.ca:22079/v1"
+kottos_default_model: "openai.Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf"
+kottos_openai_base_url: "http://nyx.helu.ca:22072/v1"
 kottos_model_vision: true
 kottos_model_context_window: 192000
 kottos_model_max_output_tokens: 16384