docs(pallas): expand LLM preflight docs and refactor health probes
@@ -329,19 +329,24 @@ BEDROCK_API_KEY=your-bedrock-long-term-api-key
### Startup preflight

Pallas's `validate_llm_providers()` runs at startup and checks:
Pallas's `validate_llm_providers()` runs at startup and caches a status for the *active* provider (the one named by `default_model`). The cached value is read back by `get_health()` on every MCP `get_health` tool call, so Daedalus (or any headless consumer) can see *why* an agent is degraded when there's no fast-agent TUI to surface it.
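The active provider is the part of `default_model` before the first dot. A minimal sketch of that split, assuming the `<provider>.<model>` format; the helper name `split_default_model` is hypothetical and not part of `pallas.health`:

```python
from typing import Tuple


def split_default_model(default_model: str) -> Tuple[str, str]:
    """Split '<provider>.<model>' on the first dot only.

    Model names may themselves contain dots, so only the first one is used.
    A value without any dot has no provider prefix; this sketch files it
    under the placeholder provider name "unknown".
    """
    if "." not in default_model:
        return "unknown", default_model
    provider, model = default_model.split(".", 1)
    return provider, model


print(split_default_model("anthropic.claude-sonnet-4-5"))
```

Splitting on the first dot only keeps versioned model names such as `Qwen3.5-...` intact.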

| Provider | What is checked |

Preflight probes are deliberately chosen to be **free of inference tokens**. Each provider has a dedicated probe:

| Provider | Probe |
|---|---|
| `anthropic` | `GET {base_url}/v1/models/{model}`: confirms model exists and key is valid |
| `anthropic` (direct: `api.anthropic.com` or empty `base_url`) | `GET {base_url}/models/{model}`: confirms model exists and the API key is valid |
| `anthropic` (Mantle: `bedrock-mantle.{region}.api.aws/anthropic`) | `GET {region_root}/v1/models/{wire_model}`. Mantle serves its model catalogue at the **region root**, not under `/anthropic`; Pallas strips the `/anthropic` suffix and applies `pallas.mantle_shims.MANTLE_WIRE_NAMES` to turn `claude-opus-4-7` into `anthropic.claude-opus-4-7`. The IAM policy for the long-term Bedrock API key must include `bedrock-mantle:ListModels` / `bedrock-mantle:GetModel` for this probe to return 200. |
| `openai` | `GET {base_url}/models`: lists models, confirms configured model is present |
| `bedrock` | **No preflight check**: credential errors surface on the first inference call |
| `generic` | `GET {base_url}/models`: status-code-only probe (body is not inspected). llama.cpp's `/v1/models` response isn't strictly OpenAI-shaped and users hot-swap models by name, so a 200 is enough |
| `bedrock` | **No HTTP request.** `ok` when any of `AWS_BEARER_TOKEN_BEDROCK`, `AWS_ACCESS_KEY_ID`+`AWS_SECRET_ACCESS_KEY`, `AWS_PROFILE`, or `~/.aws/credentials` is present; `error` otherwise. Bedrock's Converse API has no cheap health endpoint and the first inference call will surface any real credential problem within seconds |
| Unknown / malformed provider | No HTTP request; `error: unknown provider 'X' in default_model`. A mistyped `default_model` is reported as an explicit error instead of a degraded status with no explanation |
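The probe table can be read as a small dispatch function. The sketch below is illustrative only: `probe_url` and `WIRE_NAMES` are hypothetical names, the Mantle detection is a crude substring check standing in for `pallas.mantle_shims.is_mantle_base_url`, and the wire-name map reproduces only the single mapping quoted in the table.

```python
from typing import Optional

# Hypothetical stand-in for pallas.mantle_shims.MANTLE_WIRE_NAMES; only the
# mapping mentioned in the table is included here.
WIRE_NAMES = {"claude-opus-4-7": "anthropic.claude-opus-4-7"}


def probe_url(provider: str, base_url: str, model: str) -> Optional[str]:
    """Return the token-free URL to GET, or None when no HTTP probe applies."""
    base = base_url.rstrip("/")
    if provider == "anthropic":
        if "bedrock-mantle." in base:  # assumption: crude Mantle detection
            # The catalogue lives at the region root, not under /anthropic.
            root = base[: -len("/anthropic")] if base.endswith("/anthropic") else base
            return f"{root}/v1/models/{WIRE_NAMES.get(model, model)}"
        return f"{base}/models/{model}"
    if provider in ("openai", "generic"):
        return f"{base}/models"
    # bedrock and unknown providers never issue an HTTP request.
    return None


print(probe_url("anthropic", "https://bedrock-mantle.us-east-1.api.aws/anthropic", "claude-opus-4-7"))
```

Every URL produced this way is a catalogue read, so a failed or repeated preflight never spends inference tokens.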

For the `bedrock` provider, startup will succeed even with missing or invalid credentials. The first agent call will raise a `ProviderKeyError` with a message directing you to configure AWS credentials.
API key resolution for every provider goes through `fast_agent.llm.provider_key_manager.ProviderKeyManager.get_api_key`, so the preflight reads keys from the exact same place the real LLM client does — config file, env var, Codex OAuth, HF hub, etc. Duplicate key-loading logic inside `pallas.health` has been removed.
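Before the key manager can look up `<provider>.api_key`, the config and secrets sections have to be combined into one dict with `${ENV_VAR}` placeholders expanded. A minimal sketch of that pre-processing follows; the helper names and the placeholder regex are assumptions made for illustration, and the actual key lookup remains the job of `ProviderKeyManager.get_api_key`.

```python
import os
import re
from typing import Any

# Assumed placeholder syntax: ${NAME}
_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def expand_env(value: Any) -> Any:
    """Recursively replace ${NAME} with os.environ['NAME'] (empty if unset)."""
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: os.environ.get(m.group(1), ""), value)
    if isinstance(value, dict):
        return {k: expand_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v) for v in value]
    return value


def merge_provider_settings(config: dict, secrets: dict) -> dict:
    """Merge per-provider sections (secrets win on conflicts), then expand."""
    merged: dict = {}
    for source in (config, secrets):
        for name, settings in source.items():
            if isinstance(settings, dict):  # skip scalars such as default_model
                merged.setdefault(name, {}).update(settings)
    return expand_env(merged)


os.environ["DEMO_ANTHROPIC_KEY"] = "sk-demo"
print(merge_provider_settings(
    {"anthropic": {"base_url": "https://api.anthropic.com/v1"}},
    {"anthropic": {"api_key": "${DEMO_ANTHROPIC_KEY}"}},
))
```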

### Runtime `get_health` tool

The `get_health` MCP tool probes downstream MCP servers regardless of which LLM provider is active. LLM provider health (from the startup preflight) is included in the response for `anthropic` and `openai` providers. For `bedrock`, the LLM section of the health response will be absent.
The `get_health` MCP tool probes downstream MCP servers on every call and includes the cached LLM preflight status in the response. If the active provider's cached status isn't `ok`, `get_health` returns `status: degraded` and appends `LLM: <provider>: <message>` to the `message` field, separated from any existing downstream message by `; `.
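How the cached preflight entry is folded into the downstream result can be sketched as follows. The function name `fold_llm_status` and its signature are hypothetical; the sketch only mirrors the behaviour described in this section and is not the module's own code.

```python
from typing import Optional


def fold_llm_status(result: dict, provider: str, llm_status: dict) -> dict:
    """Merge a cached preflight entry into a downstream-health result."""
    if not provider:
        return result
    entry: Optional[dict] = llm_status.get(provider)
    if entry is None:
        detail = "provider not preflighted"
    elif entry.get("status") != "ok":
        detail = entry.get("message", "unknown error")
    else:
        return result  # healthy provider: leave the downstream result as-is
    err_msg = f"LLM: {provider}: {detail}"
    existing = result.get("message", "")
    result["status"] = "degraded"
    result["message"] = f"{existing}; {err_msg}" if existing else err_msg
    return result


health = fold_llm_status(
    {"status": "ok", "message": ""},
    "openai",
    {"openai": {"status": "error", "message": "API request failed (401)"}},
)
print(health)
```

Appending rather than overwriting keeps any downstream MCP failure visible alongside the LLM failure.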

---

413
pallas/health.py
@@ -1,8 +1,32 @@
|
||||
"""
|
||||
Health check module for Pallas.
|
||||
|
||||
Probes downstream MCP server connectivity and exposes a get_health MCP tool.
|
||||
Validates LLM provider API keys and model availability at startup.
|
||||
Probes downstream MCP server connectivity and exposes a ``get_health`` MCP
|
||||
tool. At startup, :func:`validate_llm_providers` runs a cheap, per-provider
|
||||
preflight so Daedalus (or any headless consumer) can see *why* an agent
|
||||
might be degraded when there is no fast-agent TUI to surface it — see
|
||||
``docs/pallas_integration.md`` § Runtime get_health tool.
|
||||
|
||||
Preflight dispatch matrix:

========================== ======================================== =======================================
Provider                   Probe                                    Success criterion
========================== ======================================== =======================================
``anthropic`` (direct)     ``GET {base_url}/models/{model}``        HTTP 200
``anthropic`` (Mantle)     ``GET {mantle_root}/v1/models/{wire}``   HTTP 200 (wire-name via mantle_shims)
``openai``                 ``GET {base_url}/models``                HTTP 200 and active model in list
``generic``                ``GET {base_url}/models``                HTTP 200 (body not inspected)
``bedrock``                none                                     ok if AWS creds resolvable
unknown provider           none                                     error, surfaced honestly to Daedalus
========================== ======================================== =======================================
|
||||
|
||||
API keys are resolved via :class:`fast_agent.llm.provider_key_manager.ProviderKeyManager`
|
||||
so this module sees identical secret-loading behaviour to the real LLM
|
||||
client path. We never duplicate key-resolution logic here.
|
||||
|
||||
Endpoints that cost inference tokens are deliberately avoided. Mantle's
|
||||
anthropic probe uses the (token-free) model catalogue at the region root
|
||||
(``/v1/models/{wire}``), *not* ``POST /anthropic/v1/messages``.
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
@@ -12,6 +36,7 @@ import os
|
||||
import re
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
import httpx
|
||||
import yaml
|
||||
@@ -35,16 +60,20 @@ def _load_deployment_name() -> str:
|
||||
|
||||
_DEPLOY_NAME = _load_deployment_name()
|
||||
|
||||
# ── Provider API endpoints ───────────────────────────────────────────────────
|
||||
# ── Default endpoints (only used when the provider section is missing) ───────
|
||||
|
||||
_ANTHROPIC_API = "https://api.anthropic.com/v1"
|
||||
_ANTHROPIC_DEFAULT_API = "https://api.anthropic.com/v1"
|
||||
_OPENAI_DEFAULT_API = "https://api.openai.com/v1"
|
||||
_GENERIC_DEFAULT_API = "http://localhost:11434/v1"
|
||||
|
||||
# Populated by validate_llm_providers() at startup, read by get_health()
|
||||
_llm_status: dict[str, dict] = {}
|
||||
_active_provider: str = ""
|
||||
|
||||
|
||||
# ── Config loading ───────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def _load_dotenv() -> None:
|
||||
"""Load .env file into os.environ (without overwriting existing vars)."""
|
||||
env_path = _config_root() / ".env"
|
||||
@@ -70,6 +99,17 @@ def _expand_env(value: str) -> str:
|
||||
)
|
||||
|
||||
|
||||
def _expand_env_in_tree(obj: Any) -> Any:
    """Recursively expand ${ENV_VAR} placeholders inside a parsed YAML tree."""
    if isinstance(obj, str):
        return _expand_env(obj)
    if isinstance(obj, dict):
        return {k: _expand_env_in_tree(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [_expand_env_in_tree(v) for v in obj]
    return obj
|
||||
|
||||
|
||||
def _load_config() -> tuple[dict, dict]:
|
||||
"""Load fastagent config and secrets YAML from the working directory."""
|
||||
root = _config_root()
|
||||
@@ -79,11 +119,40 @@ def _load_config() -> tuple[dict, dict]:
|
||||
return config, secrets
|
||||
|
||||
|
||||
async def _check_anthropic(client: httpx.AsyncClient, api_key: str, model_id: str) -> str | None:
|
||||
"""Validate an Anthropic model. Returns None on success, error message on failure."""
|
||||
def _merge_for_key_manager(config: dict, secrets: dict) -> dict:
    """Produce the merged, env-expanded dict that ProviderKeyManager expects.

    ``ProviderKeyManager.get_config_file_key`` looks for ``<provider>.api_key``
    in a single flat dict. It does not apply ``${ENV_VAR}`` expansion itself,
    so we pre-expand the whole tree.
    """
    merged: dict = {}
    for source in (config or {}, secrets or {}):
        for provider_name, settings in (source or {}).items():
            if isinstance(settings, dict):
                target = merged.setdefault(provider_name, {})
                if isinstance(target, dict):
                    target.update(settings)
    return _expand_env_in_tree(merged)
|
||||
|
||||
|
||||
# ── Per-provider probes ──────────────────────────────────────────────────────
|
||||
|
||||
|
||||
async def _check_anthropic(
|
||||
client: httpx.AsyncClient, api_key: str, model_id: str, base_url: str
|
||||
) -> str | None:
|
||||
"""Validate an Anthropic model via ``GET {base_url}/models/{model_id}``.
|
||||
|
||||
Works for both the public ``api.anthropic.com`` endpoint and for AWS
|
||||
Bedrock Mantle's region-root model catalogue
|
||||
(``https://bedrock-mantle.{region}.api.aws/v1/models/{wire_id}``).
|
||||
The caller is responsible for passing the correct ``base_url`` and the
|
||||
wire-name form of ``model_id`` for Mantle — see ``pallas.mantle_shims``.
|
||||
"""
|
||||
try:
|
||||
resp = await client.get(
|
||||
f"{_ANTHROPIC_API}/models/{model_id}",
|
||||
f"{base_url.rstrip('/')}/models/{model_id}",
|
||||
headers={
|
||||
"x-api-key": api_key,
|
||||
"anthropic-version": "2023-06-01",
|
||||
@@ -98,24 +167,6 @@ async def _check_anthropic(client: httpx.AsyncClient, api_key: str, model_id: st
|
||||
return f"API request failed ({resp.status_code})"
|
||||
|
||||
|
||||
async def _check_openai(
|
||||
client: httpx.AsyncClient, api_key: str, model_id: str, base_url: str
|
||||
) -> str | None:
|
||||
"""Validate an OpenAI-compatible model. Returns None on success, error message on failure."""
|
||||
try:
|
||||
resp = await client.get(
|
||||
f"{base_url.rstrip('/')}/models/{model_id}",
|
||||
headers={"Authorization": f"Bearer {api_key}"},
|
||||
)
|
||||
except Exception as exc:
|
||||
return f"API unreachable ({type(exc).__name__})"
|
||||
if resp.status_code == 200:
|
||||
return None
|
||||
if resp.status_code == 404:
|
||||
return f"model '{model_id}' not found"
|
||||
return f"API request failed ({resp.status_code})"
|
||||
|
||||
|
||||
async def _list_openai_models(
|
||||
client: httpx.AsyncClient, api_key: str, base_url: str
|
||||
) -> tuple[str | None, list[str]]:
|
||||
@@ -129,96 +180,255 @@ async def _list_openai_models(
|
||||
return f"API unreachable ({type(exc).__name__})", []
|
||||
if resp.status_code != 200:
|
||||
return f"API request failed ({resp.status_code})", []
|
||||
try:
|
||||
data = resp.json()
|
||||
models = [m["id"] for m in data.get("data", []) if "id" in m]
|
||||
except Exception:
|
||||
return "response was not valid JSON", []
|
||||
models = [m["id"] for m in data.get("data", []) if isinstance(m, dict) and "id" in m]
|
||||
return None, models
|
||||
|
||||
|
||||
async def _check_generic(
    client: httpx.AsyncClient, base_url: str
) -> str | None:
    """Status-code-only probe against ``{base_url}/models``.

    The generic provider targets local/on-prem OpenAI-compatible servers
    (llama.cpp, Ollama, vLLM, …) whose ``/v1/models`` payloads are not all
    identical: llama.cpp mixes an Ollama-style ``models`` list with the
    OpenAI ``data`` list, for example. We deliberately don't require the
    configured model name to appear in the response because users hot-swap
    models by name all the time; as long as the server is up and returns
    200 for its catalogue we call it ok.
    """
    try:
        resp = await client.get(f"{base_url.rstrip('/')}/models")
    except Exception as exc:
        return f"API unreachable ({type(exc).__name__})"
    if resp.status_code == 200:
        return None
    return f"API request failed ({resp.status_code})"
|
||||
|
||||
|
||||
# ── Mantle helpers ───────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def _mantle_root_from_anthropic_base(base: str) -> str:
    """Return the region root for a Mantle anthropic base_url.

    Mantle publishes its inference path under ``/anthropic`` but the
    catalogue (``GET /v1/models/{wire_id}``) lives at the region root.
    Example:
        ``https://bedrock-mantle.us-east-1.api.aws/anthropic``
        → ``https://bedrock-mantle.us-east-1.api.aws``
    A base_url that does not end in ``/anthropic`` is returned unchanged,
    apart from stripped trailing slashes.
    """
    stripped = base.rstrip("/")
    if stripped.endswith("/anthropic"):
        return stripped[: -len("/anthropic")]
    return stripped
|
||||
|
||||
|
||||
# ── Preflight orchestration ──────────────────────────────────────────────────
|
||||
|
||||
|
||||
async def _preflight_anthropic(
    client: httpx.AsyncClient, config: dict, secrets: dict, active_model: str
) -> dict:
    from fast_agent.core.exceptions import ProviderKeyError
    from fast_agent.llm.provider_key_manager import ProviderKeyManager
    from pallas.mantle_shims import MANTLE_WIRE_NAMES, is_mantle_base_url

    merged = _merge_for_key_manager(config, secrets)
    try:
        api_key = ProviderKeyManager.get_api_key("anthropic", merged)
    except ProviderKeyError as exc:
        return {"status": "error", "message": str(exc)}

    anthropic_base = _expand_env(
        (config.get("anthropic", {}) or {}).get("base_url", "")
    ) or os.environ.get("ANTHROPIC_BASE_URL", "") or _ANTHROPIC_DEFAULT_API

    if is_mantle_base_url(anthropic_base):
        # Mantle hosts the model catalogue at the region root, not under
        # /anthropic. Wire-name translation (claude-opus-4-7 →
        # anthropic.claude-opus-4-7) keeps us consistent with mantle_shims.
        probe_base = f"{_mantle_root_from_anthropic_base(anthropic_base)}/v1"
        wire_id = MANTLE_WIRE_NAMES.get(active_model, active_model)
        err = await _check_anthropic(client, api_key, wire_id, probe_base)
        if err:
            logger.warning("anthropic (mantle, %s): %s", anthropic_base, err)
            return {"status": "error", "model": wire_id, "message": err}
        logger.info("anthropic (mantle, %s): %s ready", anthropic_base, wire_id)
        return {"status": "ok", "model": wire_id}

    err = await _check_anthropic(client, api_key, active_model, anthropic_base)
    if err:
        logger.warning("anthropic (%s): %s", anthropic_base, err)
        return {"status": "error", "model": active_model, "message": err}
    logger.info("anthropic (%s): %s ready", anthropic_base, active_model)
    return {"status": "ok", "model": active_model}
|
||||
|
||||
|
||||
async def _preflight_openai(
    client: httpx.AsyncClient, config: dict, secrets: dict, active_model: str
) -> dict:
    from fast_agent.core.exceptions import ProviderKeyError
    from fast_agent.llm.provider_key_manager import ProviderKeyManager

    merged = _merge_for_key_manager(config, secrets)
    try:
        api_key = ProviderKeyManager.get_api_key("openai", merged)
    except ProviderKeyError as exc:
        return {"status": "error", "message": str(exc)}

    openai_base = _expand_env(
        (config.get("openai", {}) or {}).get("base_url", "")
    ) or os.environ.get("OPENAI_BASE_URL", "") or _OPENAI_DEFAULT_API

    err, models = await _list_openai_models(client, api_key, openai_base)
    if err:
        logger.warning("openai (%s): %s", openai_base, err)
        return {"status": "error", "message": err}
    if active_model and active_model not in models:
        label = ", ".join(models) if models else "none"
        msg = f"model '{active_model}' not found (available: {label})"
        logger.warning("openai (%s): %s", openai_base, msg)
        return {"status": "error", "model": active_model, "message": msg}
    logger.info("openai (%s): %s ready", openai_base, active_model or "(any)")
    return {"status": "ok", "model": active_model, "models": models}
|
||||
|
||||
|
||||
async def _preflight_generic(
    client: httpx.AsyncClient, config: dict, secrets: dict, active_model: str
) -> dict:
    # generic has a synthetic "ollama" key via ProviderKeyManager, so there's
    # nothing to authenticate against; we just need the endpoint.
    generic_base = _expand_env(
        (config.get("generic", {}) or {}).get("base_url", "")
    ) or os.environ.get("GENERIC_BASE_URL", "") or _GENERIC_DEFAULT_API

    err = await _check_generic(client, generic_base)
    if err:
        logger.warning("generic (%s): %s", generic_base, err)
        return {"status": "error", "model": active_model, "message": err}
    logger.info("generic (%s): %s ready", generic_base, active_model or "(any)")
    return {"status": "ok", "model": active_model}
|
||||
|
||||
|
||||
def _preflight_bedrock(config: dict, secrets: dict, active_model: str) -> dict:
    """Bedrock uses the AWS credential chain, so no outbound HTTP is issued here.

    We report ``ok`` whenever any of the usual credential sources is present
    (long-term Bedrock key, explicit access key pair, a nonempty AWS profile,
    or a ``~/.aws/credentials`` file). If none is found we report ``error``
    so Daedalus shows the operator *why* the first real request will fail;
    we don't actually call STS or Bedrock ourselves.
    """
    have_bearer = bool(os.environ.get("AWS_BEARER_TOKEN_BEDROCK"))
    have_access_key = bool(os.environ.get("AWS_ACCESS_KEY_ID")) and bool(
        os.environ.get("AWS_SECRET_ACCESS_KEY")
    )
    have_profile = bool(os.environ.get("AWS_PROFILE"))
    creds_path = Path.home() / ".aws" / "credentials"
    have_file = creds_path.exists()

    if have_bearer or have_access_key or have_profile or have_file:
        logger.info("bedrock: credentials resolvable (no preflight request issued)")
        return {"status": "ok", "model": active_model}

    msg = "no AWS credentials found (set AWS_BEARER_TOKEN_BEDROCK or configure AWS CLI)"
    logger.warning("bedrock: %s", msg)
    return {"status": "error", "model": active_model, "message": msg}
|
||||
|
||||
|
||||
async def validate_llm_providers(timeout: float = 5.0) -> dict[str, dict]:
|
||||
"""
|
||||
Validate configured LLM provider API keys and model availability.
|
||||
Validate the configured LLM provider and populate the module-level cache
|
||||
read by :func:`get_health` on every MCP ``get_health`` tool call.
|
||||
|
||||
Reads fastagent.config.yaml for default_model and fastagent.secrets.yaml
|
||||
for API keys. Checks all providers that have keys configured.
|
||||
Only the *active* provider (the one named by ``default_model``) is
|
||||
preflighted — that's the one whose failure would actually break the
|
||||
agent, and it keeps the startup surface small. Other provider sections
|
||||
are ignored here even if they're configured.
|
||||
|
||||
Returns a dict keyed by provider name with validation results.
|
||||
Returns a dict keyed by provider name with validation results. Shape:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
{"anthropic": {"status": "ok", "model": "anthropic.claude-opus-4-7"}}
|
||||
{"generic": {"status": "ok", "model": "Qwen3.5-..."}}
|
||||
{"openai": {"status": "error", "message": "API request failed (401)"}}
|
||||
{"unknown": {"status": "error", "message": "unknown provider 'foo'"}}
|
||||
"""
|
||||
global _active_provider
|
||||
|
||||
_load_dotenv()
|
||||
config, secrets = _load_config()
|
||||
default_model = config.get("default_model", "")
|
||||
default_model = config.get("default_model", "") or ""
|
||||
|
||||
# Parse provider and model from "provider.model-name" format
|
||||
active_provider = default_model.split(".")[0] if "." in default_model else ""
|
||||
active_model = default_model.split(".", 1)[1] if "." in default_model else default_model
|
||||
|
||||
# Resolve API keys from secrets (expanding ${ENV_VAR} references), falling
|
||||
# back to env vars directly so that .env alone is sufficient.
|
||||
anthropic_key = _expand_env(secrets.get("anthropic", {}).get("api_key", "")) or os.environ.get("ANTHROPIC_API_KEY", "")
|
||||
openai_key = _expand_env(secrets.get("openai", {}).get("api_key", "")) or os.environ.get("OPENAI_API_KEY", "")
|
||||
openai_base = (
|
||||
_expand_env(secrets.get("openai", {}).get("base_url", ""))
|
||||
or config.get("openai", {}).get("base_url", "")
|
||||
or os.environ.get("OPENAI_BASE_URL", "")
|
||||
or _OPENAI_DEFAULT_API
|
||||
if "." not in default_model:
|
||||
msg = (
|
||||
f"default_model '{default_model}' is missing a provider prefix "
|
||||
"(expected '<provider>.<model>')"
|
||||
)
|
||||
logger.warning(msg)
|
||||
results = {"unknown": {"status": "error", "message": msg}}
|
||||
_llm_status.clear()
|
||||
_llm_status.update(results)
|
||||
_active_provider = "unknown"
|
||||
return results
|
||||
|
||||
active_provider, active_model = default_model.split(".", 1)
|
||||
|
||||
results: dict[str, dict] = {}
|
||||
|
||||
async with httpx.AsyncClient(timeout=timeout) as client:
|
||||
# ── Anthropic ────────────────────────────────────────────────────
|
||||
if anthropic_key:
|
||||
model_id = active_model if active_provider == "anthropic" else None
|
||||
if model_id:
|
||||
err = await _check_anthropic(client, anthropic_key, model_id)
|
||||
if err:
|
||||
results["anthropic"] = {"status": "error", "model": model_id, "message": err}
|
||||
logger.warning("anthropic: %s", err)
|
||||
else:
|
||||
results["anthropic"] = {"status": "ok", "model": model_id}
|
||||
logger.info("anthropic: %s ready", model_id)
|
||||
else:
|
||||
# Key is set but Anthropic isn't the active provider — just verify API access
|
||||
err = await _check_anthropic(client, anthropic_key, "claude-sonnet-4-5")
|
||||
if err and "not found" not in err:
|
||||
results["anthropic"] = {"status": "error", "message": err}
|
||||
logger.warning("anthropic: %s", err)
|
||||
else:
|
||||
results["anthropic"] = {"status": "ok"}
|
||||
logger.info("anthropic: API key valid")
|
||||
elif active_provider == "anthropic":
|
||||
results["anthropic"] = {"status": "error", "message": "API key not configured"}
|
||||
logger.warning("anthropic: API key not configured")
|
||||
|
||||
# ── OpenAI ───────────────────────────────────────────────────────
|
||||
if openai_key:
|
||||
model_id = active_model if active_provider == "openai" else None
|
||||
err, models = await _list_openai_models(client, openai_key, openai_base)
|
||||
if err:
|
||||
results["openai"] = {"status": "error", "message": err}
|
||||
logger.warning("openai (%s): %s", openai_base, err)
|
||||
elif model_id:
|
||||
if model_id in models:
|
||||
results["openai"] = {"status": "ok", "model": model_id}
|
||||
logger.info("openai (%s): %s ready", openai_base, model_id)
|
||||
else:
|
||||
label = ", ".join(models) if models else "none"
|
||||
results["openai"] = {"status": "error", "model": model_id, "message": f"model '{model_id}' not found (available: {label})"}
|
||||
logger.warning("openai (%s): model '%s' not found (available: %s)", openai_base, model_id, label)
|
||||
else:
|
||||
results["openai"] = {"status": "ok", "models": models}
|
||||
label = ", ".join(models) if models else "no models loaded"
|
||||
logger.info("openai (%s): %s", openai_base, label)
|
||||
if active_provider == "anthropic":
|
||||
results["anthropic"] = await _preflight_anthropic(
|
||||
client, config, secrets, active_model
|
||||
)
|
||||
elif active_provider == "openai":
|
||||
results["openai"] = {"status": "error", "message": "API key not configured"}
|
||||
logger.warning("openai: API key not configured")
|
||||
results["openai"] = await _preflight_openai(
|
||||
client, config, secrets, active_model
|
||||
)
|
||||
elif active_provider == "generic":
|
||||
results["generic"] = await _preflight_generic(
|
||||
client, config, secrets, active_model
|
||||
)
|
||||
elif active_provider == "bedrock":
|
||||
results["bedrock"] = _preflight_bedrock(config, secrets, active_model)
|
||||
else:
|
||||
# Known to fast-agent? Surface that gap explicitly rather than
|
||||
# silently reporting "error" from an empty dict lookup later.
|
||||
try:
|
||||
from fast_agent.llm.provider_types import Provider
|
||||
|
||||
Provider(active_provider) # raises ValueError if unknown
|
||||
msg = (
|
||||
f"preflight for provider '{active_provider}' is not "
|
||||
"implemented in pallas.health; LLM health will be "
|
||||
"validated on first inference call"
|
||||
)
|
||||
logger.info("%s: %s", active_provider, msg)
|
||||
results[active_provider] = {
|
||||
"status": "ok",
|
||||
"model": active_model,
|
||||
"message": msg,
|
||||
}
|
||||
except ValueError:
|
||||
msg = f"unknown provider '{active_provider}' in default_model"
|
||||
logger.warning(msg)
|
||||
results[active_provider] = {"status": "error", "message": msg}
|
||||
|
||||
_llm_status.clear()
|
||||
_llm_status.update(results)
|
||||
global _active_provider
|
||||
_active_provider = active_provider
|
||||
return results
|
||||
|
||||
|
||||
# ── Downstream MCP server probing ────────────────────────────────────────────
|
||||
|
||||
|
||||
async def check_downstream_health(
|
||||
servers: dict[str, dict], timeout: float = 3.0
|
||||
) -> dict:
|
||||
@@ -313,9 +523,24 @@ def register_health_tool(mcp_server, servers: dict[str, dict]) -> None:
|
||||
async def get_health() -> str:
|
||||
result = await check_downstream_health(servers)
|
||||
# Include LLM provider status from startup preflight (active provider only)
|
||||
active = _llm_status.get(_active_provider, {})
|
||||
if active.get("status") != "ok" and _active_provider:
|
||||
err_msg = f"LLM: {_active_provider}: {active.get('message', 'error')}"
|
||||
if _active_provider:
|
||||
active = _llm_status.get(_active_provider)
|
||||
if active is None:
|
||||
# Should be unreachable after the rewrite (validate_llm_providers
|
||||
# always populates _llm_status for _active_provider). Keep a
|
||||
# belt-and-braces path so a future refactor can't regress into
|
||||
# silently reporting "error".
|
||||
err_msg = (
|
||||
f"LLM: {_active_provider}: provider not preflighted"
|
||||
)
|
||||
result["status"] = "degraded"
|
||||
existing = result.get("message", "")
|
||||
result["message"] = f"{existing}; {err_msg}" if existing else err_msg
|
||||
elif active.get("status") != "ok":
|
||||
err_msg = (
|
||||
f"LLM: {_active_provider}: "
|
||||
f"{active.get('message', 'unknown error')}"
|
||||
)
|
||||
result["status"] = "degraded"
|
||||
existing = result.get("message", "")
|
||||
result["message"] = f"{existing}; {err_msg}" if existing else err_msg
|
||||
|
||||
522
tests/test_health.py
Normal file
@@ -0,0 +1,522 @@
|
||||
"""Tests for pallas.health — per-provider preflight dispatch.
|
||||
|
||||
Covers the matrix documented in ``pallas/pallas/health.py``:
|
||||
|
||||
- ``anthropic`` (direct, Mantle)
|
||||
- ``openai``
|
||||
- ``generic``
|
||||
- ``bedrock`` (presence-only, no HTTP)
|
||||
- unknown / malformed provider name
|
||||
|
||||
All HTTP is faked with ``httpx.MockTransport`` so nothing touches the network.
|
||||
Tests use ``asyncio.run`` directly to match the existing convention in
|
||||
``tests/test_mantle_shims.py`` (pallas has no pytest-asyncio dependency).
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
from pathlib import Path
|
||||
|
||||
import httpx
|
||||
import pytest
|
||||
|
||||
from pallas import health
|
||||
|
||||
|
||||
def _run(coro):
|
||||
return asyncio.run(coro)
|
||||
|
||||
|
||||
# ── Helpers ──────────────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def _patch_httpx(monkeypatch: pytest.MonkeyPatch, handler) -> None:
    """Replace ``health.httpx.AsyncClient`` so validate_llm_providers uses the mock."""
    original_client = httpx.AsyncClient

    def patched_client(*args, **kwargs):
        kwargs["transport"] = httpx.MockTransport(handler)
        return original_client(*args, **kwargs)

    monkeypatch.setattr(health.httpx, "AsyncClient", patched_client)
|
||||
|
||||
|
||||
def _patch_httpx_raising(monkeypatch: pytest.MonkeyPatch) -> None:
    """Install a transport that raises on any request; used to prove that
    bedrock / unknown paths make no HTTP call at all."""
    def handler(request: httpx.Request) -> httpx.Response:
        raise AssertionError(
            f"no HTTP call should be made, but got {request.method} {request.url}"
        )

    _patch_httpx(monkeypatch, handler)
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def workspace(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> Path:
|
||||
"""Chdir into a clean temp workspace and isolate env variables.
|
||||
|
||||
``validate_llm_providers`` reads ``fastagent.config.yaml`` /
|
||||
``fastagent.secrets.yaml`` from cwd and also consults env vars for
|
||||
fallback; each test starts with a clean slate.
|
||||
"""
|
||||
monkeypatch.chdir(tmp_path)
|
||||
for var in (
|
||||
"ANTHROPIC_API_KEY",
|
||||
"ANTHROPIC_BASE_URL",
|
||||
"OPENAI_API_KEY",
|
||||
"OPENAI_BASE_URL",
|
||||
"GENERIC_API_KEY",
|
||||
"GENERIC_BASE_URL",
|
||||
"AWS_BEARER_TOKEN_BEDROCK",
|
||||
"AWS_ACCESS_KEY_ID",
|
||||
"AWS_SECRET_ACCESS_KEY",
|
||||
"AWS_PROFILE",
|
||||
):
|
||||
monkeypatch.delenv(var, raising=False)
|
||||
return tmp_path
|
||||
|
||||
|
||||
# ── _mantle_root_from_anthropic_base ────────────────────────────────────────
|
||||
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
"base,expected",
|
||||
[
|
||||
(
|
||||
"https://bedrock-mantle.us-east-1.api.aws/anthropic",
|
||||
"https://bedrock-mantle.us-east-1.api.aws",
|
||||
),
|
||||
(
|
||||
"https://bedrock-mantle.us-east-1.api.aws/anthropic/",
|
||||
"https://bedrock-mantle.us-east-1.api.aws",
|
||||
),
|
||||
(
|
||||
"https://bedrock-mantle.us-east-1.api.aws",
|
||||
"https://bedrock-mantle.us-east-1.api.aws",
|
||||
),
|
||||
(
|
||||
"https://example.com/proxy/anthropic",
|
||||
"https://example.com/proxy",
|
||||
),
|
||||
],
|
||||
)
|
||||
def test_mantle_root_from_anthropic_base(base: str, expected: str) -> None:
|
||||
assert health._mantle_root_from_anthropic_base(base) == expected


# ── _check_anthropic (direct + Mantle share this probe) ──────────────────────


def test_check_anthropic_success_direct() -> None:
    captured: list[httpx.Request] = []

    def handler(request: httpx.Request) -> httpx.Response:
        captured.append(request)
        return httpx.Response(200, json={"id": "claude-sonnet-4-5"})

    async def go() -> str | None:
        async with httpx.AsyncClient(transport=httpx.MockTransport(handler)) as client:
            return await health._check_anthropic(
                client,
                "sk-ant-real",
                "claude-sonnet-4-5",
                "https://api.anthropic.com/v1",
            )

    assert _run(go()) is None
    assert str(captured[0].url) == "https://api.anthropic.com/v1/models/claude-sonnet-4-5"
    assert captured[0].headers["x-api-key"] == "sk-ant-real"
    assert captured[0].headers["anthropic-version"] == "2023-06-01"


def test_check_anthropic_success_mantle_root() -> None:
    captured: list[httpx.Request] = []

    def handler(request: httpx.Request) -> httpx.Response:
        captured.append(request)
        return httpx.Response(200, json={"id": "anthropic.claude-opus-4-7"})

    async def go() -> str | None:
        async with httpx.AsyncClient(transport=httpx.MockTransport(handler)) as client:
            return await health._check_anthropic(
                client,
                "sk-bedrock-fake",
                "anthropic.claude-opus-4-7",
                "https://bedrock-mantle.us-east-1.api.aws/v1",
            )

    assert _run(go()) is None
    # Must hit the Mantle region root, not `/anthropic/v1/models/...`.
    assert str(captured[0].url) == (
        "https://bedrock-mantle.us-east-1.api.aws/v1"
        "/models/anthropic.claude-opus-4-7"
    )


def test_check_anthropic_401() -> None:
    def handler(request: httpx.Request) -> httpx.Response:
        return httpx.Response(401, json={"error": "invalid_api_key"})

    async def go() -> str | None:
        async with httpx.AsyncClient(transport=httpx.MockTransport(handler)) as client:
            return await health._check_anthropic(
                client,
                "bad-key",
                "claude-sonnet-4-5",
                "https://api.anthropic.com/v1",
            )

    assert _run(go()) == "API request failed (401)"


def test_check_anthropic_404_model_missing() -> None:
    def handler(request: httpx.Request) -> httpx.Response:
        return httpx.Response(404, json={})

    async def go() -> str | None:
        async with httpx.AsyncClient(transport=httpx.MockTransport(handler)) as client:
            return await health._check_anthropic(
                client,
                "key",
                "claude-foo",
                "https://api.anthropic.com/v1",
            )

    assert _run(go()) == "model 'claude-foo' not found"


# ── validate_llm_providers: anthropic direct ─────────────────────────────────


def test_validate_anthropic_direct_ok(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: anthropic.claude-sonnet-4-5\n"
    )
    (workspace / "fastagent.secrets.yaml").write_text(
        'anthropic:\n api_key: "sk-ant-real"\n'
    )

    captured: list[httpx.Request] = []

    def handler(request: httpx.Request) -> httpx.Response:
        captured.append(request)
        return httpx.Response(200, json={"id": "claude-sonnet-4-5"})

    _patch_httpx(monkeypatch, handler)

    assert _run(health.validate_llm_providers(timeout=1.0)) == {
        "anthropic": {"status": "ok", "model": "claude-sonnet-4-5"}
    }
    assert str(captured[0].url) == "https://api.anthropic.com/v1/models/claude-sonnet-4-5"


def test_validate_anthropic_missing_key(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: anthropic.claude-sonnet-4-5\n"
    )
    # No secrets file at all → ProviderKeyManager raises ProviderKeyError.
    _patch_httpx_raising(monkeypatch)

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results["anthropic"]["status"] == "error"
    assert "API key" in results["anthropic"]["message"]


# ── validate_llm_providers: anthropic via Mantle ─────────────────────────────


def test_validate_anthropic_mantle_uses_region_root(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: anthropic.claude-opus-4-7\n"
        "anthropic:\n"
        ' base_url: "https://bedrock-mantle.us-east-1.api.aws/anthropic"\n'
    )
    (workspace / "fastagent.secrets.yaml").write_text(
        'anthropic:\n api_key: "sk-bedrock-fake"\n'
    )

    captured: list[httpx.Request] = []

    def handler(request: httpx.Request) -> httpx.Response:
        captured.append(request)
        return httpx.Response(200, json={"id": "anthropic.claude-opus-4-7"})

    _patch_httpx(monkeypatch, handler)

    assert _run(health.validate_llm_providers(timeout=1.0)) == {
        "anthropic": {"status": "ok", "model": "anthropic.claude-opus-4-7"}
    }
    # Must strip the `/anthropic` suffix AND apply the wire-name prefix.
    assert str(captured[0].url) == (
        "https://bedrock-mantle.us-east-1.api.aws/v1"
        "/models/anthropic.claude-opus-4-7"
    )
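
The URL asserted in this test combines two steps: stripping the `/anthropic` suffix to reach the region root, and translating the short model name into its Mantle wire name. The sketch below shows one way those steps could compose. `WIRE_NAMES` and `mantle_probe_url` are hypothetical names; the README mentions `pallas.mantle_shims.MANTLE_WIRE_NAMES`, but treating it as a plain dict is an assumption made here for illustration.

```python
# Hypothetical stand-in for pallas.mantle_shims.MANTLE_WIRE_NAMES (shape assumed).
WIRE_NAMES: dict[str, str] = {
    "claude-opus-4-7": "anthropic.claude-opus-4-7",
}


def mantle_probe_url(base_url: str, model: str) -> str:
    """Build the model-lookup URL at the Mantle region root (illustrative only)."""
    # Step 1: drop a trailing slash and the "/anthropic" suffix to get the root.
    root = base_url.rstrip("/").removesuffix("/anthropic")
    # Step 2: map the short model name to its wire name; unknown names pass through.
    wire_model = WIRE_NAMES.get(model, model)
    return f"{root}/v1/models/{wire_model}"


if __name__ == "__main__":
    print(
        mantle_probe_url(
            "https://bedrock-mantle.us-east-1.api.aws/anthropic",
            "claude-opus-4-7",
        )
    )
```

Falling back to the unmapped name keeps the function usable for models that already arrive in wire form.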


def test_validate_anthropic_mantle_401(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: anthropic.claude-opus-4-7\n"
        "anthropic:\n"
        ' base_url: "https://bedrock-mantle.us-east-1.api.aws/anthropic"\n'
    )
    (workspace / "fastagent.secrets.yaml").write_text(
        'anthropic:\n api_key: "sk-bogus"\n'
    )

    captured: list[httpx.Request] = []

    def handler(request: httpx.Request) -> httpx.Response:
        captured.append(request)
        return httpx.Response(401, json={"error": "unauthorized"})

    _patch_httpx(monkeypatch, handler)

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results == {
        "anthropic": {
            "status": "error",
            "model": "anthropic.claude-opus-4-7",
            "message": "API request failed (401)",
        }
    }
    assert "bedrock-mantle" in str(captured[0].url)


# ── validate_llm_providers: openai ───────────────────────────────────────────


def test_validate_openai_model_in_list(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: openai.gpt-4o-mini\n"
    )
    (workspace / "fastagent.secrets.yaml").write_text(
        'openai:\n api_key: "sk-openai-real"\n'
    )

    def handler(request: httpx.Request) -> httpx.Response:
        return httpx.Response(
            200,
            json={"data": [{"id": "gpt-4o-mini"}, {"id": "gpt-4o"}]},
        )

    _patch_httpx(monkeypatch, handler)

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results["openai"]["status"] == "ok"
    assert results["openai"]["model"] == "gpt-4o-mini"


def test_validate_openai_model_missing_from_list(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: openai.gpt-nonexistent\n"
    )
    (workspace / "fastagent.secrets.yaml").write_text(
        'openai:\n api_key: "sk-openai-real"\n'
    )

    def handler(request: httpx.Request) -> httpx.Response:
        return httpx.Response(200, json={"data": [{"id": "gpt-4o-mini"}]})

    _patch_httpx(monkeypatch, handler)

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results["openai"]["status"] == "error"
    assert "gpt-nonexistent" in results["openai"]["message"]
    assert "gpt-4o-mini" in results["openai"]["message"]  # includes available list


# ── validate_llm_providers: generic ──────────────────────────────────────────


def test_validate_generic_ok_regardless_of_body(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    """llama.cpp returns a non-OpenAI-shaped ``/v1/models`` payload; we only
    care that the endpoint responds 200."""
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf\n"
        "generic:\n"
        ' base_url: "http://nyx.helu.ca:22079/v1"\n'
    )
    # generic requires no api_key; no secrets file needed.

    captured: list[httpx.Request] = []

    def handler(request: httpx.Request) -> httpx.Response:
        captured.append(request)
        # Match llama.cpp's shape (Ollama-style `models` alongside OpenAI `data`).
        return httpx.Response(
            200,
            json={
                "models": [{"name": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"}],
                "object": "list",
                "data": [{"id": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"}],
            },
        )

    _patch_httpx(monkeypatch, handler)

    assert _run(health.validate_llm_providers(timeout=1.0)) == {
        "generic": {
            "status": "ok",
            "model": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf",
        }
    }
    assert str(captured[0].url) == "http://nyx.helu.ca:22079/v1/models"


def test_validate_generic_unreachable(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf\n"
        "generic:\n"
        ' base_url: "http://nyx.helu.ca:22079/v1"\n'
    )

    def handler(request: httpx.Request) -> httpx.Response:
        raise httpx.ConnectError("connection refused")

    _patch_httpx(monkeypatch, handler)

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results["generic"]["status"] == "error"
    assert "unreachable" in results["generic"]["message"].lower()


def test_validate_generic_503(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf\n"
        "generic:\n"
        ' base_url: "http://nyx.helu.ca:22079/v1"\n'
    )

    def handler(request: httpx.Request) -> httpx.Response:
        return httpx.Response(503)

    _patch_httpx(monkeypatch, handler)

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results["generic"] == {
        "status": "error",
        "model": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf",
        "message": "API request failed (503)",
    }


# ── validate_llm_providers: bedrock (no HTTP) ────────────────────────────────


def test_validate_bedrock_ok_with_bearer(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: bedrock.anthropic.claude-sonnet-4-6\n"
    )
    monkeypatch.setenv("AWS_BEARER_TOKEN_BEDROCK", "abs-fake")
    _patch_httpx_raising(monkeypatch)  # any HTTP call is a test failure

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results["bedrock"]["status"] == "ok"
    assert results["bedrock"]["model"] == "anthropic.claude-sonnet-4-6"


def test_validate_bedrock_no_credentials(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: bedrock.anthropic.claude-sonnet-4-6\n"
    )
    # The real user has an ~/.aws/credentials file which would cause a false
    # positive; redirect HOME so Path.home() / ".aws" does not exist.
    monkeypatch.setenv("HOME", str(workspace))
    # Likewise clear the bearer token so a value exported in the developer's
    # shell cannot make the credential check pass.
    monkeypatch.delenv("AWS_BEARER_TOKEN_BEDROCK", raising=False)
    _patch_httpx_raising(monkeypatch)

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results["bedrock"]["status"] == "error"
    assert "AWS credentials" in results["bedrock"]["message"]


# ── validate_llm_providers: malformed / unknown provider ─────────────────────


def test_validate_default_model_missing_prefix(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text("default_model: just-a-name\n")
    _patch_httpx_raising(monkeypatch)

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results["unknown"]["status"] == "error"
    assert "provider prefix" in results["unknown"]["message"]


def test_validate_unknown_provider(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: imaginary.some-model\n"
    )
    _patch_httpx_raising(monkeypatch)

    results = _run(health.validate_llm_providers(timeout=1.0))
    assert results["imaginary"]["status"] == "error"
    assert "unknown provider" in results["imaginary"]["message"]


# ── get_health() payload ─────────────────────────────────────────────────────


def test_get_health_reports_generic_ok(
    workspace: Path, monkeypatch: pytest.MonkeyPatch
) -> None:
    """End-to-end: after a successful generic preflight, get_health() should
    return status=ok with no LLM error in the message. This is the exact
    regression case that was showing up as ``LLM: generic: error`` in Daedalus.
    """
    (workspace / "fastagent.config.yaml").write_text(
        "default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf\n"
        "generic:\n"
        ' base_url: "http://nyx.helu.ca:22079/v1"\n'
    )

    def handler(request: httpx.Request) -> httpx.Response:
        return httpx.Response(200, json={"data": []})

    _patch_httpx(monkeypatch, handler)
    _run(health.validate_llm_providers(timeout=1.0))

    # Simulate the MCP `get_health` tool by calling check_downstream_health
    # with an empty server map and composing the message the same way
    # register_health_tool does.
    async def call() -> dict:
        result = await health.check_downstream_health({}, timeout=1.0)
        active = health._llm_status.get(health._active_provider)
        if active is not None and active.get("status") != "ok":
            result["status"] = "degraded"
            existing = result.get("message", "")
            msg = (
                f"LLM: {health._active_provider}: "
                f"{active.get('message', 'unknown error')}"
            )
            result["message"] = f"{existing}; {msg}" if existing else msg
        return result

    final = _run(call())
    assert final["status"] == "ok"
    assert "LLM" not in final.get("message", "")