docs: rewrite README with structured overview and quick start guide

Replaces the minimal project description with a comprehensive README
including a component overview table, quick start instructions, common
Ansible operations, and links to detailed documentation. Aligns with
Red Panda Approval™ standards.
2026-03-03 12:49:06 +00:00
parent c7be03a743
commit b4d60f2f38
219 changed files with 34586 additions and 2 deletions

docs/kernos.md Normal file

@@ -0,0 +1,202 @@
# Kernos Service Documentation
Kernos is an HTTP-enabled MCP shell server built on FastMCP. It wraps the existing `mcp-shell-server` execution logic with FastMCP's HTTP transport so that remote AI agents can reach it over the network.
## Overview
| Property | Value |
|----------|-------|
| **Host** | caliban.incus |
| **Port** | 22021 |
| **Service Type** | Systemd service (non-Docker) |
| **Repository** | `ssh://robert@clio.helu.ca:18677/mnt/dev/kernos` |
## Features
- **HTTP Transport**: Accessible via URL instead of stdio
- **Health Endpoints**: `/live`, `/ready`, `/health` for Kubernetes-style probes
- **Prometheus Metrics**: `/metrics` endpoint for monitoring
- **JSON Structured Logging**: Production-ready log format with correlation IDs
- **Inherited Security**: Command whitelisting and validation inherited from `mcp-shell-server`
## Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/mcp/` | POST | MCP protocol endpoint (FastMCP handles this) |
| `/live` | GET | Liveness probe - always returns 200 |
| `/ready` | GET | Readiness probe - checks executor and config |
| `/health` | GET | Combined health check |
| `/metrics` | GET | Prometheus metrics (text/plain) or JSON |
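
The probe endpoints above can be polled from a script. The following is a minimal sketch using only the Python standard library; the helper names `http_status` and `check_probes` are hypothetical, and treating HTTP 200 as "healthy" for `/ready` and `/health` is an assumption (the table only states it for `/live`).

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

BASE_URL = "http://caliban.incus:22021"
PROBE_PATHS = ("/live", "/ready", "/health")


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # Non-2xx answers still carry a status code worth reporting.
        return exc.code
    except (urllib.error.URLError, OSError):
        return 0


def check_probes(
    base_url: str = BASE_URL,
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each probe path to True when it answers with HTTP 200 (assumed healthy)."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in PROBE_PATHS}
```

Calling `check_probes()` with no arguments queries the host from the Overview table. The `fetch` parameter exists only so the logic can be exercised without network access.
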
## Ansible Playbooks
### Stage Playbook
```bash
ansible-playbook kernos/stage.yml
```
Fetches the Kernos repository from clio and creates a release tarball at `~/rel/kernos_{{kernos_rel}}.tar`.
### Deploy Playbook
```bash
ansible-playbook kernos/deploy.yml
```
Deploys Kernos to caliban.incus:
1. Creates kernos user/group
2. Creates `/srv/kernos` directory
3. Transfers and extracts the staged tarball
4. Creates Python virtual environment
5. Installs package dependencies
6. Templates `.env` configuration
7. Templates systemd service file
8. Enables and starts the service
9. Validates health endpoints
## Configuration Variables
### Host Variables (`ansible/inventory/host_vars/caliban.incus.yml`)
| Variable | Default | Description |
|----------|---------|-------------|
| `kernos_user` | `kernos` | System user for the service |
| `kernos_group` | `kernos` | System group for the service |
| `kernos_directory` | `/srv/kernos` | Installation directory |
| `kernos_port` | `22021` | HTTP server port |
| `kernos_host` | `0.0.0.0` | Server bind address |
| `kernos_log_level` | `INFO` | Python log level |
| `kernos_log_format` | `json` | Log format (`json` or `text`) |
| `kernos_environment` | `production` | Environment name for logging |
| `kernos_allow_commands` | (see below) | Comma-separated command whitelist |
### Global Variables (`ansible/inventory/group_vars/all/vars.yml`)
| Variable | Default | Description |
|----------|---------|-------------|
| `kernos_rel` | `master` | Git branch/tag for staging |
## Allowed Commands
The following commands are whitelisted for execution:
```
ls, cat, head, tail, grep, find, wc, file, stat, mkdir, touch, cp, mv, rm,
chmod, pwd, tree, du, df, sed, awk, sort, uniq, cut, tr, tee, curl, wget,
ping, nc, dig, host, ps, pgrep, kill, pkill, nohup, timeout, python3, pip,
node, npm, npx, pnpm, git, make, tar, gzip, gunzip, zip, unzip, whoami, id,
uname, hostname, date, uptime, free, which, env, printenv, run-captured, jq
```
## Security
All security features are inherited from `mcp-shell-server`:
- **Command Whitelisting**: Only commands in `ALLOW_COMMANDS` can be executed
- **Shell Operator Validation**: Commands after `;`, `&&`, `||`, `|` are validated
- **Directory Validation**: Working directory must be absolute and accessible
- **No Shell Injection**: Commands executed directly without shell interpretation
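
To make the whitelisting and operator rules concrete, here is an illustrative sketch of how such a check could work. It is not the `mcp-shell-server` implementation: the function names `parse_allow_commands` and `validate_command` are hypothetical, and the exact rules the real server applies may differ.

```python
from typing import List, Set

# Operators listed in the Security section; the token after each one
# starts a new sub-command that must itself be whitelisted.
SHELL_OPERATORS = {";", "&&", "||", "|"}


def parse_allow_commands(raw: str) -> Set[str]:
    """Split a comma-separated whitelist such as 'ls, cat,grep' into a set."""
    return {name.strip() for name in raw.split(",") if name.strip()}


def validate_command(command: List[str], allowed: Set[str]) -> None:
    """Raise ValueError unless every sub-command starts with an allowed name."""
    if not command:
        raise ValueError("empty command")
    expect_command = True  # True when the next token must be a command name
    for token in command:
        if token in SHELL_OPERATORS:
            if expect_command:
                raise ValueError(f"unexpected operator: {token}")
            expect_command = True
        elif expect_command:
            if token not in allowed:
                raise ValueError(f"command not allowed: {token}")
            expect_command = False
    if expect_command:
        raise ValueError("command ends with an operator")
```

Under this sketch, `["ls", "-la", "|", "grep", "foo"]` passes when both `ls` and `grep` are whitelisted, while `["ls", ";", "shutdown"]` is rejected because `shutdown` follows an operator and is not in the list.
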
The systemd service includes additional hardening:
- `NoNewPrivileges=true`
- `PrivateTmp=true`
- `ProtectSystem=strict`
- `ProtectHome=true`
- `ReadWritePaths=/tmp`
## Usage
### Testing Health Endpoints
```bash
curl http://caliban.incus:22021/health
curl http://caliban.incus:22021/ready
curl http://caliban.incus:22021/live
curl -H "Accept: text/plain" http://caliban.incus:22021/metrics
```
### MCP Client Connection
Connect using any MCP client that supports HTTP transport:
```python
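import asyncio
from fastmcp import Client

async def main() -> None:
    client = Client("http://caliban.incus:22021/mcp")
    async with client:
        # Run `ls -la` in /tmp through the shell_execute tool
        result = await client.call_tool("shell_execute", {
            "command": ["ls", "-la"],
            "directory": "/tmp"
        })
        print(result)

asyncio.run(main())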
```
## Tool: shell_execute
Execute a shell command in a specified directory.
### Parameters
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `command` | `list[str]` | Yes | - | Command and arguments as array |
| `directory` | `str` | No | `/tmp` | Absolute path to working directory |
| `stdin` | `str` | No | `None` | Input to pass to command |
| `timeout` | `int` | No | `None` | Timeout in seconds |
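
A client can check these parameters before sending a request. The sketch below builds the argument dictionary for `shell_execute` from the table above; `build_shell_execute_args` is a hypothetical helper, and omitting `stdin` and `timeout` when they are `None` is a design choice of this example, not documented server behavior.

```python
import posixpath
from typing import Any, Dict, List, Optional


def build_shell_execute_args(
    command: List[str],
    directory: str = "/tmp",
    stdin: Optional[str] = None,
    timeout: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate the inputs client-side and return the tool arguments."""
    if not command or not all(isinstance(part, str) for part in command):
        raise ValueError("command must be a non-empty list of strings")
    # The server requires an absolute working directory (POSIX paths assumed).
    if not posixpath.isabs(directory):
        raise ValueError(f"directory must be an absolute path: {directory!r}")
    if timeout is not None and timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    args: Dict[str, Any] = {"command": list(command), "directory": directory}
    if stdin is not None:
        args["stdin"] = stdin
    if timeout is not None:
        args["timeout"] = timeout
    return args
```

The returned dictionary can be passed as the second argument to `client.call_tool("shell_execute", ...)`.
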
### Response
```json
{
  "stdout": "command output",
  "stderr": "",
  "status": 0,
  "execution_time": 0.123
}
```
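
For callers that receive this response as JSON text, a small typed wrapper can make the fields easier to work with. This is a sketch only: `ShellResult` and `parse_shell_result` are hypothetical names, and reading `status` as a process exit code where `0` means success is an assumption based on the example above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ShellResult:
    stdout: str
    stderr: str
    status: int
    execution_time: float

    @property
    def ok(self) -> bool:
        # Assumption: status is the exit code, with 0 meaning success.
        return self.status == 0


def parse_shell_result(payload: str) -> ShellResult:
    """Parse the JSON response body into a ShellResult."""
    data = json.loads(payload)
    return ShellResult(
        stdout=data["stdout"],
        stderr=data["stderr"],
        status=int(data["status"]),
        execution_time=float(data["execution_time"]),
    )
```
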
## Monitoring
### Prometheus Metrics
The `/metrics` endpoint exposes Prometheus-compatible metrics. Add a scrape job like the following under `scrape_configs` in your Prometheus configuration:
```yaml
  - job_name: 'kernos'
    static_configs:
      - targets: ['caliban.incus:22021']
```
### Service Status
```bash
# Check service status
ssh caliban.incus sudo systemctl status kernos
# View logs
ssh caliban.incus sudo journalctl -u kernos -f
```
## Troubleshooting
### Service Won't Start
1. Check logs: `journalctl -u kernos -n 50`
2. Verify `.env` file exists and has correct permissions
3. Ensure Python venv was created successfully
4. Check that `ALLOW_COMMANDS` is set
### Health Check Failures
1. Verify the service is running: `systemctl status kernos`
2. Check if port 22021 is accessible
3. Review logs for startup errors
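
Step 2 can be checked from any machine with Python installed. The following sketch attempts a plain TCP connection; `port_is_open` is a hypothetical helper name, and a successful connection only shows that something is listening on the port, not that Kernos itself is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `port_is_open("caliban.incus", 22021)` returning `False` points to a firewall, bind address, or service problem rather than an application-level error.
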
### Command Execution Denied
1. Verify the command is in `ALLOW_COMMANDS` whitelist
2. Check that the working directory is absolute and accessible
3. Review logs for security validation errors