docs: rewrite README with structured overview and quick start guide
Replaces the minimal project description with a comprehensive README including a component overview table, quick start instructions, common Ansible operations, and links to detailed documentation. Aligns with Red Panda Approval™ standards.
184
docs/_template.md
Normal file
@@ -0,0 +1,184 @@
# Service Documentation Template

This is a template for documenting services deployed in the Agathos sandbox. Copy this file and replace placeholders with service-specific information.

---
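The copy-and-replace step described at the top of this file can be partly scripted. The following is a minimal sketch, not part of the repository: `render_template` is a hypothetical helper that fills only the three most frequent placeholders (`{Service Name}`, `{service}`, `{hostname}`), and the example names `Grafana`, `grafana`, and `docs/grafana.md` are illustrative assumptions. All other placeholders still need manual editing.

```bash
# Hypothetical helper: fill the most common placeholders in a copy of this template.
# Arguments: <display name> <service> <hostname>; reads the template on stdin.
render_template() {
  # '|' is the sed delimiter, so the values must not contain '|', '&', or '\'.
  sed -e "s|{Service Name}|$1|g" \
      -e "s|{service}|$2|g" \
      -e "s|{hostname}|$3|g"
}

# Example (illustrative names):
#   render_template "Grafana" grafana prospero < docs/_template.md > docs/grafana.md
```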

# {Service Name}

## Overview

Brief description of the service, its purpose, and role in the infrastructure.

**Host:** {hostname} (e.g., oberon, miranda, prospero)
**Role:** {role from Terraform} (e.g., container_orchestration, observability)
**Port Range:** {exposed ports} (e.g., 25580-25599)

## Architecture

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Client    │────▶│   Service   │────▶│  Database   │
└─────────────┘     └─────────────┘     └─────────────┘
```

Describe the service architecture, data flow, and integration points.

## Terraform Resources

### Host Definition

The service runs on `{hostname}`, defined in `terraform/containers.tf`:

| Attribute | Value |
|-----------|-------|
| Image | {noble/plucky/questing} |
| Role | {terraform role} |
| Security Nesting | {true/false} |
| Proxy Devices | {port mappings} |

### Dependencies

| Resource | Relationship |
|----------|--------------|
| {other host} | {description of dependency} |

## Ansible Deployment

### Playbook

```bash
cd ansible
ansible-playbook {service}/deploy.yml
```
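Before applying a playbook, it can help to preview what it would change. The sketch below wraps the command above in two hypothetical helpers, `run` and `deploy_service` (neither exists in the repository). `--check`, `--diff`, and `--limit` are standard `ansible-playbook` options, and setting `DRY_RUN=1` only prints the command instead of running it.

```bash
# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# deploy_service <service> [extra ansible-playbook arguments...]
# Call it from the ansible/ directory, like the command above.
deploy_service() {
  service="$1"
  shift
  run ansible-playbook "${service}/deploy.yml" "$@"
}

# Preview changes without applying them:
#   deploy_service {service} --check --diff
# Restrict the run to a single host:
#   deploy_service {service} --limit {hostname}
```

Note that tasks which depend on the results of earlier tasks may not report accurately in check mode, so treat the preview as a guide rather than a guarantee.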

### Files

| File | Purpose |
|------|---------|
| `{service}/deploy.yml` | Main deployment playbook |
| `{service}/*.j2` | Jinja2 templates |

### Variables

#### Group Variables (`group_vars/all/main.yml`)

| Variable | Description | Default |
|----------|-------------|---------|
| `{service}_version` | Version to deploy | `latest` |

#### Host Variables (`host_vars/{hostname}.yml`)

| Variable | Description |
|----------|-------------|
| `{service}_port` | Service port |
| `{service}_data_dir` | Data directory |

#### Vault Variables (`group_vars/all/vault.yml`)

| Variable | Description |
|----------|-------------|
| `vault_{service}_password` | Service password |
| `vault_{service}_api_key` | API key (if applicable) |

## Configuration

### Environment Variables

| Variable | Description | Source |
|----------|-------------|--------|
| `{VAR_NAME}` | Description | `{{ vault_{service}_var }}` |

### Configuration Files

| File | Location | Template |
|------|----------|----------|
| `config.yml` | `/etc/{service}/` | `{service}/config.yml.j2` |

## Monitoring

### Prometheus Metrics

| Metric | Description |
|--------|-------------|
| `{service}_requests_total` | Total requests |
| `{service}_errors_total` | Total errors |

**Scrape Target:** Configured in `ansible/prometheus/` or via Alloy.
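To confirm that a service actually exposes the metrics listed above, the samples can be filtered out of its metrics output. This is a sketch under assumptions: `metric_samples` is a hypothetical helper, and the `/metrics` path in the usage comment is the common convention rather than something this template specifies.

```bash
# metric_samples <metric_name>
# Reads Prometheus text exposition format on stdin and prints only the sample
# lines for the given metric, skipping "# HELP" and "# TYPE" comment lines.
metric_samples() {
  # The name must be followed by '{' (labels) or a space (value), so
  # "foo_total" does not also match "foo_total_bytes".
  grep -E "^$1(\{| )"
}

# Example (assumes the service serves metrics at /metrics):
#   curl -fsS "http://{hostname}.incus:{port}/metrics" | metric_samples "{service}_requests_total"
```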

### Loki Logs

| Log Source | Labels |
|------------|--------|
| Application log | `{job="{service}", host="{hostname}"}` |
| Access log | `{job="{service}_access", host="{hostname}"}` |

**Collection:** Alloy agent on host ships logs to Loki on Prospero.

### Grafana Dashboard

Dashboard provisioned at: `ansible/grafana/dashboards/{service}.json`

## Operations

### Start/Stop

```bash
# Via systemd (if applicable)
sudo systemctl start {service}
sudo systemctl stop {service}

# Via Docker (if applicable)
docker compose -f /opt/{service}/docker-compose.yml up -d
docker compose -f /opt/{service}/docker-compose.yml down
```

### Health Check

```bash
curl http://{hostname}.incus:{port}/health
```
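A service may need a few seconds after a start or deploy before its health endpoint responds, so a single `curl` can fail spuriously. The sketch below polls instead. `retry` is a hypothetical helper, not something the repository provides, and the attempt count and delay in the usage comment are arbitrary example values.

```bash
# retry <attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after ${attempts} attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example: up to 10 attempts, 3 seconds apart.
# -f makes curl exit non-zero on HTTP errors; --max-time bounds each request.
#   retry 10 3 curl -fsS --max-time 5 "http://{hostname}.incus:{port}/health"
```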

### Logs

```bash
# Systemd
journalctl -u {service} -f

# Docker
docker logs -f {container_name}

# Loki (LogQL query to run in Grafana Explore, not a shell command)
# {job="{service}"}
```

### Backup

Describe backup procedures, scripts, and schedules.

### Restore

Describe restore procedures and verification steps.

## Troubleshooting

### Common Issues

| Symptom | Cause | Resolution |
|---------|-------|------------|
| Service won't start | Missing config | Check `{config_file}` exists |
| Connection refused | Firewall/proxy | Verify Incus proxy device |
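
The first check in the table can be scripted as a quick triage step. This is a minimal sketch: `check_config` is a hypothetical helper, and the `incus config device show` line in the comment is one way to inspect the proxy devices attached to an instance, assuming the Incus CLI is available on the machine where it runs.

```bash
# check_config <config_file>
# Succeeds if the file exists and is readable; otherwise reports the problem on stderr.
check_config() {
  if [ -r "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or unreadable: $1" >&2
    return 1
  fi
}

# Examples:
#   check_config {config_file}
# For "Connection refused", list the instance's devices and look for the proxy entry:
#   incus config device show {hostname}
```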

### Debug Mode

```bash
# Enable debug logging
{service} --debug
```

## References

- Official Documentation: {url}
- [Terraform Practices](../terraform.md)
- [Ansible Practices](../ansible.md)
- [Sandbox Overview](../sandbox.html)