# AnythingLLM

## Overview

AnythingLLM is a full-stack application that provides a unified interface for interacting with Large Language Models (LLMs). It supports multi-provider LLM access, document intelligence (RAG with pgvector), AI agents with tools, and Model Context Protocol (MCP) extensions.

**Host:** Rosalind  
**Role:** go_nodejs_php_apps  
**Port:** 22084 (internal), accessible via `anythingllm.ouranos.helu.ca` (HAProxy)

## Architecture

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│     Client      │────▶│     HAProxy     │────▶│   AnythingLLM   │
│  (Browser/API)  │     │    (Titania)    │     │   (Rosalind)    │
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
                        ┌────────────────────────────────┼────────────────────────────────┐
                        │                                │                                │
                        ▼                                ▼                                ▼
               ┌─────────────────┐              ┌─────────────────┐              ┌─────────────────┐
               │   PostgreSQL    │              │   LLM Backend   │              │   TTS Service   │
               │   + pgvector    │              │  (pan.helu.ca)  │              │  (FastKokoro)   │
               │    (Portia)     │              │    llama-cpp    │              │   pan.helu.ca   │
               └─────────────────┘              └─────────────────┘              └─────────────────┘
```

### Directory Structure

AnythingLLM uses a native Node.js deployment with the following directory layout:

```
/srv/anythingllm/
├── app/                          # Cloned git repository
│   ├── server/                   # Backend API server
│   │   ├── .env                  # Environment configuration
│   │   └── node_modules/
│   ├── collector/                # Document processing service
│   │   ├── hotdir -> /srv/anythingllm/hotdir   # SYMLINK (critical!)
│   │   └── node_modules/
│   └── frontend/                 # React frontend (built into server)
├── storage/                      # Persistent data
│   ├── documents/                # Processed documents
│   ├── vector-cache/             # Embedding cache
│   └── plugins/                  # MCP server configs
└── hotdir/                       # Upload staging directory (actual location)

/srv/collector/
└── hotdir -> /srv/anythingllm/hotdir           # SYMLINK (critical!)
```

### Hotdir Path Resolution (Critical)

The server and collector use **different path resolution** for the upload directory:

| Component | Code Location | Resolves To |
|-----------|--------------|-------------|
| **Server** (multer) | `STORAGE_DIR/../../collector/hotdir` | `/srv/collector/hotdir` |
| **Collector** | `__dirname/../hotdir` | `/srv/anythingllm/app/collector/hotdir` |

Both paths must point to the same physical directory. This is achieved with **two symlinks**:

1. `/srv/collector/hotdir` → `/srv/anythingllm/hotdir`
2. `/srv/anythingllm/app/collector/hotdir` → `/srv/anythingllm/hotdir`

⚠️ **Important**: The collector ships with an empty `hotdir/` directory. The Ansible deploy must **remove** this directory before creating the symlink, or file uploads will fail with "File does not exist in upload directory."

### Key Integrations

| Component | Host | Purpose |
|-----------|------|---------|
| PostgreSQL + pgvector | Portia | Vector database for RAG embeddings |
| LLM Provider | pan.helu.ca:22071 | Generic OpenAI-compatible llama-cpp |
| TTS Service | pan.helu.ca:22070 | FastKokoro text-to-speech |
| HAProxy | Titania | TLS termination and routing |
| Loki | Prospero | Log aggregation |

## Terraform Resources

### Host Definition

AnythingLLM runs on **Rosalind**, which is already defined in `terraform/containers.tf`:

| Attribute | Value |
|-----------|-------|
| Image | noble |
| Role | go_nodejs_php_apps |
| Security Nesting | true |
| AppArmor | unconfined |
| Port Range | 22080-22099 |

No Terraform changes are required: AnythingLLM uses port 22084, which falls within Rosalind's existing range.

## Ansible Deployment

### Playbook

```bash
cd ansible
source ~/env/ouranos/bin/activate

# Deploy PostgreSQL database first (if not already done)
ansible-playbook postgresql/deploy.yml

# Deploy AnythingLLM
ansible-playbook anythingllm/deploy.yml

# Redeploy HAProxy to pick up new backend
ansible-playbook haproxy/deploy.yml

# Redeploy Alloy to pick up new log source
ansible-playbook alloy/deploy.yml
```

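The playbooks are listed in dependency order, so a later step is only useful if the earlier ones succeeded. A minimal sketch of a wrapper that runs them in sequence and stops at the first failure; the function name `deploy_all` is hypothetical, and it assumes the current directory is `ansible/` with the virtualenv already activated.

```bash
# Hypothetical helper: run the four playbooks in dependency order.
# Assumes the current directory is ansible/ and the virtualenv is active.
deploy_all() {
  local playbook
  for playbook in \
    postgresql/deploy.yml \
    anythingllm/deploy.yml \
    haproxy/deploy.yml \
    alloy/deploy.yml
  do
    echo "==> $playbook"
    # Stop immediately so HAProxy and Alloy are not reconfigured for a
    # service that failed to deploy.
    ansible-playbook "$playbook" || {
      echo "failed: $playbook" >&2
      return 1
    }
  done
}
```

Calling `deploy_all` after sourcing the snippet behaves like the manual sequence, except that a failed playbook ends the run with a non-zero status.
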
### Files

| File | Purpose |
|------|---------|
| `anythingllm/deploy.yml` | Main deployment playbook |
| `anythingllm/anythingllm-server.service.j2` | Systemd service for server |
| `anythingllm/anythingllm-collector.service.j2` | Systemd service for collector |
| `anythingllm/env.j2` | Environment variables template |

### Variables

#### Host Variables (`host_vars/rosalind.incus.yml`)

| Variable | Description | Default |
|----------|-------------|---------|
| `anythingllm_user` | Service account user | `anythingllm` |
| `anythingllm_group` | Service account group | `anythingllm` |
| `anythingllm_directory` | Installation directory | `/srv/anythingllm` |
| `anythingllm_port` | Service port | `22084` |
| `anythingllm_db_host` | PostgreSQL host | `portia.incus` |
| `anythingllm_db_port` | PostgreSQL port | `5432` |
| `anythingllm_db_name` | Database name | `anythingllm` |
| `anythingllm_db_user` | Database user | `anythingllm` |
| `anythingllm_llm_base_url` | LLM API endpoint | `http://pan.helu.ca:22071/v1` |
| `anythingllm_llm_model` | Default LLM model | `llama-3-8b` |
| `anythingllm_embedding_engine` | Embedding engine | `native` |
| `anythingllm_tts_provider` | TTS provider | `openai` |
| `anythingllm_tts_endpoint` | TTS API endpoint | `http://pan.helu.ca:22070/v1` |

#### Vault Variables (`group_vars/all/vault.yml`)

| Variable | Description |
|----------|-------------|
| `vault_anythingllm_db_password` | PostgreSQL password |
| `vault_anythingllm_jwt_secret` | JWT signing secret (32+ chars) |
| `vault_anythingllm_sig_key` | Signature key (32+ chars) |
| `vault_anythingllm_sig_salt` | Signature salt (32+ chars) |

Generate secrets with:
```bash
openssl rand -hex 32
```

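The three application secrets can be generated in one pass. The sketch below prints them as YAML lines using the vault variable names from the table; `gen_secret` and `print_vault_secrets` are hypothetical names. The database password is left out on the assumption that it has to match whatever the PostgreSQL deployment uses.

```bash
# Hypothetical helper: one 64-character hex secret (32 random bytes).
gen_secret() {
  openssl rand -hex 32
}

# Print vault entries for the three application secrets.
print_vault_secrets() {
  local name
  for name in jwt_secret sig_key sig_salt; do
    printf 'vault_anythingllm_%s: "%s"\n' "$name" "$(gen_secret)"
  done
}
```

Running `print_vault_secrets` writes three lines to the terminal that can be pasted into the vault file while it is open in `ansible-vault edit`. Each value is 64 hex characters, which satisfies the 32+ character requirement.
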
## Configuration

### Environment Variables

| Variable | Description | Source or Value |
|----------|-------------|-----------------|
| `JWT_SECRET` | JWT signing secret | `vault_anythingllm_jwt_secret` |
| `SIG_KEY` | Signature key | `vault_anythingllm_sig_key` |
| `SIG_SALT` | Signature salt | `vault_anythingllm_sig_salt` |
| `VECTOR_DB` | Vector database type | `pgvector` |
| `PGVECTOR_CONNECTION_STRING` | PostgreSQL connection | Composed from host_vars |
| `LLM_PROVIDER` | LLM provider type | `generic-openai` |
| `EMBEDDING_ENGINE` | Embedding engine | `native` |
| `TTS_PROVIDER` | TTS provider | `openai` |

### External Access

AnythingLLM is accessible via HAProxy on Titania:

| URL | Backend |
|-----|---------|
| `https://anythingllm.ouranos.helu.ca` | `rosalind.incus:22084` |

The HAProxy backend is configured in `host_vars/titania.incus.yml`.

## Monitoring

### Loki Logs

| Log Source | Labels |
|------------|--------|
| Server logs | `{unit="anythingllm-server.service"}` |
| Collector logs | `{unit="anythingllm-collector.service"}` |

Logs are collected via systemd journal → Alloy on Rosalind → Loki on Prospero.

**Grafana Query:**
```logql
{unit=~"anythingllm.*"}
```

### Health Check

```bash
# From any sandbox host
curl http://rosalind.incus:22084/api/ping

# Via HAProxy (external); -k skips TLS certificate verification
curl -k https://anythingllm.ouranos.helu.ca/api/ping
```

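Right after a deploy or restart the server may take a few seconds to answer, so a single `curl` can fail spuriously. A minimal sketch of a retry loop around the same `/api/ping` endpoint; `wait_for_ping` and its defaults (10 attempts, 3 seconds apart) are illustrative assumptions, not part of the deployment.

```bash
# Hypothetical helper: poll a URL until it answers with a 2xx status.
# Usage: wait_for_ping URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_ping() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail turns HTTP errors (4xx/5xx) into a non-zero exit status.
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      echo "up after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "no response from $url after $attempts attempt(s)" >&2
  return 1
}
```

For example, `wait_for_ping http://rosalind.incus:22084/api/ping` returns 0 as soon as the server responds and 1 if it never does, which makes it usable as a gate in a post-deploy script.
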
## Operations

### Start/Stop

```bash
# SSH to Rosalind
ssh rosalind.incus

# Manage via systemd
sudo systemctl start anythingllm-server        # Start server
sudo systemctl start anythingllm-collector     # Start collector
sudo systemctl stop anythingllm-server         # Stop server
sudo systemctl stop anythingllm-collector      # Stop collector
sudo systemctl restart anythingllm-server      # Restart server
sudo systemctl restart anythingllm-collector   # Restart collector
```

### Logs

```bash
# Real-time server logs
journalctl -u anythingllm-server -f

# Real-time collector logs
journalctl -u anythingllm-collector -f

# Grafana (historical)
# Query: {unit=~"anythingllm.*"}
```

### Upgrade

To pull the latest code and redeploy, rerun the deployment playbook:

```bash
ansible-playbook anythingllm/deploy.yml
```

## Vault Setup

Add the following secrets to `ansible/inventory/group_vars/all/vault.yml`:

```bash
ansible-vault edit ansible/inventory/group_vars/all/vault.yml
```

```yaml
# AnythingLLM Secrets
vault_anythingllm_db_password: "your-secure-password"
vault_anythingllm_jwt_secret: "your-32-char-jwt-secret"
vault_anythingllm_sig_key: "your-32-char-signature-key"
vault_anythingllm_sig_salt: "your-32-char-signature-salt"
```

## Follow-On Tasks

### MCP Server Integration

AnythingLLM supports Model Context Protocol (MCP) for extending AI agent capabilities. Planned integrations with existing MCP servers:

| MCP Server | Host | Tools |
|------------|------|-------|
| MCPO | Miranda | Docker management |
| Neo4j MCP | Miranda | Graph database queries |
| GitHub MCP | (external) | Repository operations |

Configure MCP connections via the AnythingLLM Admin UI after initial deployment.

### Casdoor SSO

For single sign-on integration, configure AnythingLLM to authenticate via Casdoor OAuth2. This requires:

1. Creating an application in Casdoor admin
2. Configuring OAuth2 environment variables in AnythingLLM
3. Optionally using OAuth2-Proxy for transparent authentication

## Troubleshooting

### File Upload Fails with "File does not exist in upload directory"

**Symptom:** Uploading files via the UI returns 500 Internal Server Error with message "File does not exist in upload directory."

**Cause:** The server uploads files to `/srv/collector/hotdir`, but the collector looks for them in `/srv/anythingllm/app/collector/hotdir`. If these aren't the same physical directory, uploads fail.

**Solution:** Verify symlinks are correctly configured:

```bash
# Check symlinks
ls -la /srv/collector/hotdir
# Should show: /srv/collector/hotdir -> /srv/anythingllm/hotdir

ls -la /srv/anythingllm/app/collector/hotdir
# Should show: /srv/anythingllm/app/collector/hotdir -> /srv/anythingllm/hotdir

# If collector/hotdir is a directory (not symlink), fix it:
sudo rm -rf /srv/anythingllm/app/collector/hotdir
sudo ln -s /srv/anythingllm/hotdir /srv/anythingllm/app/collector/hotdir
sudo chown -h anythingllm:anythingllm /srv/anythingllm/app/collector/hotdir
sudo systemctl restart anythingllm-collector
```

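The two-symlink arrangement can be rehearsed in a scratch directory before touching the host. The sketch below rebuilds the layout under a throwaway root and checks that the server-side path (`STORAGE_DIR/../../collector/hotdir`) and the collector-side path resolve to the same physical directory. The function names `build_hotdir_layout` and `same_hotdir` are hypothetical, and the sketch assumes `STORAGE_DIR` is `/srv/anythingllm/storage`, as the path table implies.

```bash
# Hypothetical helper: recreate the hotdir layout under a scratch root.
build_hotdir_layout() {
  local root="$1"
  mkdir -p "$root/srv/anythingllm/hotdir" \
           "$root/srv/anythingllm/storage" \
           "$root/srv/anythingllm/app/collector/hotdir" \
           "$root/srv/collector"
  # The collector ships a real hotdir/ directory; remove it first, otherwise
  # ln would place the new link inside that directory instead of replacing it.
  rm -rf "$root/srv/anythingllm/app/collector/hotdir"
  ln -s "$root/srv/anythingllm/hotdir" "$root/srv/anythingllm/app/collector/hotdir"
  ln -s "$root/srv/anythingllm/hotdir" "$root/srv/collector/hotdir"
}

# Hypothetical helper: succeed only if both code paths reach one directory.
same_hotdir() {
  local root="$1" server collector
  server="$(readlink -f "$root/srv/anythingllm/storage/../../collector/hotdir")" || return 1
  collector="$(readlink -f "$root/srv/anythingllm/app/collector/hotdir")" || return 1
  [ -n "$server" ] && [ "$server" = "$collector" ]
}
```

With an empty root argument, `same_hotdir ""` runs the same comparison against the real `/srv` paths on Rosalind and exits non-zero when the two locations diverge.
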
### Service Won't Start

AnythingLLM runs as native systemd services, so check the unit status and recent journal entries:
```bash
systemctl status anythingllm-server anythingllm-collector
journalctl -u anythingllm-server -n 100 --no-pager
```

Verify PostgreSQL connectivity:
```bash
psql -h portia.incus -U anythingllm -d anythingllm
```

### Database Connection Issues

Ensure pgvector extension is enabled:
```bash
psql -h portia.incus -U postgres -d anythingllm -c "SELECT * FROM pg_extension WHERE extname = 'vector';"
```

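`PGVECTOR_CONNECTION_STRING` is composed from the host variables, so it can help to rebuild it by hand and test it directly. The exact format used by `env.j2` is not shown here; the sketch below assumes the standard `postgresql://user:password@host:port/db` URI form, and `pg_url` is a hypothetical helper whose defaults mirror the host variable table.

```bash
# Hypothetical helper: build a PostgreSQL URI from the host variables.
# Usage: pg_url PASSWORD [USER] [HOST] [PORT] [DB]
# The password must not contain characters that need URL-encoding
# (hex output from `openssl rand -hex` is safe).
pg_url() {
  local password="$1"
  local user="${2:-anythingllm}"
  local host="${3:-portia.incus}"
  local port="${4:-5432}"
  local db="${5:-anythingllm}"
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$user" "$password" "$host" "$port" "$db"
}
```

Since `psql` accepts a connection URI in place of a database name, `psql "$(pg_url "$DB_PASSWORD")" -c 'SELECT 1;'` exercises the composed URI end to end (`DB_PASSWORD` is a placeholder for the vault value).
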
### LLM Provider Issues

Test LLM endpoint directly:
```bash
curl http://pan.helu.ca:22071/v1/models
```

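A reachable endpoint is not enough if the configured model is not being served. A minimal sketch that checks whether a model id appears in the `/v1/models` response, assuming the OpenAI-style shape in which each entry carries an `"id"` field; `model_listed` is a hypothetical helper, and it avoids `jq` so it runs on a bare host.

```bash
# Hypothetical helper: succeed if the model id appears in an
# OpenAI-style /v1/models response read from stdin.
model_listed() {
  local model="$1"
  # Strip whitespace so '"id": "x"' and '"id":"x"' compare equal, then
  # match the id as a fixed string (no regex interpretation).
  tr -d '[:space:]' | grep -qF "\"id\":\"$model\""
}
```

For example, `curl -s http://pan.helu.ca:22071/v1/models | model_listed llama-3-8b && echo ok` prints `ok` only when the default model from the host variables is listed.
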