AnythingLLM: Your AI-Powered Knowledge Hub

🎯 What is AnythingLLM?

AnythingLLM is a full-stack application that transforms how you interact with Large Language Models (LLMs). Think of it as your personal AI assistant platform that can:

  • 💬 Chat with multiple LLM providers
  • 📚 Query your own documents and data (RAG - Retrieval Augmented Generation)
  • 🤖 Run autonomous AI agents with tools
  • 🔌 Extend capabilities via Model Context Protocol (MCP)
  • 👥 Support multiple users and workspaces
  • 🎨 Provide a beautiful, intuitive web interface

In simple terms: It's like ChatGPT, but you control everything - the data, the models, the privacy, and the capabilities.


🌟 Key Capabilities

1. Multi-Provider LLM Support

AnythingLLM isn't locked to a single AI provider. It supports 30+ LLM providers:

Your Environment:

┌─────────────────────────────────────────┐
│  Your LLM Infrastructure                │
├─────────────────────────────────────────┤
│  ✅ Llama CPP Router (pan.helu.ca)      │
│     - Load-balanced inference           │
│     - High availability                 │
│                                         │
│  ✅ Direct Llama CPP (nyx.helu.ca)      │
│     - Direct connection option          │
│     - Lower latency                     │
│                                         │
│  ✅ LLM Proxy - Arke (circe.helu.ca)    │
│     - Unified API gateway               │
│     - Request routing                   │
│                                         │
│  ✅ AWS Bedrock (optional)              │
│     - Claude, Titan models              │
│     - Enterprise-grade                  │
└─────────────────────────────────────────┘

What this means:

  • Switch between providers without changing your application
  • Use different models for different workspaces
  • Fallback to alternative providers if one fails
  • Compare model performance side-by-side
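In practice, provider selection is driven by environment variables in your deployment's `.env` file. A rough sketch of what switching to a generic OpenAI-compatible endpoint looks like (variable names are illustrative, not authoritative - the bundled `.env.example` lists the exact keys for your version):

```
# Route chats through the load-balanced router by default
LLM_PROVIDER=generic-openai
GENERIC_OPEN_AI_BASE_PATH=https://pan.helu.ca/v1
GENERIC_OPEN_AI_MODEL_PREF=llama-3.1-8b-instruct
```

Individual workspaces can then override the system default from the web UI, which is how one deployment serves several providers at once.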

2. Document Intelligence (RAG)

AnythingLLM can ingest and understand your documents:

Supported Formats:

  • 📄 PDF, DOCX, TXT, MD
  • 🌐 Websites (scraping)
  • 📊 CSV, JSON
  • 🎥 YouTube transcripts
  • 🔗 GitHub repositories
  • 📝 Confluence, Notion exports

How it works:

Your Document → Text Extraction → Chunking → Embeddings → Vector DB (PostgreSQL)
                                                                    ↓
User Question → Embedding → Similarity Search → Relevant Chunks → LLM → Answer
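The retrieval half of that pipeline can be sketched in a few lines: every chunk gets an embedding, and the question is ranked against them by cosine similarity. Here `embed()` is a toy bag-of-words stand-in for a real embedding model (AnythingLLM ships a local one), used only to make the ranking step concrete:

```python
# Sketch of RAG retrieval: rank chunks by cosine similarity to the question.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. Real systems use a neural embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

chunks = [
    "Refunds are available within 30 days of purchase.",
    "Our office is open Monday through Friday.",
    "Shipping takes 5 to 7 business days.",
]
question = "Are refunds available, and for how long?"

q = embed(question)
ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
print(ranked[0])  # the top-ranked chunk is handed to the LLM as context
```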

Example Use Case:

You: "What's our refund policy?"
AnythingLLM: [Searches your policy documents]
             "According to your Terms of Service (page 12),
              refunds are available within 30 days..."

3. AI Agents with Tools 🤖

This is where AnythingLLM becomes truly powerful. Agents can:

Built-in Agent Tools:

  • 🌐 Web Browsing - Navigate websites, fill forms, take screenshots
  • 🔍 Web Scraping - Extract data from web pages
  • 📊 SQL Agent - Query databases (PostgreSQL, MySQL, MSSQL)
  • 📈 Chart Generation - Create visualizations
  • 💾 File Operations - Save and manage files
  • 📝 Document Summarization - Condense long documents
  • 🧠 Memory - Remember context across conversations

Agent Workflow Example:

User: "Check our database for users who signed up last week 
       and send them a welcome email"

Agent:
  1. Uses SQL Agent to query PostgreSQL
  2. Retrieves user list
  3. Generates personalized email content
  4. (With email MCP) Sends emails
  5. Reports back with results
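The loop behind a workflow like that is simple at its core: the model picks a tool, the runtime executes it, and the result feeds the next step. A stripped-down sketch, where `TOOLS` stands in for the registered agent tools and the plan is hard-coded rather than derived from an LLM:

```python
# Minimal agent tool-dispatch loop. In AnythingLLM the plan comes from the
# LLM's tool-call decisions; here it's fixed so the flow is easy to follow.

def query_signups_last_week():
    # stand-in for the SQL Agent step
    return ["alice@example.com", "bob@example.com"]

def send_welcome_email(address):
    # stand-in for an email MCP tool
    return f"sent welcome email to {address}"

TOOLS = {
    "query_signups_last_week": query_signups_last_week,
    "send_welcome_email": send_welcome_email,
}

def run_agent(plan):
    # Execute each (tool_name, args) step and collect the results.
    return [TOOLS[name](*args) for name, args in plan]

users = run_agent([("query_signups_last_week", ())])[0]
report = run_agent([("send_welcome_email", (u,)) for u in users])
print(report)
```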

4. Model Context Protocol (MCP) 🔌

MCP is AnythingLLM's superpower - it allows you to extend the AI with custom tools and data sources.

What is MCP?

MCP is a standardized protocol for connecting AI systems to external tools and data. Think of it as "plugins for AI."
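Under the hood, MCP is JSON-RPC 2.0: a client discovers a server's tools with a `tools/list` request and invokes one with `tools/call`. A rough sketch of the two message shapes (method names per the MCP specification; exact payload fields vary by protocol version):

```python
# The two core MCP requests a client sends to a tool server.
import json

# Discover what tools the server exposes.
list_tools = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Invoke one tool by name, with arguments matching its declared input schema.
call_tool = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "list_containers", "arguments": {}},
}

print(json.dumps(call_tool))
```

AnythingLLM's MCP Hypervisor handles this exchange for you over stdio, HTTP, or SSE; you never write these messages by hand.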

Your MCP Possibilities:

Example 1: Docker Management

// MCP Server: docker-mcp
Tools Available:
  - list_containers()
  - start_container(name)
  - stop_container(name)
  - view_logs(container)
  - exec_command(container, command)

User: "Show me all running containers and restart the one using most memory"
Agent: [Uses docker-mcp tools to check, analyze, and restart]

Example 2: GitHub Integration

// MCP Server: github-mcp
Tools Available:
  - create_issue(repo, title, body)
  - search_code(query)
  - create_pr(repo, branch, title)
  - list_repos()

User: "Create a GitHub issue for the bug I just described"
Agent: [Uses github-mcp to create issue with details]

Example 3: Custom Business Tools

// Your Custom MCP Server
Tools Available:
  - query_crm(customer_id)
  - check_inventory(product_sku)
  - create_order(customer, items)
  - send_notification(user, message)

User: "Check if we have product XYZ in stock and notify me if it's low"
Agent: [Uses your custom MCP tools]

MCP Architecture in AnythingLLM:

┌─────────────────────────────────────────────────────────┐
│  AnythingLLM                                            │
│  ┌─────────────────────────────────────────────────┐   │
│  │  Agent System                                    │   │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐      │   │
│  │  │ Built-in │  │   MCP    │  │  Custom  │      │   │
│  │  │  Tools   │  │  Tools   │  │  Flows   │      │   │
│  │  └──────────┘  └──────────┘  └──────────┘      │   │
│  └─────────────────────────────────────────────────┘   │
│                          ↓                              │
│  ┌─────────────────────────────────────────────────┐   │
│  │  MCP Hypervisor                                  │   │
│  │  - Manages MCP server lifecycle                  │   │
│  │  - Handles stdio/http/sse transports             │   │
│  │  - Auto-discovers tools                          │   │
│  └─────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│  MCP Servers (Running Locally or Remote)                │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐             │
│  │  Docker  │  │  GitHub  │  │  Custom  │             │
│  │   MCP    │  │   MCP    │  │   MCP    │             │
│  └──────────┘  └──────────┘  └──────────┘             │
└─────────────────────────────────────────────────────────┘

Key Features:

  • Hot-reload - Add/remove MCP servers without restarting
  • Multiple transports - stdio, HTTP, Server-Sent Events
  • Auto-discovery - Tools automatically appear in agent
  • Process management - Automatic start/stop/restart
  • Error handling - Graceful failures with logging

5. Agent Flows 🔄

Create no-code agent workflows for complex tasks:

┌─────────────────────────────────────────┐
│  Example Flow: "Daily Report Generator" │
├─────────────────────────────────────────┤
│  1. Query database for yesterday's data │
│  2. Generate summary statistics         │
│  3. Create visualization charts         │
│  4. Write report to document            │
│  5. Send via email (MCP)                │
└─────────────────────────────────────────┘

Flows can be:

  • Triggered manually
  • Scheduled (via external cron)
  • Called from other agents
  • Shared across workspaces

🏗️ How AnythingLLM Fits Your Environment

Your Complete Stack:

┌─────────────────────────────────────────────────────────────────┐
│  Internet                                                        │
└────────────────────────────┬────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────────────┐
│  HAProxy (SSL Termination & Load Balancing)                     │
│  - HTTPS/WSS support                                            │
│  - Security headers                                             │
│  - Health checks                                                │
└────────────────────────────┬────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────────────┐
│  AnythingLLM Application                                        │
│  ┌─────────────────┐  ┌─────────────────┐  ┌────────────────┐ │
│  │   Web UI        │  │   API Server    │  │  Agent Engine  │ │
│  │   - React       │  │   - Express.js  │  │  - AIbitat     │ │
│  │   - WebSocket   │  │   - REST API    │  │  - MCP Support │ │
│  └─────────────────┘  └─────────────────┘  └────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────────────┐
│  Data Layer                                                      │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  PostgreSQL 17 + pgvector                                 │  │
│  │  - User data & workspaces                                 │  │
│  │  - Chat history                                           │  │
│  │  - Vector embeddings (for RAG)                            │  │
│  │  - Agent invocations                                      │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────────────┐
│  External LLM Services                                          │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │
│  │ Llama Router │  │ Direct Llama │  │  LLM Proxy   │         │
│  │ pan.helu.ca  │  │ nyx.helu.ca  │  │ circe.helu.ca│         │
│  └──────────────┘  └──────────────┘  └──────────────┘         │
└─────────────────────────────────────────────────────────────────┘
                             ↓
┌─────────────────────────────────────────────────────────────────┐
│  TTS Service                                                     │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  FastKokoro (OpenAI-compatible TTS)                       │  │
│  │  pan.helu.ca:22070                                        │  │
│  │  - Text-to-speech generation                              │  │
│  │  - Multiple voices                                        │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Observability Stack:

┌─────────────────────────────────────────────────────────────────┐
│  Monitoring & Logging                                           │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  Grafana (Unified Dashboard)                              │  │
│  │  - Metrics visualization                                  │  │
│  │  - Log exploration                                        │  │
│  │  - Alerting                                               │  │
│  └────────────┬─────────────────────────────┬────────────────┘  │
│               ↓                             ↓                   │
│  ┌────────────────────────┐   ┌────────────────────────┐       │
│  │  Prometheus            │   │  Loki                  │       │
│  │  - Metrics storage     │   │  - Log aggregation     │       │
│  │  - Alert rules         │   │  - 31-day retention    │       │
│  │  - 30-day retention    │   │  - Query language      │       │
│  └────────────────────────┘   └────────────────────────┘       │
│               ↑                             ↑                   │
│  ┌────────────┴─────────────────────────────┴────────────────┐ │
│  │  Data Collection                                           │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │ │
│  │  │  cAdvisor    │  │ Postgres     │  │  Alloy       │    │ │
│  │  │  (Container) │  │ Exporter     │  │  (Logs)      │    │ │
│  │  └──────────────┘  └──────────────┘  └──────────────┘    │ │
│  └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

🎨 Real-World Use Cases

Use Case 1: Internal Knowledge Base

Scenario: Your team needs quick access to company documentation

Setup:

  1. Upload all company docs to AnythingLLM workspace
  2. Documents are embedded and stored in PostgreSQL
  3. Team members ask questions naturally

Example:

Employee: "What's the process for requesting time off?"
AnythingLLM: [Searches HR documents]
             "According to the Employee Handbook, you need to:
              1. Submit request via HR portal
              2. Get manager approval
              3. Minimum 2 weeks notice for vacations..."

Benefits:

  • No more searching through SharePoint
  • Instant answers with source citations
  • Always up-to-date (re-sync documents)
  • Multi-user access with permissions

Use Case 2: DevOps Assistant

Scenario: Manage infrastructure with natural language

Setup:

  1. Install Docker MCP server
  2. Install GitHub MCP server
  3. Connect to your monitoring stack

Example Conversation:

You: "Show me all containers and their resource usage"
Agent: [Uses docker-mcp + Prometheus data]
       "Here are your containers:
        - anythingllm: 2.1GB RAM, 45% CPU
        - postgres: 1.8GB RAM, 12% CPU
        - prometheus: 1.2GB RAM, 8% CPU
        
        anythingllm is using high CPU. Would you like me to investigate?"

You: "Yes, check the logs for errors"
Agent: [Uses docker-mcp to fetch logs]
       "Found 15 errors in the last hour related to LLM timeouts.
        Should I create a GitHub issue?"

You: "Yes, and restart the container"
Agent: [Creates GitHub issue, restarts container]
       "Done! Issue #123 created and container restarted.
        CPU usage now at 15%."

Use Case 3: Customer Support Automation

Scenario: AI-powered support that can take action

Setup:

  1. Upload product documentation
  2. Connect CRM via custom MCP
  3. Enable SQL agent for database queries

Example:

Support Agent: "Customer John Doe says his order #12345 hasn't arrived"
AnythingLLM: [Queries database via SQL agent]
             "Order #12345 shipped on Jan 5th via FedEx.
              Tracking shows it's delayed due to weather.
              
              Would you like me to:
              1. Send customer an update email
              2. Offer expedited shipping on next order
              3. Issue a partial refund"

Support Agent: "Send update email"
AnythingLLM: [Uses email MCP]
             "Email sent to john@example.com with tracking info
              and apology for delay."

Use Case 4: Data Analysis Assistant

Scenario: Query your database with natural language

Setup:

  1. Enable SQL Agent
  2. Connect to PostgreSQL
  3. Grant read-only access

Example:

You: "Show me user signups by month for the last 6 months"
Agent: [Generates and executes SQL]
       SELECT 
         DATE_TRUNC('month', created_at) as month,
         COUNT(*) as signups
       FROM users
       WHERE created_at >= NOW() - INTERVAL '6 months'
       GROUP BY month
       ORDER BY month;
       
       Results:
       - July 2025: 145 signups
       - August 2025: 203 signups
       - September 2025: 187 signups
       ...

You: "Create a chart of this"
Agent: [Uses chart generation tool]
       [Displays bar chart visualization]
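The signups-by-month query above is plain SQL, so you can verify the pattern yourself. The same aggregation against an in-memory SQLite database (SQLite uses `strftime` where PostgreSQL uses `DATE_TRUNC`; the data is invented for illustration):

```python
# Signups-by-month aggregation, runnable without a PostgreSQL server.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, created_at TEXT)")
con.executemany(
    "INSERT INTO users (created_at) VALUES (?)",
    [("2025-07-03",), ("2025-07-21",), ("2025-08-05",),
     ("2025-08-14",), ("2025-08-30",)],
)
rows = con.execute(
    """
    SELECT strftime('%Y-%m', created_at) AS month, COUNT(*) AS signups
    FROM users
    GROUP BY month
    ORDER BY month
    """
).fetchall()
print(rows)  # [('2025-07', 2), ('2025-08', 3)]
```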

🔐 Security & Privacy

Why Self-Hosted Matters:

Your Data Stays Yours:

  • Documents never leave your infrastructure
  • Chat history stored in your PostgreSQL
  • No data sent to third parties (except chosen LLM provider)
  • Full audit trail in logs (via Loki)

Access Control:

  • Multi-user authentication
  • Role-based permissions (Admin, User)
  • Workspace-level isolation
  • API key management

Network Security:

  • HAProxy SSL termination
  • Security headers (HSTS, CSP, etc.)
  • Internal network isolation
  • Firewall-friendly (only ports 80/443 exposed)

Monitoring:

  • All access logged to Loki
  • Failed login attempts tracked
  • Resource usage monitored
  • Alerts for suspicious activity

📊 Monitoring Integration

Your observability stack provides complete visibility:

What You Can Monitor:

Application Health:

Grafana Dashboard: "AnythingLLM Overview"
├─ Request Rate: 1,234 req/min
├─ Response Time: 245ms avg
├─ Error Rate: 0.3%
├─ Active Users: 23
└─ Agent Invocations: 45/hour

Resource Usage:

Container Metrics (via cAdvisor):
├─ CPU: 45% (2 cores)
├─ Memory: 2.1GB / 4GB
├─ Network: 15MB/s in, 8MB/s out
└─ Disk I/O: 120 IOPS

Database Performance:

PostgreSQL Metrics (via postgres-exporter):
├─ Connections: 45 / 100
├─ Query Time: 12ms avg
├─ Cache Hit Ratio: 98.5%
├─ Database Size: 2.3GB
└─ Vector Index Size: 450MB

LLM Provider Performance:

Custom Metrics (via HAProxy):
├─ Llama Router: 234ms avg latency
├─ Direct Llama: 189ms avg latency
├─ Arke Proxy: 267ms avg latency
└─ Success Rate: 99.2%

Log Analysis (Loki):

# Find slow LLM responses
{service="anythingllm"} 
  | json 
  | duration > 5000

# Track agent tool usage
{service="anythingllm"} 
  |= "agent" 
  |= "tool_call"

# Monitor errors by type (metric query: count error lines per hour, per type)
sum by (error_type) (
  count_over_time({service="anythingllm"} |= "ERROR" | json [1h])
)

Alerting Examples:

Critical Alerts:

  • 🚨 AnythingLLM container down
  • 🚨 PostgreSQL connection failures
  • 🚨 Disk space > 95%
  • 🚨 Memory usage > 90%

Warning Alerts:

  • ⚠️ High LLM response times (> 5s)
  • ⚠️ Database connections > 80%
  • ⚠️ Error rate > 1%
  • ⚠️ Agent failures
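A warning like the high-response-time alert above would live in a Prometheus rule file. A sketch of its shape (the `llm_response_seconds` histogram is a placeholder - AnythingLLM does not export this metric out of the box, so it would come from HAProxy or a custom exporter):

```yaml
groups:
  - name: anythingllm
    rules:
      - alert: SlowLLMResponses
        expr: histogram_quantile(0.95, rate(llm_response_seconds_bucket[5m])) > 5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "LLM p95 response time above 5s"
```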

🚀 Getting Started

Quick Start:

cd deployment

# 1. Configure environment
cp .env.example .env
nano .env  # Set your LLM endpoints, passwords, etc.

# 2. Setup SSL certificates
# (See README.md for Let's Encrypt instructions)

# 3. Deploy
docker-compose up -d

# 4. Access services
# - AnythingLLM: https://your-domain.com
# - Grafana: http://localhost:3000
# - Prometheus: http://localhost:9090

First Steps in AnythingLLM:

  1. Create Account - First user becomes admin
  2. Create Workspace - Organize by project/team
  3. Upload Documents - Add your knowledge base
  4. Configure LLM - Choose your provider (already set via .env)
  5. Enable Agents - Turn on agent mode for tools
  6. Add MCP Servers - Extend with custom tools
  7. Start Chatting! - Ask questions, run agents

🎯 Why AnythingLLM is Powerful

Compared to ChatGPT:

| Feature | ChatGPT | AnythingLLM |
|---------|---------|-------------|
| Data Privacy | Data sent to OpenAI | Self-hosted, private |
| Custom Documents | ⚠️ Limited (ChatGPT Plus) | Unlimited RAG |
| LLM Choice | OpenAI only | 30+ providers |
| Agents | ⚠️ Limited tools | Unlimited via MCP |
| Multi-User | Individual accounts | Team workspaces |
| API Access | ⚠️ Paid tier | Full REST API |
| Monitoring | No visibility | Complete observability |
| Cost | 💰 $20/user/month | Self-hosted (compute only) |

Compared to LangChain/LlamaIndex:

| Feature | LangChain | AnythingLLM |
|---------|-----------|-------------|
| Setup | 🔧 Code required | Web UI, no code |
| User Interface | Build your own | Beautiful UI included |
| Multi-User | Build your own | Built-in |
| Agents | Powerful | Equally powerful + UI |
| MCP Support | No | Native support |
| Monitoring | DIY | Integrated |
| Learning Curve | 📚 Steep | Gentle |

🎓 Advanced Capabilities

1. Workspace Isolation

Create separate workspaces for different use cases:

├─ Engineering Workspace
│  ├─ Documents: Code docs, API specs
│  ├─ LLM: Direct Llama (fast)
│  └─ Agents: GitHub MCP, Docker MCP
│
├─ Customer Support Workspace
│  ├─ Documents: Product docs, FAQs
│  ├─ LLM: Llama Router (reliable)
│  └─ Agents: CRM MCP, Email MCP
│
└─ Executive Workspace
   ├─ Documents: Reports, analytics
   ├─ LLM: AWS Bedrock Claude (best quality)
   └─ Agents: SQL Agent, Chart generation

2. Embedding Strategies

AnythingLLM supports multiple embedding models:

  • Native (Xenova) - Fast, runs locally
  • OpenAI - High quality, requires API
  • Azure OpenAI - Enterprise option
  • Local AI - Self-hosted alternative

Your Setup: Using native embeddings for privacy and speed

3. Agent Chaining

Agents can call other agents:

Main Agent
  ├─> Research Agent (web scraping)
  ├─> Analysis Agent (SQL queries)
  └─> Report Agent (document generation)

4. API Integration

Full REST API for programmatic access:

# Send chat message (replace {workspace-slug} with your workspace's slug)
curl -X POST https://your-domain.com/api/v1/workspace/{workspace-slug}/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is our refund policy?", "mode": "chat"}'

# Upload document
curl -X POST https://your-domain.com/api/v1/document/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@policy.pdf"

# Invoke agent
curl -X POST https://your-domain.com/api/v1/agent/invoke \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Check server status"}'

🔮 Future Possibilities

With your infrastructure, you could:

1. Voice Interface

  • Use FastKokoro TTS for responses
  • Add speech-to-text (Whisper)
  • Create voice-controlled assistant

2. Slack/Discord Bot

  • Create MCP server for messaging
  • Deploy bot that uses AnythingLLM
  • Team can chat with AI in Slack

3. Automated Workflows

  • Scheduled agent runs (cron)
  • Webhook triggers
  • Event-driven automation

4. Custom Dashboards

  • Embed AnythingLLM in your apps
  • White-label the interface
  • Custom branding

5. Multi-Modal AI

  • Image analysis (with vision models)
  • Document OCR
  • Video transcription

📚 Summary

AnythingLLM is your AI platform that:

  • Respects Privacy - Self-hosted, your data stays yours
  • Flexible - 30+ LLM providers, switch anytime
  • Intelligent - RAG for document understanding
  • Powerful - AI agents with unlimited tools via MCP
  • Observable - Full monitoring with Prometheus/Loki
  • Scalable - PostgreSQL + HAProxy for production
  • Extensible - MCP protocol for custom integrations
  • User-Friendly - Beautiful web UI, no coding required

In your environment, it provides:

  • 🎯 Unified AI Interface - One place for all AI interactions
  • 🔧 DevOps Automation - Manage infrastructure with natural language
  • 📊 Data Intelligence - Query databases, analyze trends
  • 🤖 Autonomous Agents - Tasks that run themselves
  • 📈 Complete Visibility - Every metric, every log, every alert
  • 🔒 Enterprise Security - SSL, auth, audit trails, monitoring

Think of it as: Your personal AI assistant platform that can see your data, use your tools, and help your team - all while you maintain complete control.


🆘 Learn More