*ouranos/docs/anythingllm_overview.md — commit b4d60f2f38 (Robert Helewka, 2026-03-03)*

# AnythingLLM: Your AI-Powered Knowledge Hub
## 🎯 What is AnythingLLM?
AnythingLLM is a **full-stack application** that transforms how you interact with Large Language Models (LLMs). Think of it as your personal AI assistant platform that can:
- 💬 Chat with multiple LLM providers
- 📚 Query your own documents and data (RAG - Retrieval Augmented Generation)
- 🤖 Run autonomous AI agents with tools
- 🔌 Extend capabilities via Model Context Protocol (MCP)
- 👥 Support multiple users and workspaces
- 🎨 Provide a beautiful, intuitive web interface
**In simple terms:** It's like ChatGPT, but you control everything - the data, the models, the privacy, and the capabilities.
---
## 🌟 Key Capabilities
### 1. **Multi-Provider LLM Support**
AnythingLLM isn't locked to a single AI provider. It supports **30+ LLM providers**:
#### Your Environment:
```
┌─────────────────────────────────────────┐
│ Your LLM Infrastructure │
├─────────────────────────────────────────┤
│ ✅ Llama CPP Router (pan.helu.ca) │
│ - Load-balanced inference │
│ - High availability │
│ │
│ ✅ Direct Llama CPP (nyx.helu.ca) │
│ - Direct connection option │
│ - Lower latency │
│ │
│ ✅ LLM Proxy - Arke (circe.helu.ca) │
│ - Unified API gateway │
│ - Request routing │
│ │
│ ✅ AWS Bedrock (optional) │
│ - Claude, Titan models │
│ - Enterprise-grade │
└─────────────────────────────────────────┘
```
**What this means:**
- Switch between providers without changing your application
- Use different models for different workspaces
- Fallback to alternative providers if one fails
- Compare model performance side-by-side
### 2. **Document Intelligence (RAG)**
AnythingLLM can ingest and understand your documents:
**Supported Formats:**
- 📄 PDF, DOCX, TXT, MD
- 🌐 Websites (scraping)
- 📊 CSV, JSON
- 🎥 YouTube transcripts
- 🔗 GitHub repositories
- 📝 Confluence, Notion exports
**How it works:**
```
Your Document → Text Extraction → Chunking → Embeddings → Vector DB (PostgreSQL)
User Question → Embedding → Similarity Search → Relevant Chunks → LLM → Answer
```
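The retrieval half of this pipeline can be sketched in a few lines of Python. This is a toy illustration, not AnythingLLM's actual implementation: it uses bag-of-words vectors in place of real embedding-model output, and an in-memory list in place of pgvector.

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-frequency vector.
    A real deployment calls an embedding model here instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank stored document chunks by similarity to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "The refund policy allows refunds within 30 days.",
    "Our office is open Monday through Friday.",
]
print(retrieve("What is the refund policy?", chunks))
```

The top-k chunks returned by `retrieve` are what gets stuffed into the LLM prompt as context in the second line of the diagram above.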
**Example Use Case:**
```
You: "What's our refund policy?"
AnythingLLM: [Searches your policy documents]
"According to your Terms of Service (page 12),
refunds are available within 30 days..."
```
### 3. **AI Agents with Tools** 🤖
This is where AnythingLLM becomes **truly powerful**. Agents can:
#### Built-in Agent Tools:
- 🌐 **Web Browsing** - Navigate websites, fill forms, take screenshots
- 🔍 **Web Scraping** - Extract data from web pages
- 📊 **SQL Agent** - Query databases (PostgreSQL, MySQL, MSSQL)
- 📈 **Chart Generation** - Create visualizations
- 💾 **File Operations** - Save and manage files
- 📝 **Document Summarization** - Condense long documents
- 🧠 **Memory** - Remember context across conversations
#### Agent Workflow Example:
```
User: "Check our database for users who signed up last week
and send them a welcome email"
Agent:
1. Uses SQL Agent to query PostgreSQL
2. Retrieves user list
3. Generates personalized email content
4. (With email MCP) Sends emails
5. Reports back with results
```
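Under the hood, an agent run like the one above reduces to a dispatch loop: the LLM emits a tool name plus arguments, and the runtime looks the tool up and executes it. A minimal Python sketch, with stubbed tools and an invented plan standing in for real model output:

```python
# Registry mapping tool names to plain Python callables (both stubbed here).
TOOLS = {
    "sql_query": lambda sql: [{"email": "new.user@example.com"}],   # fake DB result
    "send_email": lambda to, body: f"sent to {to}",                 # fake email MCP
}

def run_agent(plan: list[dict]) -> list:
    """Execute a sequence of tool calls as an LLM might emit them."""
    results = []
    for step in plan:
        tool = TOOLS[step["tool"]]            # look up the requested tool
        results.append(tool(**step["args"]))  # call it with model-chosen args
    return results

plan = [
    {"tool": "sql_query", "args": {"sql": "SELECT email FROM users WHERE ..."}},
    {"tool": "send_email", "args": {"to": "new.user@example.com", "body": "Welcome!"}},
]
print(run_agent(plan))
```

A real agent loop also feeds each tool result back to the LLM so it can decide the next step; this sketch runs a fixed plan for clarity.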
### 4. **Model Context Protocol (MCP)** 🔌
MCP is AnythingLLM's **superpower** - it allows you to extend the AI with custom tools and data sources.
#### What is MCP?
MCP is a **standardized protocol** for connecting AI systems to external tools and data. Think of it as "plugins for AI."
#### Your MCP Possibilities:
**Example 1: Docker Management**
```
// MCP Server: docker-mcp
Tools Available:
- list_containers()
- start_container(name)
- stop_container(name)
- view_logs(container)
- exec_command(container, command)
User: "Show me all running containers and restart the one using most memory"
Agent: [Uses docker-mcp tools to check, analyze, and restart]
```
**Example 2: GitHub Integration**
```
// MCP Server: github-mcp
Tools Available:
- create_issue(repo, title, body)
- search_code(query)
- create_pr(repo, branch, title)
- list_repos()
User: "Create a GitHub issue for the bug I just described"
Agent: [Uses github-mcp to create issue with details]
```
**Example 3: Custom Business Tools**
```
// Your Custom MCP Server
Tools Available:
- query_crm(customer_id)
- check_inventory(product_sku)
- create_order(customer, items)
- send_notification(user, message)
User: "Check if we have product XYZ in stock and notify me if it's low"
Agent: [Uses your custom MCP tools]
```
#### MCP Architecture in AnythingLLM:
```
┌─────────────────────────────────────────────────────────┐
│ AnythingLLM │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Agent System │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Built-in │ │ MCP │ │ Custom │ │ │
│ │ │ Tools │ │ Tools │ │ Flows │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ │ │
│ └─────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ MCP Hypervisor │ │
│ │ - Manages MCP server lifecycle │ │
│ │ - Handles stdio/http/sse transports │ │
│ │ - Auto-discovers tools │ │
│ └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ MCP Servers (Running Locally or Remote) │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Docker │ │ GitHub │ │ Custom │ │
│ │ MCP │ │ MCP │ │ MCP │ │
│ └──────────┘ └──────────┘ └──────────┘ │
└─────────────────────────────────────────────────────────┘
```
**Key Features:**
- ✅ **Hot-reload** - Add/remove MCP servers without restarting
- ✅ **Multiple transports** - stdio, HTTP, Server-Sent Events
- ✅ **Auto-discovery** - Tools automatically appear in agent
- ✅ **Process management** - Automatic start/stop/restart
- ✅ **Error handling** - Graceful failures with logging
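On the wire, every transport carries the same JSON-RPC 2.0 messages that the MCP specification defines. A tool invocation, for instance, is a `tools/call` request. A Python sketch of serializing one (the tool name and arguments are invented for illustration):

```python
import json

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Over the stdio transport, this line is written to the MCP server's stdin;
# the server replies with a JSON-RPC response carrying the tool's result.
msg = mcp_tool_call(1, "list_containers", {"all": False})
print(msg)
```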
### 5. **Agent Flows** 🔄
Create **no-code agent workflows** for complex tasks:
```
┌─────────────────────────────────────────┐
│ Example Flow: "Daily Report Generator" │
├─────────────────────────────────────────┤
│ 1. Query database for yesterday's data │
│ 2. Generate summary statistics │
│ 3. Create visualization charts │
│ 4. Write report to document │
│ 5. Send via email (MCP) │
└─────────────────────────────────────────┘
```
Flows can be:
- Triggered manually
- Scheduled (via external cron)
- Called from other agents
- Shared across workspaces
---
## 🏗️ How AnythingLLM Fits Your Environment
### Your Complete Stack:
```
┌─────────────────────────────────────────────────────────────────┐
│ Internet │
└────────────────────────────┬────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ HAProxy (SSL Termination & Load Balancing) │
│ - HTTPS/WSS support │
│ - Security headers │
│ - Health checks │
└────────────────────────────┬────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ AnythingLLM Application │
│ ┌─────────────────┐ ┌─────────────────┐ ┌────────────────┐ │
│ │ Web UI │ │ API Server │ │ Agent Engine │ │
│ │ - React │ │ - Express.js │ │ - AIbitat │ │
│ │ - WebSocket │ │ - REST API │ │ - MCP Support │ │
│ └─────────────────┘ └─────────────────┘ └────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ Data Layer │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ PostgreSQL 17 + pgvector │ │
│ │ - User data & workspaces │ │
│ │ - Chat history │ │
│ │ - Vector embeddings (for RAG) │ │
│ │ - Agent invocations │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ External LLM Services │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Llama Router │ │ Direct Llama │ │ LLM Proxy │ │
│ │ pan.helu.ca │ │ nyx.helu.ca │ │ circe.helu.ca│ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ TTS Service │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ FastKokoro (OpenAI-compatible TTS) │ │
│ │ pan.helu.ca:22070 │ │
│ │ - Text-to-speech generation │ │
│ │ - Multiple voices │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```
### Observability Stack:
```
┌─────────────────────────────────────────────────────────────────┐
│ Monitoring & Logging │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Grafana (Unified Dashboard) │ │
│ │ - Metrics visualization │ │
│ │ - Log exploration │ │
│ │ - Alerting │ │
│ └────────────┬─────────────────────────────┬────────────────┘ │
│ ↓ ↓ │
│ ┌────────────────────────┐ ┌────────────────────────┐ │
│ │ Prometheus │ │ Loki │ │
│ │ - Metrics storage │ │ - Log aggregation │ │
│ │ - Alert rules │ │ - 31-day retention │ │
│ │ - 30-day retention │ │ - Query language │ │
│ └────────────────────────┘ └────────────────────────┘ │
│ ↑ ↑ │
│ ┌────────────┴─────────────────────────────┴────────────────┐ │
│ │ Data Collection │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ cAdvisor │ │ Postgres │ │ Alloy │ │ │
│ │ │ (Container) │ │ Exporter │ │ (Logs) │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```
---
## 🎨 Real-World Use Cases
### Use Case 1: **Internal Knowledge Base**
**Scenario:** Your team needs quick access to company documentation
**Setup:**
1. Upload all company docs to AnythingLLM workspace
2. Documents are embedded and stored in PostgreSQL
3. Team members ask questions naturally
**Example:**
```
Employee: "What's the process for requesting time off?"
AnythingLLM: [Searches HR documents]
"According to the Employee Handbook, you need to:
1. Submit request via HR portal
2. Get manager approval
3. Minimum 2 weeks notice for vacations..."
```
**Benefits:**
- ✅ No more searching through SharePoint
- ✅ Instant answers with source citations
- ✅ Always up-to-date (re-sync documents)
- ✅ Multi-user access with permissions
### Use Case 2: **DevOps Assistant**
**Scenario:** Manage infrastructure with natural language
**Setup:**
1. Install Docker MCP server
2. Install GitHub MCP server
3. Connect to your monitoring stack
**Example Conversation:**
```
You: "Show me all containers and their resource usage"
Agent: [Uses docker-mcp + Prometheus data]
"Here are your containers:
- anythingllm: 2.1GB RAM, 45% CPU
- postgres: 1.8GB RAM, 12% CPU
- prometheus: 1.2GB RAM, 8% CPU
anythingllm is using high CPU. Would you like me to investigate?"
You: "Yes, check the logs for errors"
Agent: [Uses docker-mcp to fetch logs]
"Found 15 errors in the last hour related to LLM timeouts.
Should I create a GitHub issue?"
You: "Yes, and restart the container"
Agent: [Creates GitHub issue, restarts container]
"Done! Issue #123 created and container restarted.
CPU usage now at 15%."
```
### Use Case 3: **Customer Support Automation**
**Scenario:** AI-powered support that can take action
**Setup:**
1. Upload product documentation
2. Connect CRM via custom MCP
3. Enable SQL agent for database queries
**Example:**
```
Support Agent: "Customer John Doe says his order #12345 hasn't arrived"
AnythingLLM: [Queries database via SQL agent]
"Order #12345 shipped on Jan 5th via FedEx.
Tracking shows it's delayed due to weather.
Would you like me to:
1. Send customer an update email
2. Offer expedited shipping on next order
3. Issue a partial refund"
Support Agent: "Send update email"
AnythingLLM: [Uses email MCP]
"Email sent to john@example.com with tracking info
and apology for delay."
```
### Use Case 4: **Data Analysis Assistant**
**Scenario:** Query your database with natural language
**Setup:**
1. Enable SQL Agent
2. Connect to PostgreSQL
3. Grant read-only access
**Example:**
```
You: "Show me user signups by month for the last 6 months"
Agent: [Generates and executes SQL]
SELECT
DATE_TRUNC('month', created_at) as month,
COUNT(*) as signups
FROM users
WHERE created_at >= NOW() - INTERVAL '6 months'
GROUP BY month
ORDER BY month;
Results:
- July 2025: 145 signups
- August 2025: 203 signups
- September 2025: 187 signups
...
You: "Create a chart of this"
Agent: [Uses chart generation tool]
[Displays bar chart visualization]
```
---
## 🔐 Security & Privacy
### Why Self-Hosted Matters:
**Your Data Stays Yours:**
- ✅ Documents never leave your infrastructure
- ✅ Chat history stored in your PostgreSQL
- ✅ No data sent to third parties (except chosen LLM provider)
- ✅ Full audit trail in logs (via Loki)
**Access Control:**
- ✅ Multi-user authentication
- ✅ Role-based permissions (Admin, User)
- ✅ Workspace-level isolation
- ✅ API key management
**Network Security:**
- ✅ HAProxy SSL termination
- ✅ Security headers (HSTS, CSP, etc.)
- ✅ Internal network isolation
- ✅ Firewall-friendly (only ports 80/443 exposed)
**Monitoring:**
- ✅ All access logged to Loki
- ✅ Failed login attempts tracked
- ✅ Resource usage monitored
- ✅ Alerts for suspicious activity
---
## 📊 Monitoring Integration
Your observability stack provides **complete visibility**:
### What You Can Monitor:
**Application Health:**
```
Grafana Dashboard: "AnythingLLM Overview"
├─ Request Rate: 1,234 req/min
├─ Response Time: 245ms avg
├─ Error Rate: 0.3%
├─ Active Users: 23
└─ Agent Invocations: 45/hour
```
**Resource Usage:**
```
Container Metrics (via cAdvisor):
├─ CPU: 45% (2 cores)
├─ Memory: 2.1GB / 4GB
├─ Network: 15MB/s in, 8MB/s out
└─ Disk I/O: 120 IOPS
```
**Database Performance:**
```
PostgreSQL Metrics (via postgres-exporter):
├─ Connections: 45 / 100
├─ Query Time: 12ms avg
├─ Cache Hit Ratio: 98.5%
├─ Database Size: 2.3GB
└─ Vector Index Size: 450MB
```
**LLM Provider Performance:**
```
Custom Metrics (via HAProxy):
├─ Llama Router: 234ms avg latency
├─ Direct Llama: 189ms avg latency
├─ Arke Proxy: 267ms avg latency
└─ Success Rate: 99.2%
```
**Log Analysis (Loki):**
```logql
# Find slow LLM responses
{service="anythingllm"}
| json
| duration > 5000
# Track agent tool usage
{service="anythingllm"}
|= "agent"
|= "tool_call"
# Monitor errors by type (metric query)
sum by (error_type) (
  count_over_time({service="anythingllm"} |= "ERROR" | json [5m])
)
```
### Alerting Examples:
**Critical Alerts:**
- 🚨 AnythingLLM container down
- 🚨 PostgreSQL connection failures
- 🚨 Disk space > 95%
- 🚨 Memory usage > 90%
**Warning Alerts:**
- ⚠️ High LLM response times (> 5s)
- ⚠️ Database connections > 80%
- ⚠️ Error rate > 1%
- ⚠️ Agent failures
---
## 🚀 Getting Started
### Quick Start:
```bash
cd deployment
# 1. Configure environment
cp .env.example .env
nano .env # Set your LLM endpoints, passwords, etc.
# 2. Setup SSL certificates
# (See README.md for Let's Encrypt instructions)
# 3. Deploy
docker-compose up -d
# 4. Access services
# - AnythingLLM: https://your-domain.com
# - Grafana: http://localhost:3000
# - Prometheus: http://localhost:9090
```
### First Steps in AnythingLLM:
1. **Create Account** - First user becomes admin
2. **Create Workspace** - Organize by project/team
3. **Upload Documents** - Add your knowledge base
4. **Configure LLM** - Choose your provider (already set via .env)
5. **Enable Agents** - Turn on agent mode for tools
6. **Add MCP Servers** - Extend with custom tools
7. **Start Chatting!** - Ask questions, run agents
---
## 🎯 Why AnythingLLM is Powerful
### Compared to ChatGPT:
| Feature | ChatGPT | AnythingLLM |
|---------|---------|-------------|
| **Data Privacy** | ❌ Data sent to OpenAI | ✅ Self-hosted, private |
| **Custom Documents** | ⚠️ Limited (ChatGPT Plus) | ✅ Unlimited RAG |
| **LLM Choice** | ❌ OpenAI only | ✅ 30+ providers |
| **Agents** | ⚠️ Limited tools | ✅ Unlimited via MCP |
| **Multi-User** | ❌ Individual accounts | ✅ Team workspaces |
| **API Access** | ⚠️ Paid tier | ✅ Full REST API |
| **Monitoring** | ❌ No visibility | ✅ Complete observability |
| **Cost** | 💰 $20/user/month | ✅ Self-hosted (compute only) |
### Compared to LangChain/LlamaIndex:
| Feature | LangChain | AnythingLLM |
|---------|-----------|-------------|
| **Setup** | 🔧 Code required | ✅ Web UI, no code |
| **User Interface** | ❌ Build your own | ✅ Beautiful UI included |
| **Multi-User** | ❌ Build your own | ✅ Built-in |
| **Agents** | ✅ Powerful | ✅ Equally powerful + UI |
| **MCP Support** | ❌ No | ✅ Native support |
| **Monitoring** | ❌ DIY | ✅ Integrated |
| **Learning Curve** | 📚 Steep | ✅ Gentle |
---
## 🎓 Advanced Capabilities
### 1. **Workspace Isolation**
Create separate workspaces for different use cases:
```
├─ Engineering Workspace
│ ├─ Documents: Code docs, API specs
│ ├─ LLM: Direct Llama (fast)
│ └─ Agents: GitHub MCP, Docker MCP
├─ Customer Support Workspace
│ ├─ Documents: Product docs, FAQs
│ ├─ LLM: Llama Router (reliable)
│ └─ Agents: CRM MCP, Email MCP
└─ Executive Workspace
├─ Documents: Reports, analytics
├─ LLM: AWS Bedrock Claude (best quality)
└─ Agents: SQL Agent, Chart generation
```
### 2. **Embedding Strategies**
AnythingLLM supports multiple embedding models:
- **Native** (Xenova) - Fast, runs locally
- **OpenAI** - High quality, requires API
- **Azure OpenAI** - Enterprise option
- **Local AI** - Self-hosted alternative
**Your Setup:** Using native embeddings for privacy and speed
### 3. **Agent Chaining**
Agents can call other agents:
```
Main Agent
├─> Research Agent (web scraping)
├─> Analysis Agent (SQL queries)
└─> Report Agent (document generation)
```
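Chaining like this is ultimately function composition: each sub-agent consumes the previous one's output. A toy sketch with stubbed sub-agents (the names and return values are invented):

```python
def research_agent(topic: str) -> str:
    return f"notes on {topic}"           # stub: would scrape the web

def analysis_agent(notes: str) -> str:
    return f"analysis of [{notes}]"      # stub: would run SQL queries

def report_agent(analysis: str) -> str:
    return f"REPORT: {analysis}"         # stub: would render a document

def main_agent(topic: str) -> str:
    """Pipe each sub-agent's output into the next."""
    return report_agent(analysis_agent(research_agent(topic)))

print(main_agent("Q3 signups"))
# → REPORT: analysis of [notes on Q3 signups]
```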
### 4. **API Integration**
Full REST API for programmatic access:
```bash
# Send chat message
curl -X POST https://your-domain.com/api/v1/workspace/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is our refund policy?"}'

# Upload document
curl -X POST https://your-domain.com/api/v1/document/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@policy.pdf"

# Invoke agent
curl -X POST https://your-domain.com/api/v1/agent/invoke \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Check server status"}'
```
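The same calls can be scripted. Here is a standard-library Python sketch that builds (but does not send) the chat request; the endpoint path simply mirrors the curl examples above, so verify it against your deployment's API docs before use:

```python
import json
import urllib.request

BASE = "https://your-domain.com/api/v1"
API_KEY = "YOUR_API_KEY"

def build_chat_request(message: str) -> urllib.request.Request:
    """Construct a chat POST request with auth and JSON headers."""
    return urllib.request.Request(
        f"{BASE}/workspace/chat",
        data=json.dumps({"message": message}).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("What is our refund policy?")
print(req.full_url, req.get_method())
# Sending it is one more line: urllib.request.urlopen(req)
```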
---
## 🔮 Future Possibilities
With your infrastructure, you could:
### 1. **Voice Interface**
- Use FastKokoro TTS for responses
- Add speech-to-text (Whisper)
- Create voice-controlled assistant
### 2. **Slack/Discord Bot**
- Create MCP server for messaging
- Deploy bot that uses AnythingLLM
- Team can chat with AI in Slack
### 3. **Automated Workflows**
- Scheduled agent runs (cron)
- Webhook triggers
- Event-driven automation
### 4. **Custom Dashboards**
- Embed AnythingLLM in your apps
- White-label the interface
- Custom branding
### 5. **Multi-Modal AI**
- Image analysis (with vision models)
- Document OCR
- Video transcription
---
## 📚 Summary
**AnythingLLM is your AI platform that is:**
- ✅ **Privacy-Respecting** - Self-hosted, your data stays yours
- ✅ **Flexible** - 30+ LLM providers, switch anytime
- ✅ **Intelligent** - RAG for document understanding
- ✅ **Powerful** - AI agents with unlimited tools via MCP
- ✅ **Observable** - Full monitoring with Prometheus/Loki
- ✅ **Scalable** - PostgreSQL + HAProxy for production
- ✅ **Extensible** - MCP protocol for custom integrations
- ✅ **User-Friendly** - Beautiful web UI, no coding required

**In your environment, it provides:**
- 🎯 **Unified AI Interface** - One place for all AI interactions
- 🔧 **DevOps Automation** - Manage infrastructure with natural language
- 📊 **Data Intelligence** - Query databases, analyze trends
- 🤖 **Autonomous Agents** - Tasks that run themselves
- 📈 **Complete Visibility** - Every metric, every log, every alert
- 🔒 **Enterprise Security** - SSL, auth, audit trails, monitoring
**Think of it as:** Your personal AI assistant platform that can see your data, use your tools, and help your team - all while you maintain complete control.
---
## 🆘 Learn More
- **Deployment Guide**: [README.md](README.md)
- **Monitoring Explained**: [PROMETHEUS_EXPLAINED.md](PROMETHEUS_EXPLAINED.md)
- **Official Docs**: https://docs.anythingllm.com
- **GitHub**: https://github.com/Mintplex-Labs/anything-llm
- **Discord Community**: https://discord.gg/6UyHPeGZAC