AnythingLLM: Your AI-Powered Knowledge Hub
🎯 What is AnythingLLM?
AnythingLLM is a full-stack application that transforms how you interact with Large Language Models (LLMs). Think of it as your personal AI assistant platform that can:
- 💬 Chat with multiple LLM providers
- 📚 Query your own documents and data (RAG - Retrieval Augmented Generation)
- 🤖 Run autonomous AI agents with tools
- 🔌 Extend capabilities via Model Context Protocol (MCP)
- 👥 Support multiple users and workspaces
- 🎨 Provide a beautiful, intuitive web interface
In simple terms: It's like ChatGPT, but you control everything - the data, the models, the privacy, and the capabilities.
🌟 Key Capabilities
1. Multi-Provider LLM Support
AnythingLLM isn't locked to a single AI provider. It supports 30+ LLM providers:
Your Environment:
┌─────────────────────────────────────────┐
│ Your LLM Infrastructure │
├─────────────────────────────────────────┤
│ ✅ Llama CPP Router (pan.helu.ca) │
│ - Load-balanced inference │
│ - High availability │
│ │
│ ✅ Direct Llama CPP (nyx.helu.ca) │
│ - Direct connection option │
│ - Lower latency │
│ │
│ ✅ LLM Proxy - Arke (circe.helu.ca) │
│ - Unified API gateway │
│ - Request routing │
│ │
│ ✅ AWS Bedrock (optional) │
│ - Claude, Titan models │
│ - Enterprise-grade │
└─────────────────────────────────────────┘
What this means:
- Switch between providers without changing your application
- Use different models for different workspaces
- Fallback to alternative providers if one fails
- Compare model performance side-by-side
2. Document Intelligence (RAG)
AnythingLLM can ingest and understand your documents:
Supported Formats:
- 📄 PDF, DOCX, TXT, MD
- 🌐 Websites (scraping)
- 📊 CSV, JSON
- 🎥 YouTube transcripts
- 🔗 GitHub repositories
- 📝 Confluence, Notion exports
How it works:
Ingestion: Your Document → Text Extraction → Chunking → Embeddings → Vector DB (PostgreSQL)
Query: User Question → Embedding → Similarity Search → Relevant Chunks → LLM → Answer
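The ingestion and query path above can be sketched end-to-end in a few lines. This is a toy illustration only: `embed()` here is a stand-in letter-frequency vectorizer, whereas a real deployment calls an embedding model and delegates the similarity search to pgvector.

```python
import math

def chunk(text, size=200, overlap=50):
    """Split a document into overlapping character chunks."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def embed(text):
    """Stand-in embedding: normalized letter frequencies.
    A real pipeline would call an embedding model instead."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

def retrieve(question, chunks, top_k=2):
    """Rank chunks by similarity to the question embedding."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

doc = ("Refunds are available within 30 days of purchase. "
       "Shipping is free on orders over $50.")
pieces = chunk(doc, size=50, overlap=10)
print(retrieve("What is the refund policy?", pieces, top_k=1))
```

The retrieved chunks are what gets stuffed into the LLM prompt alongside the user's question, which is why chunk size and overlap directly affect answer quality.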
Example Use Case:
You: "What's our refund policy?"
AnythingLLM: [Searches your policy documents]
"According to your Terms of Service (page 12),
refunds are available within 30 days..."
3. AI Agents with Tools 🤖
This is where AnythingLLM becomes truly powerful: agents can reason about a task and then act on it using tools.
Built-in Agent Tools:
- 🌐 Web Browsing - Navigate websites, fill forms, take screenshots
- 🔍 Web Scraping - Extract data from web pages
- 📊 SQL Agent - Query databases (PostgreSQL, MySQL, MSSQL)
- 📈 Chart Generation - Create visualizations
- 💾 File Operations - Save and manage files
- 📝 Document Summarization - Condense long documents
- 🧠 Memory - Remember context across conversations
Agent Workflow Example:
User: "Check our database for users who signed up last week
and send them a welcome email"
Agent:
1. Uses SQL Agent to query PostgreSQL
2. Retrieves user list
3. Generates personalized email content
4. (With email MCP) Sends emails
5. Reports back with results
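The workflow above boils down to a dispatch loop: the LLM decides which tools to call, and the agent runtime maps each call to a function. A minimal sketch, with a hard-coded plan and fake tool bodies standing in for the model's decisions and the real SQL/email integrations:

```python
# Minimal agent tool-dispatch sketch. The tool names, the stub data, and
# the hard-coded plan are all illustrative; a real agent receives tool
# calls back from the LLM instead.

def sql_query(query):
    # Stand-in for the SQL agent tool; would normally query PostgreSQL.
    return [{"email": "new.user@example.com", "signed_up": "2025-01-06"}]

def draft_email(user):
    return f"Welcome aboard, {user['email']}!"

TOOLS = {"sql_query": sql_query, "draft_email": draft_email}

def run_agent(plan):
    """Execute a list of (tool_name, args) steps and collect results."""
    results = []
    for tool_name, args in plan:
        results.append(TOOLS[tool_name](*args))
    return results

# Hard-coded plan standing in for what the LLM would decide:
users = sql_query("SELECT email FROM users WHERE created_at > now() - interval '7 days'")
plan = [("draft_email", (u,)) for u in users]
print(run_agent(plan))
```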
4. Model Context Protocol (MCP) 🔌
MCP is AnythingLLM's superpower - it allows you to extend the AI with custom tools and data sources.
What is MCP?
MCP is a standardized protocol for connecting AI systems to external tools and data. Think of it as "plugins for AI."
Your MCP Possibilities:
Example 1: Docker Management
// MCP Server: docker-mcp
Tools Available:
- list_containers()
- start_container(name)
- stop_container(name)
- view_logs(container)
- exec_command(container, command)
User: "Show me all running containers and restart the one using most memory"
Agent: [Uses docker-mcp tools to check, analyze, and restart]
Example 2: GitHub Integration
// MCP Server: github-mcp
Tools Available:
- create_issue(repo, title, body)
- search_code(query)
- create_pr(repo, branch, title)
- list_repos()
User: "Create a GitHub issue for the bug I just described"
Agent: [Uses github-mcp to create issue with details]
Example 3: Custom Business Tools
// Your Custom MCP Server
Tools Available:
- query_crm(customer_id)
- check_inventory(product_sku)
- create_order(customer, items)
- send_notification(user, message)
User: "Check if we have product XYZ in stock and notify me if it's low"
Agent: [Uses your custom MCP tools]
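Under the hood, MCP servers speak JSON-RPC 2.0 over stdio, HTTP, or SSE. The sketch below shows only the shape of a single `tools/call` request being handled; a real server would use an MCP SDK and also implement initialization and tool discovery (`tools/list`). The `check_inventory` tool is the made-up example from above.

```python
import json

def check_inventory(product_sku):
    stock = {"XYZ": 3}  # stand-in for a real inventory lookup
    return {"sku": product_sku, "in_stock": stock.get(product_sku, 0)}

TOOLS = {"check_inventory": check_inventory}

def handle_request(raw):
    """Handle one JSON-RPC message of the kind an MCP server receives."""
    req = json.loads(raw)
    if req["method"] == "tools/call":
        name = req["params"]["name"]
        args = req["params"]["arguments"]
        return json.dumps({"jsonrpc": "2.0", "id": req["id"],
                           "result": TOOLS[name](**args)})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"],
                       "error": {"code": -32601, "message": "method not found"}})

request = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "check_inventory", "arguments": {"product_sku": "XYZ"}},
})
print(handle_request(request))
```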
MCP Architecture in AnythingLLM:
┌─────────────────────────────────────────────────────────┐
│ AnythingLLM │
│ ┌─────────────────────────────────────────────────┐ │
│ │ Agent System │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Built-in │ │ MCP │ │ Custom │ │ │
│ │ │ Tools │ │ Tools │ │ Flows │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ │ │
│ └─────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ MCP Hypervisor │ │
│ │ - Manages MCP server lifecycle │ │
│ │ - Handles stdio/http/sse transports │ │
│ │ - Auto-discovers tools │ │
│ └─────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ MCP Servers (Running Locally or Remote) │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Docker │ │ GitHub │ │ Custom │ │
│ │ MCP │ │ MCP │ │ MCP │ │
│ └──────────┘ └──────────┘ └──────────┘ │
└─────────────────────────────────────────────────────────┘
Key Features:
- ✅ Hot-reload - Add/remove MCP servers without restarting
- ✅ Multiple transports - stdio, HTTP, Server-Sent Events
- ✅ Auto-discovery - Tools automatically appear in agent
- ✅ Process management - Automatic start/stop/restart
- ✅ Error handling - Graceful failures with logging
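AnythingLLM discovers MCP servers from a JSON definition file in its storage directory (`anythingllm_mcp_servers.json` under the plugins folder in recent versions — check the official MCP docs for your release). A hypothetical entry for a stdio-based Docker server might look like this; the `docker-mcp` package name is a placeholder, not a specific published package:

```json
{
  "mcpServers": {
    "docker-mcp": {
      "command": "npx",
      "args": ["-y", "docker-mcp"],
      "env": {}
    }
  }
}
```

Once the file is saved, the MCP Hypervisor picks up the server and its tools appear in the agent without a restart.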
5. Agent Flows 🔄
Create no-code agent workflows for complex tasks:
┌─────────────────────────────────────────┐
│ Example Flow: "Daily Report Generator" │
├─────────────────────────────────────────┤
│ 1. Query database for yesterday's data │
│ 2. Generate summary statistics │
│ 3. Create visualization charts │
│ 4. Write report to document │
│ 5. Send via email (MCP) │
└─────────────────────────────────────────┘
Flows can be:
- Triggered manually
- Scheduled (via external cron)
- Called from other agents
- Shared across workspaces
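Conceptually, a linear flow like the report generator is just a pipeline where each step receives the previous step's output. A sketch with placeholder step functions (the stub data is illustrative):

```python
# Sketch of a linear agent flow: each step receives the previous step's
# output. The step bodies are placeholders for the real tools.

def query_yesterday(_):
    return [{"orders": 42, "revenue": 1337.0}]

def summarize(rows):
    return {"orders": sum(r["orders"] for r in rows),
            "revenue": sum(r["revenue"] for r in rows)}

def format_report(stats):
    return f"Daily report: {stats['orders']} orders, ${stats['revenue']:.2f} revenue"

def run_flow(steps, payload=None):
    """Run steps in order, threading the payload through."""
    for step in steps:
        payload = step(payload)
    return payload

print(run_flow([query_yesterday, summarize, format_report]))
```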
🏗️ How AnythingLLM Fits Your Environment
Your Complete Stack:
┌─────────────────────────────────────────────────────────────────┐
│ Internet │
└────────────────────────────┬────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ HAProxy (SSL Termination & Load Balancing) │
│ - HTTPS/WSS support │
│ - Security headers │
│ - Health checks │
└────────────────────────────┬────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ AnythingLLM Application │
│ ┌─────────────────┐ ┌─────────────────┐ ┌────────────────┐ │
│ │ Web UI │ │ API Server │ │ Agent Engine │ │
│ │ - React │ │ - Express.js │ │ - AIbitat │ │
│ │ - WebSocket │ │ - REST API │ │ - MCP Support │ │
│ └─────────────────┘ └─────────────────┘ └────────────────┘ │
└────────────────────────────┬────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ Data Layer │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ PostgreSQL 17 + pgvector │ │
│ │ - User data & workspaces │ │
│ │ - Chat history │ │
│ │ - Vector embeddings (for RAG) │ │
│ │ - Agent invocations │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ External LLM Services │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Llama Router │ │ Direct Llama │ │ LLM Proxy │ │
│ │ pan.helu.ca │ │ nyx.helu.ca │ │ circe.helu.ca│ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────┐
│ TTS Service │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ FastKokoro (OpenAI-compatible TTS) │ │
│ │ pan.helu.ca:22070 │ │
│ │ - Text-to-speech generation │ │
│ │ - Multiple voices │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Observability Stack:
┌─────────────────────────────────────────────────────────────────┐
│ Monitoring & Logging │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Grafana (Unified Dashboard) │ │
│ │ - Metrics visualization │ │
│ │ - Log exploration │ │
│ │ - Alerting │ │
│ └────────────┬─────────────────────────────┬────────────────┘ │
│ ↓ ↓ │
│ ┌────────────────────────┐ ┌────────────────────────┐ │
│ │ Prometheus │ │ Loki │ │
│ │ - Metrics storage │ │ - Log aggregation │ │
│ │ - Alert rules │ │ - 31-day retention │ │
│ │ - 30-day retention │ │ - Query language │ │
│ └────────────────────────┘ └────────────────────────┘ │
│ ↑ ↑ │
│ ┌────────────┴─────────────────────────────┴────────────────┐ │
│ │ Data Collection │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ cAdvisor │ │ Postgres │ │ Alloy │ │ │
│ │ │ (Container) │ │ Exporter │ │ (Logs) │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
🎨 Real-World Use Cases
Use Case 1: Internal Knowledge Base
Scenario: Your team needs quick access to company documentation
Setup:
- Upload all company docs to AnythingLLM workspace
- Documents are embedded and stored in PostgreSQL
- Team members ask questions naturally
Example:
Employee: "What's the process for requesting time off?"
AnythingLLM: [Searches HR documents]
"According to the Employee Handbook, you need to:
1. Submit request via HR portal
2. Get manager approval
3. Minimum 2 weeks notice for vacations..."
Benefits:
- ✅ No more searching through SharePoint
- ✅ Instant answers with source citations
- ✅ Always up-to-date (re-sync documents)
- ✅ Multi-user access with permissions
Use Case 2: DevOps Assistant
Scenario: Manage infrastructure with natural language
Setup:
- Install Docker MCP server
- Install GitHub MCP server
- Connect to your monitoring stack
Example Conversation:
You: "Show me all containers and their resource usage"
Agent: [Uses docker-mcp + Prometheus data]
"Here are your containers:
- anythingllm: 2.1GB RAM, 45% CPU
- postgres: 1.8GB RAM, 12% CPU
- prometheus: 1.2GB RAM, 8% CPU
anythingllm is using high CPU. Would you like me to investigate?"
You: "Yes, check the logs for errors"
Agent: [Uses docker-mcp to fetch logs]
"Found 15 errors in the last hour related to LLM timeouts.
Should I create a GitHub issue?"
You: "Yes, and restart the container"
Agent: [Creates GitHub issue, restarts container]
"Done! Issue #123 created and container restarted.
CPU usage now at 15%."
Use Case 3: Customer Support Automation
Scenario: AI-powered support that can take action
Setup:
- Upload product documentation
- Connect CRM via custom MCP
- Enable SQL agent for database queries
Example:
Support Agent: "Customer John Doe says his order #12345 hasn't arrived"
AnythingLLM: [Queries database via SQL agent]
"Order #12345 shipped on Jan 5th via FedEx.
Tracking shows it's delayed due to weather.
Would you like me to:
1. Send customer an update email
2. Offer expedited shipping on next order
3. Issue a partial refund"
Support Agent: "Send update email"
AnythingLLM: [Uses email MCP]
"Email sent to john@example.com with tracking info
and apology for delay."
Use Case 4: Data Analysis Assistant
Scenario: Query your database with natural language
Setup:
- Enable SQL Agent
- Connect to PostgreSQL
- Grant read-only access
Example:
You: "Show me user signups by month for the last 6 months"
Agent: [Generates and executes SQL]
SELECT
DATE_TRUNC('month', created_at) as month,
COUNT(*) as signups
FROM users
WHERE created_at >= NOW() - INTERVAL '6 months'
GROUP BY month
ORDER BY month;
Results:
- July 2025: 145 signups
- August 2025: 203 signups
- September 2025: 187 signups
...
You: "Create a chart of this"
Agent: [Uses chart generation tool]
[Displays bar chart visualization]
🔐 Security & Privacy
Why Self-Hosted Matters:
Your Data Stays Yours:
- ✅ Documents never leave your infrastructure
- ✅ Chat history stored in your PostgreSQL
- ✅ No data sent to third parties (except chosen LLM provider)
- ✅ Full audit trail in logs (via Loki)
Access Control:
- ✅ Multi-user authentication
- ✅ Role-based permissions (Admin, User)
- ✅ Workspace-level isolation
- ✅ API key management
Network Security:
- ✅ HAProxy SSL termination
- ✅ Security headers (HSTS, CSP, etc.)
- ✅ Internal network isolation
- ✅ Firewall-friendly (only ports 80/443 exposed)
Monitoring:
- ✅ All access logged to Loki
- ✅ Failed login attempts tracked
- ✅ Resource usage monitored
- ✅ Alerts for suspicious activity
📊 Monitoring Integration
Your observability stack provides complete visibility:
What You Can Monitor:
Application Health:
Grafana Dashboard: "AnythingLLM Overview"
├─ Request Rate: 1,234 req/min
├─ Response Time: 245ms avg
├─ Error Rate: 0.3%
├─ Active Users: 23
└─ Agent Invocations: 45/hour
Resource Usage:
Container Metrics (via cAdvisor):
├─ CPU: 45% (2 cores)
├─ Memory: 2.1GB / 4GB
├─ Network: 15MB/s in, 8MB/s out
└─ Disk I/O: 120 IOPS
Database Performance:
PostgreSQL Metrics (via postgres-exporter):
├─ Connections: 45 / 100
├─ Query Time: 12ms avg
├─ Cache Hit Ratio: 98.5%
├─ Database Size: 2.3GB
└─ Vector Index Size: 450MB
LLM Provider Performance:
Custom Metrics (via HAProxy):
├─ Llama Router: 234ms avg latency
├─ Direct Llama: 189ms avg latency
├─ Arke Proxy: 267ms avg latency
└─ Success Rate: 99.2%
Log Analysis (Loki):
# Find slow LLM responses
{service="anythingllm"}
| json
| duration > 5000
# Track agent tool usage
{service="anythingllm"}
|= "agent"
|= "tool_call"
# Monitor errors by type (rate over 5 minutes)
sum by (error_type) (
  count_over_time({service="anythingllm"} |= "ERROR" | json [5m])
)
Alerting Examples:
Critical Alerts:
- 🚨 AnythingLLM container down
- 🚨 PostgreSQL connection failures
- 🚨 Disk space > 95%
- 🚨 Memory usage > 90%
Warning Alerts:
- ⚠️ High LLM response times (> 5s)
- ⚠️ Database connections > 80%
- ⚠️ Error rate > 1%
- ⚠️ Agent failures
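As a sketch, the memory alert above could be expressed as a Prometheus rule over cAdvisor metrics. The `name="anythingllm"` container label and the thresholds are assumptions to adapt to your deployment:

```yaml
groups:
  - name: anythingllm
    rules:
      - alert: HighMemoryUsage
        # container_memory_usage_bytes and container_spec_memory_limit_bytes
        # are exported by cAdvisor; the container name label is an assumption.
        expr: |
          container_memory_usage_bytes{name="anythingllm"}
            / container_spec_memory_limit_bytes{name="anythingllm"} > 0.9
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "AnythingLLM memory usage above 90% for 5 minutes"
```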
🚀 Getting Started
Quick Start:
cd deployment
# 1. Configure environment
cp .env.example .env
nano .env # Set your LLM endpoints, passwords, etc.
# 2. Setup SSL certificates
# (See README.md for Let's Encrypt instructions)
# 3. Deploy
docker-compose up -d
# 4. Access services
# - AnythingLLM: https://your-domain.com
# - Grafana: http://localhost:3000
# - Prometheus: http://localhost:9090
First Steps in AnythingLLM:
1. Create Account - First user becomes admin
2. Create Workspace - Organize by project/team
3. Upload Documents - Add your knowledge base
4. Configure LLM - Choose your provider (already set via .env)
5. Enable Agents - Turn on agent mode for tools
6. Add MCP Servers - Extend with custom tools
7. Start Chatting! - Ask questions, run agents
🎯 Why AnythingLLM is Powerful
Compared to ChatGPT:
| Feature | ChatGPT | AnythingLLM |
|---|---|---|
| Data Privacy | ❌ Data sent to OpenAI | ✅ Self-hosted, private |
| Custom Documents | ⚠️ Limited (ChatGPT Plus) | ✅ Unlimited RAG |
| LLM Choice | ❌ OpenAI only | ✅ 30+ providers |
| Agents | ⚠️ Limited tools | ✅ Unlimited via MCP |
| Multi-User | ❌ Individual accounts | ✅ Team workspaces |
| API Access | ⚠️ Paid tier | ✅ Full REST API |
| Monitoring | ❌ No visibility | ✅ Complete observability |
| Cost | 💰 $20/user/month | ✅ Self-hosted (compute only) |
Compared to LangChain/LlamaIndex:
| Feature | LangChain | AnythingLLM |
|---|---|---|
| Setup | 🔧 Code required | ✅ Web UI, no code |
| User Interface | ❌ Build your own | ✅ Beautiful UI included |
| Multi-User | ❌ Build your own | ✅ Built-in |
| Agents | ✅ Powerful | ✅ Equally powerful + UI |
| MCP Support | ❌ No | ✅ Native support |
| Monitoring | ❌ DIY | ✅ Integrated |
| Learning Curve | 📚 Steep | ✅ Gentle |
🎓 Advanced Capabilities
1. Workspace Isolation
Create separate workspaces for different use cases:
├─ Engineering Workspace
│ ├─ Documents: Code docs, API specs
│ ├─ LLM: Direct Llama (fast)
│ └─ Agents: GitHub MCP, Docker MCP
│
├─ Customer Support Workspace
│ ├─ Documents: Product docs, FAQs
│ ├─ LLM: Llama Router (reliable)
│ └─ Agents: CRM MCP, Email MCP
│
└─ Executive Workspace
├─ Documents: Reports, analytics
├─ LLM: AWS Bedrock Claude (best quality)
└─ Agents: SQL Agent, Chart generation
2. Embedding Strategies
AnythingLLM supports multiple embedding models:
- Native (Xenova) - Fast, runs locally
- OpenAI - High quality, requires API
- Azure OpenAI - Enterprise option
- Local AI - Self-hosted alternative
Your Setup: Using native embeddings for privacy and speed
3. Agent Chaining
Agents can call other agents:
Main Agent
├─> Research Agent (web scraping)
├─> Analysis Agent (SQL queries)
└─> Report Agent (document generation)
4. API Integration
Full REST API for programmatic access:
# Send chat message
curl -X POST https://your-domain.com/api/v1/workspace/chat \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{"message": "What is our refund policy?"}'
# Upload document
curl -X POST https://your-domain.com/api/v1/document/upload \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file=@policy.pdf"
# Invoke agent
curl -X POST https://your-domain.com/api/v1/agent/invoke \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{"prompt": "Check server status"}'
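The same calls can be made from Python with only the standard library. The endpoint paths mirror the curl examples above; verify them against your AnythingLLM version's API docs before relying on them.

```python
import json
import urllib.request

class AnythingLLMClient:
    def __init__(self, base_url, api_key):
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key

    def build_request(self, path, payload):
        """Construct the authenticated JSON request without sending it."""
        return urllib.request.Request(
            self.base_url + path,
            data=json.dumps(payload).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
            method="POST",
        )

    def chat(self, message):
        req = self.build_request("/api/v1/workspace/chat", {"message": message})
        with urllib.request.urlopen(req) as resp:  # network call
            return json.load(resp)

client = AnythingLLMClient("https://your-domain.com", "YOUR_API_KEY")
req = client.build_request("/api/v1/workspace/chat", {"message": "ping"})
print(req.full_url, req.get_method())
```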
🔮 Future Possibilities
With your infrastructure, you could:
1. Voice Interface
- Use FastKokoro TTS for responses
- Add speech-to-text (Whisper)
- Create voice-controlled assistant
2. Slack/Discord Bot
- Create MCP server for messaging
- Deploy bot that uses AnythingLLM
- Team can chat with AI in Slack
3. Automated Workflows
- Scheduled agent runs (cron)
- Webhook triggers
- Event-driven automation
4. Custom Dashboards
- Embed AnythingLLM in your apps
- White-label the interface
- Custom branding
5. Multi-Modal AI
- Image analysis (with vision models)
- Document OCR
- Video transcription
📚 Summary
AnythingLLM is your AI platform that:
- ✅ Respects Privacy - Self-hosted, your data stays yours
- ✅ Flexible - 30+ LLM providers, switch anytime
- ✅ Intelligent - RAG for document understanding
- ✅ Powerful - AI agents with unlimited tools via MCP
- ✅ Observable - Full monitoring with Prometheus/Loki
- ✅ Scalable - PostgreSQL + HAProxy for production
- ✅ Extensible - MCP protocol for custom integrations
- ✅ User-Friendly - Beautiful web UI, no coding required
In your environment, it provides:
- 🎯 Unified AI Interface - One place for all AI interactions
- 🔧 DevOps Automation - Manage infrastructure with natural language
- 📊 Data Intelligence - Query databases, analyze trends
- 🤖 Autonomous Agents - Tasks that run themselves
- 📈 Complete Visibility - Every metric, every log, every alert
- 🔒 Enterprise Security - SSL, auth, audit trails, monitoring
Think of it as: Your personal AI assistant platform that can see your data, use your tools, and help your team - all while you maintain complete control.
🆘 Learn More
- Deployment Guide: README.md
- Monitoring Explained: PROMETHEUS_EXPLAINED.md
- Official Docs: https://docs.anythingllm.com
- GitHub: https://github.com/Mintplex-Labs/anything-llm
- Discord Community: https://discord.gg/6UyHPeGZAC