
Deploying AnythingLLM for Website Customer Support on Windows Server

This article provides a comprehensive guide to deploying AnythingLLM, an open-source AI-powered knowledge base tool, on Windows Server (2019 or 2022) without Docker, to serve as a website customer support chatbot. It covers two deployment methods, the AnythingLLMDesktop.exe desktop application and a manual Node.js deployment, both of which enable web access and integration with free local AI models (e.g., Llama3 via Ollama). The guide addresses whether the two methods can coexist, how they differ, and the practical setup steps, and is aimed at users who want a self-hosted, privacy-focused solution with no additional model costs.

Why AnythingLLM for Website Customer Support?

AnythingLLM is ideal for website customer support due to its ability to:

  • Manage Knowledge Bases: Upload FAQs, product manuals (PDF, TXT, Markdown), and create Retrieval-Augmented Generation (RAG) workflows for accurate responses.
  • Integrate Free AI Models: Use open-source models like Llama3 via Ollama, avoiding costs of commercial APIs (e.g., OpenAI).
  • Provide Web Access: Offer a web interface (default port 3001) for management and an embeddable JavaScript chatbot widget for websites.
  • Ensure Privacy: Run entirely locally, keeping data on your server.

The tool supports non-Docker deployment on Windows Server, making it accessible for users avoiding containerization due to compatibility or complexity issues.

Prerequisites

Before deployment, ensure your Windows Server (2019/2022) meets these requirements (a quick check sketch follows the list):

  • Hardware: Minimum 8GB RAM, 4-core CPU, 10GB storage (for Llama3 model and data). An NVIDIA GPU (4GB+ VRAM) is optional for faster AI inference.
  • Software: Windows Server 2019 or 2022, updated with the latest patches (check via winver).
  • Network: Internet access for initial setup; firewall configured to allow ports 3001 (AnythingLLM) and 11434 (Ollama).
  • Website: A static or dynamic website (e.g., WordPress) to embed the chatbot widget via HTML/JavaScript.
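
A quick PowerShell sketch to sanity-check the hardware requirements above; the thresholds simply mirror this guide's minimums, so adjust them to your environment:

  # Sketch: verify RAM, CPU cores, and free disk against this guide's minimums.
  $os    = Get-CimInstance Win32_OperatingSystem
  $cs    = Get-CimInstance Win32_ComputerSystem
  $cores = (Get-CimInstance Win32_Processor | Measure-Object NumberOfCores -Sum).Sum
  "OS:        $($os.Caption)"
  "RAM:       {0:N1} GB (minimum 8)" -f ($cs.TotalPhysicalMemory / 1GB)
  "CPU cores: $cores (minimum 4)"
  "Free C:\:  {0:N1} GB (minimum 10)" -f ((Get-PSDrive C).Free / 1GB)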

Deployment Methods

AnythingLLM can be deployed in two ways, both supporting web access and website integration. Below are the steps for each, followed by a comparison and guidance on running them simultaneously.

Method 1: AnythingLLMDesktop.exe (Desktop Application)

The AnythingLLMDesktop.exe is a pre-packaged Windows application that simplifies deployment by bundling Node.js, Electron, and dependencies into a single executable.

Installation Steps

  1. Download:
    • Visit the official AnythingLLM Desktop download page (anythingllm.com) and download the latest AnythingLLMDesktop.exe (approx. 100-200MB).
  2. Install:
    • Double-click the .exe file and follow the installation wizard (default path: C:\Program Files\AnythingLLM).
    • If Windows Defender prompts, select “More info” > “Run anyway” (common for unsigned apps).
  3. Run:
    • Launch the application via the desktop icon or Start menu.
    • It automatically starts a web server on port 3001.
    • Access the web interface at http://localhost:3001 (local) or http://<server-ip>:3001 (remote).
  4. Configure Storage:
    • Set the data storage path in the app settings (e.g., C:\AnythingLLMData\desktop).
  5. Set Up Ollama (Free AI Model):
    • Download Ollama for Windows from the official Ollama site (ollama.com).
    • Run in PowerShell:
      ollama pull llama3
      
      This downloads the Llama3 8B model (~4-5GB) to C:\Users\<YourUser>\.ollama\models.
    • Verify: ollama run llama3 and test with a prompt.
    • In the AnythingLLM web interface (Settings > LLM Provider), set Ollama API to http://localhost:11434.
  6. Configure Knowledge Base:
    • Create a workspace (e.g., “CustomerSupport”).
    • Upload FAQs or product documents (PDF, TXT, Markdown).
    • Test the chatbot in the web interface by asking questions like “How do I return a product?”
  7. Embed Chatbot in Website:
    • In the web interface, go to Settings > Embed Widget and copy the generated JavaScript snippet, which will look similar to:
      <script src="http://<server-ip>:3001/embed/chat.js"></script>
      <div id="anything-llm-chat" data-bot-id="your-bot-id"></div>
      
    • Add this to your website’s HTML (e.g., <body> tag or via a WordPress plugin like “Insert Headers and Footers”).
  8. Firewall Configuration:
    • Open ports 3001 (AnythingLLM) and 11434 (Ollama); a smoke-test sketch follows these steps:
      New-NetFirewallRule -Name "AnythingLLM-Desktop" -DisplayName "Allow AnythingLLM Desktop" -Protocol TCP -LocalPort 3001 -Action Allow
      New-NetFirewallRule -Name "Ollama-Desktop" -DisplayName "Allow Ollama Desktop" -Protocol TCP -LocalPort 11434 -Action Allow
      

Notes

  • Advantages: Quick setup (<5 minutes), no need to install Node.js or Git, ideal for testing or small websites.
  • Limitations: Less flexible for customization, higher memory usage (~1-2GB due to Electron), potential compatibility issues on virtualized Windows Server (e.g., Hyper-V, see GitHub Issue #752).
  • Updates: Re-download the latest .exe or use the in-app update feature.

Method 2: Node.js Manual Deployment

Manual deployment involves cloning the AnythingLLM GitHub repository and running it with Node.js, offering greater control and stability for production environments.

Installation Steps

  1. Install Dependencies:
    • Node.js: Download v18 LTS from nodejs.org and install. Verify: node -v and npm -v.
    • Yarn: Install globally: npm install -g yarn. Verify: yarn -v.
    • Git: Download from git-scm.com. Verify: git --version.
  2. Clone Repository:
    • Create a directory (e.g., D:\AnythingLLM\node):
      mkdir D:\AnythingLLM\node
      cd D:\AnythingLLM\node
      git clone https://github.com/Mintplex-Labs/anything-llm.git
      cd anything-llm
      
  3. Install Dependencies:
    • Run: yarn setup to install frontend, server, and collector dependencies.
    • If errors occur, try yarn cache clean or point Yarn at a closer npm mirror (npmmirror is popular for users in China):
      yarn config set registry https://registry.npmmirror.com
      
  4. Configure Environment:
    • Copy the example environment file: copy server\.env.example server\.env.
    • Edit server\.env (e.g., with Notepad++):
      STORAGE_DIR="D:\AnythingLLM\node\data"
      PORT=3002  # Avoid conflict with .exe
      LLM_PROVIDER=ollama
      OLLAMA_BASE_URL=http://localhost:11435
      NODE_ENV=production
      
    • Ensure the storage directory exists and has write permissions.
  5. Set Up Database (Prisma):
    • In the server directory: cd server.
    • Run:
      npx prisma generate --schema=./prisma/schema.prisma
      npx prisma migrate deploy --schema=./prisma/schema.prisma
      
      This sets up SQLite (default) or MySQL if configured.
  6. Build Frontend:
    • In the frontend directory: cd frontend.
    • Run: yarn build.
    • Copy the build output into the server's public directory: xcopy dist ..\server\public /s /e /i /y.
  7. Run Services:
    • Server (cmd syntax): cd server && set NODE_ENV=production && node index.js.
    • Collector: cd collector && set NODE_ENV=production && node index.js. (In PowerShell, use $env:NODE_ENV="production" in place of set.)
    • Access the web interface at http://<server-ip>:3002.
  8. Set Up Ollama:
    • Start a second Ollama instance; you can reuse the existing installation, or copy it to a new directory (e.g., C:\Program Files\Ollama2) if you prefer full separation.
    • Set a different port: set OLLAMA_HOST=127.0.0.1:11435 && ollama.exe serve.
    • Pull model: set OLLAMA_HOST=127.0.0.1:11435 && ollama pull llama3.
    • In the AnythingLLM web interface, set Ollama API to http://localhost:11435.
  9. Process Management (PM2):
    • Install PM2: npm install -g pm2.
    • Create ecosystem.config.js in the root directory:
      module.exports = {
        apps: [
          { name: 'node-llm-server', script: 'index.js', cwd: './server', env: { NODE_ENV: 'production' } },
          { name: 'node-llm-collector', script: 'index.js', cwd: './collector', env: { NODE_ENV: 'production' } }
        ]
      };
      
    • Run: pm2 start ecosystem.config.js && pm2 save (a boot-persistence sketch follows this list).
  10. Firewall Configuration:
    New-NetFirewallRule -Name "AnythingLLM-Node" -DisplayName "Allow AnythingLLM Node" -Protocol TCP -LocalPort 3002 -Action Allow
    New-NetFirewallRule -Name "Ollama-Node" -DisplayName "Allow Ollama Node" -Protocol TCP -LocalPort 11435 -Action Allow
    
  11. Embed Chatbot:
    • At http://<server-ip>:3002, generate the widget code and embed it in your website’s HTML.
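
PM2 does not survive a reboot on Windows by itself. One option is a Scheduled Task that restores the saved process list at startup; a sketch, assuming pm2 save has been run and pm2.cmd resolves on the SYSTEM account's PATH:

  # Sketch: restore the saved PM2 process list at boot.
  # Assumes `pm2 save` was run and pm2.cmd is on the SYSTEM PATH.
  $action  = New-ScheduledTaskAction -Execute "pm2.cmd" -Argument "resurrect"
  $trigger = New-ScheduledTaskTrigger -AtStartup
  Register-ScheduledTask -TaskName "AnythingLLM-PM2" -Action $action -Trigger $trigger -User "SYSTEM" -RunLevel Highest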

Notes

  • Advantages: More stable, customizable (edit source code, adjust ports), ideal for production with high concurrency.
  • Limitations: Requires technical setup (Node.js, Git, Yarn), longer initial configuration.
  • Updates: Run git pull origin main && yarn setup, rebuild the frontend, and restart PM2 (see the sketch below).
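
The update flow can be scripted; a sketch following this guide's paths:

  # Sketch: update the Node.js deployment in place.
  cd D:\AnythingLLM\node\anything-llm
  git pull origin main
  yarn setup                               # reinstall dependencies after the pull
  cd frontend
  yarn build                               # rebuild the web UI
  xcopy dist ..\server\public /s /e /i /y  # redeploy the built frontend
  cd ..
  pm2 restart all                          # restart server and collector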

Running Both Methods Simultaneously

You can run both .exe and Node.js deployments on the same Windows Server, provided you isolate their configurations:

  • Ports:
    • Desktop: 3001 (AnythingLLM), 11434 (Ollama).
    • Node.js: 3002 (AnythingLLM), 11435 (Ollama).
  • Storage:
    • Desktop: C:\AnythingLLMData\desktop.
    • Node.js: D:\AnythingLLM\node\data.
  • Ollama Instances:
    • Run two Ollama instances with different ports (11434 and 11435); see the sketch after this list.
    • Ensure sufficient RAM (16GB+ recommended) for two Llama3 models (~4-5GB each).
  • Use Cases:
    • Use .exe for testing new features or knowledge bases.
    • Use Node.js for production customer support with high traffic.
    • Example: One instance for FAQ chatbot, another for technical support.
  • Embedding:
    • Both generate similar JavaScript widgets, embeddable in different website sections:
      <!-- Desktop Chatbot -->
      <script src="http://<server-ip>:3001/embed/chat.js"></script>
      <div id="desktop-chat" data-bot-id="desktop-bot-id"></div>
      <!-- Node.js Chatbot -->
      <script src="http://<server-ip>:3002/embed/chat.js"></script>
      <div id="node-chat" data-bot-id="node-bot-id"></div>
      

Comparison of Deployment Methods

Aspect | AnythingLLMDesktop.exe | Node.js Manual Deployment
Ease of Setup | Simple: install and run (~5 minutes). | Complex: requires Node.js, Git, and Yarn (~15-30 minutes).
Web Access | Auto-starts the web server on port 3001. | Manual start on a configurable port (e.g., 3002).
Customer Support Features | Identical: RAG, knowledge base, embeddable chatbot. | Identical: same features, no functional difference.
Resource Usage | Higher (~1-2GB RAM due to Electron). | Lower (~500MB-1GB, excluding the model).
Customization | Limited: GUI-based, no source access. | High: edit source, configure via .env.
Production Suitability | Best for testing/small sites. | Ideal for high-traffic production with PM2/Nginx.
Stability | Potential issues in virtualized Server environments. | More stable; WSL2 option for compatibility.

Resource Requirements

  • Minimum: 8GB RAM, 4-core CPU, 10GB storage (Llama3 8B).
  • Recommended: 16GB RAM, NVIDIA GPU (4GB+ VRAM), 20GB storage for dual deployments.
  • Firewall: Open ports 3001/3002 (AnythingLLM) and 11434/11435 (Ollama).

Troubleshooting

  • Desktop.exe:
    • Won’t Start: Check logs in %APPDATA%\AnythingLLM\logs, run as administrator, or disable Defender temporarily.
    • VM Issues: Enable VT-x/AMD-V in BIOS for virtualized servers.
  • Node.js:
    • Dependency Errors: Clear the Yarn cache (yarn cache clean) or switch to a closer npm mirror.
    • Port Conflicts: Check with netstat -ano | findstr "3001 3002 11434 11435", or use the PowerShell sketch after this list.
  • Ollama:
    • No Response: Verify with curl http://localhost:11434/api/tags or curl http://localhost:11435/api/tags.
    • Performance: Use Llama3 8B on low-RAM servers; larger variants (e.g., Llama3 70B) require substantially more RAM/VRAM.
  • Chatbot Issues:
    • Widget Not Loading: Check website console (F12) for JavaScript errors, verify server IP/port.
    • Inaccurate Responses: Refine knowledge base documents or adjust System Prompt in AnythingLLM.
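
As a PowerShell alternative to netstat for the port-conflict check, a sketch that maps this guide's ports to their owning processes:

  # Sketch: show which process is listening on each port used in this guide.
  foreach ($port in 3001, 3002, 11434, 11435) {
      Get-NetTCPConnection -LocalPort $port -State Listen -ErrorAction SilentlyContinue |
          ForEach-Object {
              $proc = Get-Process -Id $_.OwningProcess -ErrorAction SilentlyContinue
              "Port {0,5}: PID {1} ({2})" -f $port, $_.OwningProcess, $proc.ProcessName
          }
  }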

Optimizing for Website Customer Support

  • Knowledge Base:
    • Upload structured FAQs (e.g., “Returns.txt” with Q&A format).
    • Test RAG with common queries (e.g., “What’s your refund policy?”).
  • Chatbot Customization:
    • Edit System Prompt (e.g., “You are a friendly customer support bot, respond concisely and professionally”).
    • Support multi-language FAQs for global users (Llama3 handles multiple languages).
  • Production Setup:
    • Configure HTTPS with Let’s Encrypt and IIS/Nginx.
    • Use PM2 (Node.js) or Task Scheduler (.exe) for auto-start.
  • Monitoring:
    • Check logs (.exe: %APPDATA%\AnythingLLM\logs; Node.js: PM2 logs via pm2 logs).
    • Monitor CPU/RAM usage in Task Manager (or with the logging sketch below).
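
For lightweight logging outside Task Manager, a sketch that samples the node and ollama processes once a minute; the process names are assumptions (the Desktop build runs under its own executable name), so verify them with Get-Process first:

  # Sketch: sample memory and CPU time for the node and ollama processes.
  # Process names are assumptions; adjust to match your deployment.
  while ($true) {
      Get-Process -Name node, ollama -ErrorAction SilentlyContinue | ForEach-Object {
          "{0:u}  {1,-10} {2,8:N0} MB  CPU {3,8:N0}s" -f (Get-Date), $_.ProcessName, ($_.WorkingSet64 / 1MB), $_.CPU
      }
      Start-Sleep -Seconds 60
  }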

Can Both Methods Run Simultaneously?

Yes, but it requires careful isolation:

  • Ports: Use 3001/11434 for .exe, 3002/11435 for Node.js.
  • Storage: Separate paths to avoid data overwrite.
  • Resources: Ensure 16GB+ RAM for two Llama3 instances.
  • Use Case: Run .exe for testing, Node.js for production, or separate chatbots for different website sections.

Recommendations

  • Small Websites/Testing: Use .exe for quick setup and minimal configuration.
  • Production/High Traffic: Use Node.js with PM2 and WSL2 for stability and scalability.
  • Simultaneous Use: Only if you need distinct instances (e.g., testing vs. production); otherwise, choose one to simplify management.
  • Next Steps:
    • Confirm your server specs (RAM, CPU, GPU) and website scale to choose the best method.
    • For production, configure HTTPS and monitor performance.
    • If issues arise, share error logs or server details for targeted troubleshooting.

This guide ensures you can deploy AnythingLLM as a robust, cost-free customer support solution, leveraging its powerful AI capabilities while maintaining full control on your Windows Server.
