Local LLM Infrastructure: Integrating Ollama and Open WebUI via Docker
In the operation of Large Language Models (LLMs) within local environments, direct library installation on the host OS presents a high risk of dependency conflicts and GPU driver inconsistencies. Environment isolation and reproducibility are critical during research and development phases involving multiple models. This technical log details the methodology for integrating the Ollama inference engine with Open WebUI using Docker containers to establish a secure, portable private AI infrastructure.
Rationality of Configuration and Significance of Containerization
Deploying Open WebUI via Docker is a standard practice in infrastructure management rather than a mere convenience. Containerization facilitates the management of persistent data volumes and secure access to inference endpoints through the host gateway without compromising the host-side network stack or file system. This approach ensures a scalable interface while preventing configuration errors that might necessitate OS reinstallation.
Deployment Workflow
1. Preparation of Docker Runtime and Verification of Virtualization
Verify the correct operation of the container runtime. In Windows environments, the WSL2 (Windows Subsystem for Linux) backend is mandatory.
- 💡 Enabling Virtualization: Ensure Virtualization Technology (VT-x or AMD-V) is enabled in BIOS/UEFI settings. Docker Engine initialization will fail if this is disabled.
- 🛠️ Binary Verification: Execute terminal commands to confirm path configurations.
docker --version
2. Running the Open WebUI Container
With the Ollama service active on the host machine, initiate Open WebUI. Network flags for host-to-container communication are essential for establishing the API bridge.
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Technical Explanation of Key Parameters:
- -p 3000:8080: Maps host port 3000 to container port 8080.
- –add-host=host.docker.internal:host-gateway: Establishes a bridge to access the host-side Ollama API from within the container environment.
- -v open-webui:/app/backend/data: Defines a named volume for persistent chat history and user settings, ensuring data survival across container lifecycles.
- –restart always: Ensures automatic container recovery upon system reboot or unexpected process termination.
Integration with Ollama and Model Management
Access port 3000 via a web browser and configure an administrator account. Data is stored locally in SQLite or PostgreSQL, ensuring no external leakage of sensitive prompts.
- Connection Verification: Validate the Ollama connection status in the settings menu via the host.docker.internal endpoint.
- Pulling Models: Download required models (e.g., llama3:8b) through the UI. The Llama 3 8B model requires approximately 4.7GB of storage.
Troubleshooting
Common operational friction points and their respective technical solutions:
- ⚠️ Port Conflict (Port 3000): If port 3000 is occupied by another service, modify the host-side port mapping (e.g., -p 3001:8080).
- ⚠️ Connection Refused: If Open WebUI cannot reach Ollama, ensure the host-side Ollama service allows external connections by setting the environment variable OLLAMA_HOST=0.0.0.0.
- ⚠️ GPU Offload Failure: Low inference speeds (1-2 tokens/s) indicate insufficient VRAM or CPU-only operation. Verify “Dedicated GPU Memory” in Task Manager. An 8B model is recommended for 8GB VRAM; 70B models require 16GB or more for stable performance.
Verification of Operational Status
Confirm container integrity and network connectivity to ensure the infrastructure is ready for inference tasks.
# Check container status
$ docker ps --filter "name=open-webui"
CONTAINER ID IMAGE COMMAND STATUS PORTS NAMES
7f8e9d0c1b2a ghcr.io/open-webui/open-webui:main "/app/backend/start.…" Up 15 minutes 0.0.0.0:3000->8080/tcp open-webui
# Check host port listening status
$ ss -tulpn | grep :3000
tcp LISTEN 0 4096 0.0.0.0:3000 0.0.0.0:* users:(("docker-proxy",pid=1234,fd=4))
# Verify connectivity to the API endpoint
$ curl -I http://localhost:3000
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 1234
Operational Notes
Building a local LLM environment provides significant security advantages, enabling the processing of confidential code and internal documents offline while eliminating subscription costs. Docker provides an abstraction layer that simplifies future hardware upgrades and migrations. In environments with 16GB+ VRAM, Llama 3 70B class models can operate at practical speeds for advanced inference tasks, fully contained within the private network.