Deploying Ollama in Docker gives you a portable, reproducible, production-ready local LLM environment. This step-by-step guide covers GPU passthrough, Docker Compose with Open WebUI, health checks, and production hardening.
- 1 command: deploy the full stack with Compose
- GPU: full NVIDIA passthrough
- Portable: same config everywhere
Step 1 — Prerequisites
1. Install Docker Desktop. Download from docker.com and ensure Docker Compose v2 is included (it is by default in modern versions).
2. NVIDIA only: install the NVIDIA Container Toolkit. Required for GPU passthrough inside containers; skip entirely if running CPU-only.
3. Allocate Docker resources. In Docker Desktop settings, set at least 16 GB of RAM and 6+ CPU cores for good performance.
Shell — NVIDIA Container Toolkit (Ubuntu)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-ct.gpg
curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-ct.gpg] https://#' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
# Verify GPU access from inside a container
docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

Step 2 — Quick Docker Run
Shell — Basic Start
# CPU only
docker run -d -p 11434:11434 \
  -v ollama:/root/.ollama --name ollama ollama/ollama

# With NVIDIA GPU (recommended)
docker run -d -p 11434:11434 --gpus=all \
  -v ollama:/root/.ollama --name ollama ollama/ollama

# Pull a model inside the running container
docker exec -it ollama ollama pull llama3.1
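The container returns from `docker run` before the API is ready, so a pull issued immediately can fail. A minimal readiness helper, assuming `curl` is installed on the host (the function name `wait_for_ollama` is our own, not an Ollama command):

```shell
# wait_for_ollama URL [TIMEOUT_SECONDS]
# Polls /api/tags every 2s until the Ollama server answers or the timeout elapses.
wait_for_ollama() {
  url="$1"
  timeout="${2:-60}"
  elapsed=0
  until curl -fsS "$url/api/tags" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Ollama did not become ready within ${timeout}s" >&2
      return 1
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "Ollama is ready at $url"
}

# Usage once the container has started:
# wait_for_ollama http://localhost:11434 90 && docker exec -it ollama ollama pull llama3.1
```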
Step 3 — Full Docker Compose Stack
docker-compose.yml
version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    environment:
      - OLLAMA_MAX_LOADED_MODELS=2
      - OLLAMA_NUM_PARALLEL=4
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
      interval: 30s
      timeout: 10s
      retries: 3

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama_data:
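On a host without an NVIDIA GPU, the `deploy` device reservation above will make `docker compose up` fail. One way to keep a single portable base file is to move the GPU reservation into an optional overlay; a sketch, with the filename `docker-compose.gpu.yml` being our own choice:

```yaml
# docker-compose.gpu.yml — apply only on GPU hosts with:
#   docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

With this overlay in place, the `deploy` block can be removed from the base `docker-compose.yml`, and CPU-only machines simply run `docker compose up -d` against the base file alone.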
Shell — Start Stack
docker compose up -d
docker compose exec ollama ollama pull llama3.1
# Open WebUI at http://localhost:3000
docker compose logs -f ollama

Production Security
Add an nginx reverse proxy as a third Compose service to handle HTTPS and authentication. Never expose port 11434 directly on public or shared networks — Ollama has no built-in authentication.
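One possible shape for that proxy service, as a sketch rather than a hardened config: the server name, certificate paths, and htpasswd file below are all placeholders you must supply yourself.

```nginx
server {
    listen 443 ssl;
    server_name ollama.example.com;                       # placeholder

    ssl_certificate     /etc/nginx/certs/fullchain.pem;   # placeholder paths
    ssl_certificate_key /etc/nginx/certs/privkey.pem;

    auth_basic           "Ollama";
    auth_basic_user_file /etc/nginx/.htpasswd;            # create with htpasswd

    location / {
        proxy_pass http://ollama:11434;                   # Compose service name
        proxy_http_version 1.1;
        proxy_read_timeout 300s;                          # long generations stream slowly
    }
}
```

Mount this file into an `nginx:alpine` service in the same Compose network, publish only 443, and drop the `11434:11434` port mapping from the ollama service so the API is reachable only through the proxy.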