Docker-compose for Ollama / Open WebUI / ComfyUI

Here is my short docker-compose YAML file for Ollama, Open WebUI and ComfyUI.

I use this to run AI services on my Proxmox machine.

Thanks to https://github.com/Smyshnikof for his ComfyUI efforts. Great stuff!

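You might need to tweak some parameters here, as I’m working with a custom setup: 2x GPUs and 48GB of RAM. In particular, check the GPU count values, CUDA_VISIBLE_DEVICES and OLLAMA_CONTEXT_LENGTH against your own hardware.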
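Before editing the GPU settings, it can help to confirm how many GPUs the host actually exposes. The following is a minimal sketch, assuming the NVIDIA driver and nvidia-smi are installed on the Docker host; EXPECTED_GPUS and count_gpus are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Sketch: compare the number of GPUs on the host with the count the
# compose file expects (2 in this post). EXPECTED_GPUS is a hypothetical
# variable introduced for this example, not something Compose reads.
set -eu

# Count GPU entries in `nvidia-smi -L` style output read from stdin.
# `|| true` keeps the function from failing when grep finds no match.
count_gpus() {
  grep -c '^GPU [0-9]' || true
}

if command -v nvidia-smi >/dev/null 2>&1; then
  found=$(nvidia-smi -L | count_gpus)
  expected="${EXPECTED_GPUS:-2}"
  if [ "$found" -lt "$expected" ]; then
    echo "Found $found GPU(s) but the compose file expects $expected; lower count and CUDA_VISIBLE_DEVICES." >&2
  else
    echo "Found $found GPU(s); the settings for $expected GPU(s) should fit."
  fi
else
  echo "nvidia-smi not found; skipping GPU check." >&2
fi
```

If the script reports fewer GPUs than expected, the values to adjust are the two count entries under deploy and the CUDA_VISIBLE_DEVICES list for Ollama.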

# Docker Compose stack for local AI services:
# - `ollama` serves local LLMs
# - `open-webui` provides a browser UI for Ollama
# - `comfyui` provides image-generation workflows
services:
  ollama:
    volumes:
      # Persist downloaded models and Ollama state between restarts.
      - ollama:/root/.ollama
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:${OLLAMA_DOCKER_TAG-latest}
    environment:
      # Keep models loaded briefly to reduce reload latency.
      - OLLAMA_KEEP_ALIVE=5m
      - OLLAMA_FLASH_ATTENTION=1
      # Large context window; increase only if you have enough VRAM.
      - OLLAMA_CONTEXT_LENGTH=128000
      # Limit Ollama to the listed GPU devices.
      - CUDA_VISIBLE_DEVICES=0,1
      # Keep concurrency conservative to avoid VRAM exhaustion.
      - OLLAMA_MAX_LOADED_MODELS=1
      - OLLAMA_NUM_PARALLEL=1
    ports:
      # Ollama API endpoint.
      - 11434:11434
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              # Reserve both GPUs for this container.
              count: 2
              capabilities:
                - gpu

  open-webui:
    build:
      context: .
      args:
        OLLAMA_BASE_URL: "/ollama"
      dockerfile: Dockerfile
    image: ghcr.io/open-webui/open-webui:${WEBUI_DOCKER_TAG-main}
    container_name: open-webui
    pull_policy: always
    volumes:
      # Persist Open WebUI data (users, settings, chats).
      - open-webui:/app/backend/data
      # Mount TLS certificate/key from the host.
      - /etc/ssl/openui:/ssl
    depends_on:
      - ollama
    ports:
      # Host port is configurable with OPEN_WEBUI_PORT; container listens on 8080.
      - ${OPEN_WEBUI_PORT-3001}:8080
    environment:
      - "OLLAMA_BASE_URL=http://ollama:11434"
      - "WEBUI_URL=https://openui.local"
      # Set this to a strong non-empty value in production.
      - "WEBUI_SECRET_KEY="
      - "WEBUI_SSL_CERT=/ssl/openui.crt"
      - "WEBUI_SSL_KEY=/ssl/openui.key"
      # Wide-open CORS is convenient locally but risky outside trusted networks.
      - "CORS_ALLOW_ORIGIN=*"
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

  comfyui:
    # Prebuilt ComfyUI image with CUDA 12.8 / Torch 2.8 support.
    image: smyshnikof/comfyui:full-torch2.8.0-cu128
    container_name: comfyui
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      #- INSTALL_SAGEATTENTION=True #sageattention is a custom node for comfyui that allows you to use the sage attention mechanism
    ports:
      # Main ComfyUI web interface.
      - "8188:3000"
      # Additional forwarded ports for custom nodes or side services.
      - "8081:8081"
      - "8082:8082"
      - "8083:8083"
      - "8888:8888"
    volumes:
      # Bind local model, output, and user data directories into the container.
      - ./comfyui/models:/workspace/ComfyUI/models
      - ./comfyui/output:/workspace/ComfyUI/output
      - ./comfyui/user:/workspace/ComfyUI/user/default
      # Preset file used by the Preset Download Manager custom node.
      - ./comfyui/preset-download-manager-presets.json:/workspace/ComfyUI/custom_nodes/ComfyUI-PresetDownloadManager/presets.json
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              # Reserve both GPUs for ComfyUI workloads.
              count: 2
              capabilities:
                - gpu
    restart: unless-stopped

volumes:
  # Named volumes keep app data across container recreation.
  ollama: {}
  open-webui: {}
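The file reads OLLAMA_DOCKER_TAG, WEBUI_DOCKER_TAG and OPEN_WEBUI_PORT from the environment, and Compose also picks these up from a .env file next to docker-compose.yaml. Here is a rough sketch of generating such a file with the same defaults plus a random secret. The function names and the ENV_FILE variable are hypothetical. Note one assumption: the generated WEBUI_SECRET_KEY only takes effect if you change the compose line to "WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}", since the file as posted hard-codes an empty value.

```shell
#!/bin/sh
# Sketch: write a .env file for the compose stack above.
# write_env_file, generate_secret and ENV_FILE are illustrative names.
set -eu

# Print 32 random bytes as 64 lowercase hex characters.
generate_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Write the variables the compose file interpolates. An existing file is
# left untouched so a previously generated secret is not replaced.
write_env_file() {
  target="$1"
  if [ -e "$target" ]; then
    echo "$target already exists; leaving it untouched." >&2
    return 0
  fi
  cat > "$target" <<EOF
OLLAMA_DOCKER_TAG=latest
WEBUI_DOCKER_TAG=main
OPEN_WEBUI_PORT=3001
WEBUI_SECRET_KEY=$(generate_secret)
EOF
}

write_env_file "${ENV_FILE:-.env}"
```

Run it once in the directory that holds docker-compose.yaml, then start the stack with docker compose up -d.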

Make sure to create the folders for ComfyUI in the directory where the docker-compose.yaml file lives:

mkdir -p comfyui/models comfyui/output comfyui/user comfyui/flows
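The compose file also bind-mounts a single file, ./comfyui/preset-download-manager-presets.json. If that path is missing on the host when the container starts, Docker typically creates a directory there instead, and the mount will not be the file the custom node expects. Below is a minimal sketch that makes sure a file exists first. The ensure_preset_file name and COMPOSE_DIR variable are hypothetical, and the {} content is only a placeholder assumption: the real preset format is not described in this post, so replace it with your own presets.

```shell
#!/bin/sh
# Sketch: make sure the preset path is a regular file before starting
# the stack. ensure_preset_file and COMPOSE_DIR are illustrative names.
set -eu

ensure_preset_file() {
  preset="$1/comfyui/preset-download-manager-presets.json"
  mkdir -p "$1/comfyui"
  # A directory here usually means Docker created it on an earlier run.
  if [ -d "$preset" ]; then
    echo "$preset is a directory; remove it and rerun." >&2
    return 1
  fi
  # Placeholder content only (assumption); an existing file is kept as-is.
  if [ ! -e "$preset" ]; then
    printf '{}\n' > "$preset"
  fi
}

ensure_preset_file "${COMPOSE_DIR:-.}"
```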
