Here is my short Ollama / Open WebUI / ComfyUI docker-compose yaml file.
I use this to run AI services on my Proxmox machine.
Thanks to https://github.com/Smyshnikof for his ComfyUI efforts. Great stuff!
You might need to tweak some parameters here, since I'm working with a custom setup: 2x GPUs and 48GB of RAM.
```yaml
# Docker Compose stack for local AI services:
# - `ollama` serves local LLMs
# - `open-webui` provides a browser UI for Ollama
# - `comfyui` provides image-generation workflows
services:
  ollama:
    volumes:
      # Persist downloaded models and Ollama state between restarts.
      - ollama:/root/.ollama
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:${OLLAMA_DOCKER_TAG-latest}
    environment:
      # Keep models loaded briefly to reduce reload latency.
      - OLLAMA_KEEP_ALIVE=5m
      - OLLAMA_FLASH_ATTENTION=1
      # Large context window; increase only if you have enough VRAM.
      - OLLAMA_CONTEXT_LENGTH=128000
      # Limit Ollama to the listed GPU devices.
      - CUDA_VISIBLE_DEVICES=0,1
      # Keep concurrency conservative to avoid VRAM exhaustion.
      - OLLAMA_MAX_LOADED_MODELS=1
      - OLLAMA_NUM_PARALLEL=1
    ports:
      # Ollama API endpoint.
      - 11434:11434
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              # Reserve both GPUs for this container.
              count: 2
              capabilities:
                - gpu

  open-webui:
    build:
      context: .
      args:
        OLLAMA_BASE_URL: "/ollama"
      dockerfile: Dockerfile
    image: ghcr.io/open-webui/open-webui:${WEBUI_DOCKER_TAG-main}
    container_name: open-webui
    pull_policy: always
    volumes:
      # Persist Open WebUI data (users, settings, chats).
      - open-webui:/app/backend/data
      # Mount TLS certificate/key from the host.
      - /etc/ssl/openui:/ssl
    depends_on:
      - ollama
    ports:
      # Host port is configurable with OPEN_WEBUI_PORT; container listens on 8080.
      - ${OPEN_WEBUI_PORT-3001}:8080
    environment:
      - "OLLAMA_BASE_URL=http://ollama:11434"
      - "WEBUI_URL=https://openui.local"
      # Set this to a strong non-empty value in production.
      - "WEBUI_SECRET_KEY="
      - "WEBUI_SSL_CERT=/ssl/openui.crt"
      - "WEBUI_SSL_KEY=/ssl/openui.key"
      # Wide-open CORS is convenient locally but risky outside trusted networks.
      - "CORS_ALLOW_ORIGIN=*"
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

  comfyui:
    # Prebuilt ComfyUI image with CUDA 12.8 / Torch 2.8 support.
    image: smyshnikof/comfyui:full-torch2.8.0-cu128
    container_name: comfyui
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      #- INSTALL_SAGEATTENTION=True #sageattention is a custom node for comfyui that allows you to use the sage attention mechanism
    ports:
      # Main ComfyUI web interface.
      - "8188:3000"
      # Additional forwarded ports for custom nodes or side services.
      - "8081:8081"
      - "8082:8082"
      - "8083:8083"
      - "8888:8888"
    volumes:
      # Bind local model, output, and user data directories into the container.
      - ./comfyui/models:/workspace/ComfyUI/models
      - ./comfyui/output:/workspace/ComfyUI/output
      - ./comfyui/user:/workspace/ComfyUI/user/default
      # Preset file used by the Preset Download Manager custom node.
      - ./comfyui/preset-download-manager-presets.json:/workspace/ComfyUI/custom_nodes/ComfyUI-PresetDownloadManager/presets.json
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              # Reserve both GPUs for ComfyUI workloads.
              count: 2
              capabilities:
                - gpu
    restart: unless-stopped

volumes:
  # Named volumes keep app data across container recreation.
  ollama: {}
  open-webui: {}
```
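The compose file reads three optional variables (`OLLAMA_DOCKER_TAG`, `WEBUI_DOCKER_TAG`, `OPEN_WEBUI_PORT`) and falls back to defaults when they are unset. Here is a minimal sketch of a `.env` file that pins them explicitly, relying on Compose reading `.env` from the project directory. The `WEBUI_SECRET_KEY` entry is an assumption for illustration only: the file as posted hardcodes an empty value, so the key would only be picked up if that line were changed to `"WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}"`.

```shell
# Sketch: write a .env file next to docker-compose.yaml.
# Note: this overwrites any existing .env in the current directory.

# 32 random bytes rendered as 64 hex characters.
SECRET_KEY="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"

# The first three values mirror the defaults already in the compose file.
# WEBUI_SECRET_KEY is hypothetical: the posted compose file does not read it.
cat > .env <<EOF
OLLAMA_DOCKER_TAG=latest
WEBUI_DOCKER_TAG=main
OPEN_WEBUI_PORT=3001
WEBUI_SECRET_KEY=${SECRET_KEY}
EOF

# Optional: validate the interpolated configuration if Docker is available.
if command -v docker >/dev/null 2>&1 && [ -f docker-compose.yaml ]; then
  docker compose config --quiet && echo "compose file OK"
fi
```

Pinning the tags to specific versions instead of `latest` and `main` makes `pull_policy: always` less likely to surprise you with an upgrade on restart.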
Make sure to create the folders for ComfyUI in the directory where the docker-compose.yaml file lives:
```shell
mkdir -p comfyui/models comfyui/output comfyui/user comfyui/flows
```
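One more thing to watch: the compose file bind-mounts `./comfyui/preset-download-manager-presets.json` as a single file. If that path does not exist on the host when the stack starts, Docker creates a directory there instead of a file. A small sketch that puts a placeholder in place first; the `{}` content is an assumption, since the presets format the custom node expects is not covered here, so swap in your real presets file.

```shell
# Run from the directory that contains docker-compose.yaml.
PRESETS=comfyui/preset-download-manager-presets.json
mkdir -p comfyui

# Create a placeholder only if nothing exists at that path yet.
# The "{}" content is an assumption, not a documented default.
if [ ! -e "$PRESETS" ]; then
  echo '{}' > "$PRESETS"
fi

# Warn if an earlier `docker compose up` already left a directory there.
if [ -d "$PRESETS" ]; then
  echo "warning: $PRESETS is a directory; remove it and re-run" >&2
fi
```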