mirror of
https://github.com/Monadical-SAS/reflector.git
synced 2026-03-22 07:06:47 +00:00
refactor: move Ollama services to docker-compose.standalone.yml
Ollama profiles (ollama-gpu, ollama-cpu) are only for Linux standalone deployment. Mac devs never use them. Separate file keeps the main compose clean and provides a natural home for future standalone services (MinIO, etc.). Linux: docker compose -f docker-compose.yml -f docker-compose.standalone.yml --profile ollama-gpu up -d Mac: docker compose up -d (native Ollama, no standalone file needed)
This commit is contained in:
@@ -190,53 +190,31 @@ LLM_API_KEY=not-needed
|
||||
LLM_CONTEXT_WINDOW=16000
|
||||
```
|
||||
|
||||
### Docker Compose additions
|
||||
### Docker Compose changes
|
||||
|
||||
**`docker-compose.yml`** — `extra_hosts` added to `server` and `hatchet-worker-llm` so containers can reach host Ollama on Mac:
|
||||
```yaml
|
||||
hatchet-worker-llm:
|
||||
extra_hosts:
|
||||
- "host.docker.internal:host-gateway"
|
||||
```
|
||||
|
||||
**`docker-compose.standalone.yml`** — Ollama services for Linux (not in main compose, only used with `-f`):
|
||||
```yaml
|
||||
# Usage: docker compose -f docker-compose.yml -f docker-compose.standalone.yml --profile ollama-gpu up -d
|
||||
services:
|
||||
ollama:
|
||||
image: ollama/ollama:latest
|
||||
profiles: ["ollama-gpu"]
|
||||
ports:
|
||||
- "11434:11434"
|
||||
volumes:
|
||||
- ollama_data:/root/.ollama
|
||||
deploy:
|
||||
resources:
|
||||
reservations:
|
||||
devices:
|
||||
- driver: nvidia
|
||||
count: all
|
||||
capabilities: [gpu]
|
||||
restart: unless-stopped
|
||||
healthcheck:
|
||||
test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
|
||||
interval: 10s
|
||||
timeout: 5s
|
||||
retries: 5
|
||||
|
||||
# ... NVIDIA GPU passthrough
|
||||
ollama-cpu:
|
||||
image: ollama/ollama:latest
|
||||
profiles: ["ollama-cpu"]
|
||||
ports:
|
||||
- "11434:11434"
|
||||
volumes:
|
||||
- ollama_data:/root/.ollama
|
||||
restart: unless-stopped
|
||||
healthcheck:
|
||||
test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
|
||||
interval: 10s
|
||||
timeout: 5s
|
||||
retries: 5
|
||||
|
||||
hatchet-worker-llm:
|
||||
extra_hosts:
|
||||
- "host.docker.internal:host-gateway"
|
||||
|
||||
volumes:
|
||||
ollama_data:
|
||||
# ... CPU-only fallback
|
||||
```
|
||||
|
||||
Mac devs never touch `docker-compose.standalone.yml` — Ollama runs natively. The standalone file is for Linux deployment and will grow to include other local-only services (e.g. MinIO for S3) as the standalone story expands.
|
||||
|
||||
### Known gotchas
|
||||
|
||||
1. **OrbStack `host.docker.internal`**: OrbStack uses `host.internal` by default, but also supports `host.docker.internal` with `extra_hosts: host-gateway`.
|
||||
|
||||
@@ -27,7 +27,7 @@ The script is idempotent — safe to re-run at any time. It detects what's alrea
|
||||
|
||||
**Mac**: starts Ollama natively (Metal GPU acceleration). Pulls the LLM model. Docker containers reach it via `host.docker.internal:11434`.
|
||||
|
||||
**Linux**: starts containerized Ollama via docker-compose profile (`ollama-gpu` with NVIDIA, `ollama-cpu` without). Pulls model inside the container.
|
||||
**Linux**: starts containerized Ollama via `docker-compose.standalone.yml` profile (`ollama-gpu` with NVIDIA, `ollama-cpu` without). Pulls model inside the container.
|
||||
|
||||
Configures `server/.env`:
|
||||
```
|
||||
|
||||
Reference in New Issue
Block a user