Mirror of https://github.com/Monadical-SAS/reflector.git — synced 2026-04-22 05:05:18 +00:00
refactor: move Ollama services to docker-compose.standalone.yml
Ollama profiles (ollama-gpu, ollama-cpu) are only for Linux standalone deployment. Mac devs never use them. A separate file keeps the main compose clean and provides a natural home for future standalone services (MinIO, etc.).

Linux: docker compose -f docker-compose.yml -f docker-compose.standalone.yml --profile ollama-gpu up -d
Mac:   docker compose up -d (native Ollama, no standalone file needed)
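The platform split described in the commit message can be sketched as a small helper. This is illustrative only; the function name `pick_compose_cmd` and the `uname`/`nvidia-smi`-style detection are assumptions, not code from the repo:

```shell
#!/bin/sh
# Hypothetical helper: print the compose invocation for a given platform.
# Arguments: OS name (as from `uname -s`) and "yes"/"no" for an NVIDIA GPU.
pick_compose_cmd() {
    os="$1"; has_gpu="$2"
    case "$os" in
        Linux)
            # Linux uses the standalone file plus a profile.
            if [ "$has_gpu" = "yes" ]; then profile="ollama-gpu"; else profile="ollama-cpu"; fi
            echo "docker compose -f docker-compose.yml -f docker-compose.standalone.yml --profile $profile up -d"
            ;;
        Darwin)
            # Mac: Ollama runs natively, the standalone file is never loaded.
            echo "docker compose up -d"
            ;;
    esac
}

pick_compose_cmd Linux yes
pick_compose_cmd Darwin no
```

The helper only echoes the command, so it is safe to run anywhere to see which invocation a machine would use.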
docker-compose.standalone.yml (new file, 45 lines)
@@ -0,0 +1,45 @@
+# Standalone services for fully local deployment (no external dependencies).
+# Usage: docker compose -f docker-compose.yml -f docker-compose.standalone.yml up -d
+#
+# On Linux with NVIDIA GPU, also pass: --profile ollama-gpu
+# On Linux without GPU: --profile ollama-cpu
+# On Mac: Ollama runs natively (Metal GPU) — no profile needed, services here unused.
+
+services:
+  ollama:
+    image: ollama/ollama:latest
+    profiles: ["ollama-gpu"]
+    ports:
+      - "11434:11434"
+    volumes:
+      - ollama_data:/root/.ollama
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: all
+              capabilities: [gpu]
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
+
+  ollama-cpu:
+    image: ollama/ollama:latest
+    profiles: ["ollama-cpu"]
+    ports:
+      - "11434:11434"
+    volumes:
+      - ollama_data:/root/.ollama
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
+
+volumes:
+  ollama_data:
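The healthcheck above polls `/api/tags` every 10s with up to 5 retries; the same wait-until-ready pattern is useful from the host side before pulling models. A minimal sketch, where the `retry` helper is hypothetical (the repo's script has its own `wait_for_ollama`):

```shell
#!/bin/sh
# retry N CMD... — run CMD until it succeeds, at most N attempts, 1s apart.
retry() {
    attempts="$1"; shift
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then return 0; fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example: wait for the container's Ollama API, mirroring the compose healthcheck:
# retry 5 curl -sf http://localhost:11434/api/tags >/dev/null
```

The `curl -f` in the healthcheck and the `-sf` here both treat HTTP errors as failures, so the loop keeps retrying until the API actually answers.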
@@ -132,44 +132,6 @@ services:
       retries: 5
       start_period: 30s
 
-  ollama:
-    image: ollama/ollama:latest
-    profiles: ["ollama-gpu"]
-    ports:
-      - "11434:11434"
-    volumes:
-      - ollama_data:/root/.ollama
-    deploy:
-      resources:
-        reservations:
-          devices:
-            - driver: nvidia
-              count: all
-              capabilities: [gpu]
-    restart: unless-stopped
-    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
-
-  ollama-cpu:
-    image: ollama/ollama:latest
-    profiles: ["ollama-cpu"]
-    ports:
-      - "11434:11434"
-    volumes:
-      - ollama_data:/root/.ollama
-    restart: unless-stopped
-    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
-
-volumes:
-  ollama_data:
-
 networks:
   default:
     attachable: true
@@ -190,53 +190,31 @@ LLM_API_KEY=not-needed
 LLM_CONTEXT_WINDOW=16000
 ```
 
-### Docker Compose additions
+### Docker Compose changes
 
+**`docker-compose.yml`** — `extra_hosts` added to `server` and `hatchet-worker-llm` so containers can reach host Ollama on Mac:
 ```yaml
+hatchet-worker-llm:
+  extra_hosts:
+    - "host.docker.internal:host-gateway"
+```
+
+**`docker-compose.standalone.yml`** — Ollama services for Linux (not in main compose, only used with `-f`):
+```yaml
+# Usage: docker compose -f docker-compose.yml -f docker-compose.standalone.yml --profile ollama-gpu up -d
 services:
   ollama:
     image: ollama/ollama:latest
     profiles: ["ollama-gpu"]
-    ports:
-      - "11434:11434"
-    volumes:
-      - ollama_data:/root/.ollama
-    deploy:
-      resources:
-        reservations:
-          devices:
-            - driver: nvidia
-              count: all
-              capabilities: [gpu]
-    restart: unless-stopped
-    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
+    # ... NVIDIA GPU passthrough
 
   ollama-cpu:
     image: ollama/ollama:latest
     profiles: ["ollama-cpu"]
-    ports:
-      - "11434:11434"
-    volumes:
-      - ollama_data:/root/.ollama
-    restart: unless-stopped
-    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
-
-  hatchet-worker-llm:
-    extra_hosts:
-      - "host.docker.internal:host-gateway"
-
-volumes:
-  ollama_data:
+    # ... CPU-only fallback
 ```
 
+Mac devs never touch `docker-compose.standalone.yml` — Ollama runs natively. The standalone file is for Linux deployment and will grow to include other local-only services (e.g. MinIO for S3) as the standalone story expands.
+
 ### Known gotchas
 
 1. **OrbStack `host.docker.internal`**: OrbStack uses `host.internal` by default, but also supports `host.docker.internal` with `extra_hosts: host-gateway`.
@@ -27,7 +27,7 @@ The script is idempotent — safe to re-run at any time. It detects what's alrea
 
 **Mac**: starts Ollama natively (Metal GPU acceleration). Pulls the LLM model. Docker containers reach it via `host.docker.internal:11434`.
 
-**Linux**: starts containerized Ollama via docker-compose profile (`ollama-gpu` with NVIDIA, `ollama-cpu` without). Pulls model inside the container.
+**Linux**: starts containerized Ollama via `docker-compose.standalone.yml` profile (`ollama-gpu` with NVIDIA, `ollama-cpu` without). Pulls model inside the container.
 
 Configures `server/.env`:
 ```
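The Mac/Linux split above implies a different LLM base URL per deployment path. As a sketch: the helper name `llm_url_for` is hypothetical, only the `ollama-cpu` URL appears verbatim in the script, and the GPU-path hostname `ollama` and the Mac `/v1` suffix are inferred from the service names and docs:

```shell
#!/bin/sh
# Hypothetical mapping from deployment path to the LLM base URL that
# containers would use. Hostnames for the gpu/mac paths are assumptions.
llm_url_for() {
    case "$1" in
        mac)       echo "http://host.docker.internal:11434/v1" ;;
        linux-gpu) echo "http://ollama:11434/v1" ;;
        linux-cpu) echo "http://ollama-cpu:11434/v1" ;;
        *)         return 1 ;;
    esac
}

llm_url_for linux-cpu
```

On Linux the hostname is a compose service name (same Docker network); on Mac it is the host-gateway alias, which is why the `extra_hosts` entry matters there.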
@@ -69,8 +69,10 @@ case "$OS" in
             LLM_URL="http://ollama-cpu:$OLLAMA_PORT/v1"
         fi
 
+        COMPOSE="docker compose -f docker-compose.yml -f docker-compose.standalone.yml"
+
         echo "Starting Ollama container..."
-        docker compose --profile "$PROFILE" up -d
+        $COMPOSE --profile "$PROFILE" up -d
 
         # Determine container name
         if [ "$PROFILE" = "ollama-gpu" ]; then
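Expanding `$COMPOSE` unquoted, as the script does, relies on shell word splitting to turn the stored string back into separate argv words. A quick sketch of what `$COMPOSE --profile ... up -d` expands to:

```shell
#!/bin/sh
COMPOSE="docker compose -f docker-compose.yml -f docker-compose.standalone.yml"
# Unquoted expansion splits on whitespace into separate argv words:
set -- $COMPOSE --profile ollama-gpu up -d
echo "$# args, starting with: $1 $2"
```

This pattern is fine here because none of the words contain spaces; if the compose file paths could, a wrapper function (or a bash array) would be the more robust choice.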
@@ -82,7 +84,7 @@ case "$OS" in
         wait_for_ollama "http://localhost:$OLLAMA_PORT"
 
         echo "Pulling model $MODEL..."
-        docker compose exec "$SVC" ollama pull "$MODEL"
+        $COMPOSE exec "$SVC" ollama pull "$MODEL"
 
         echo ""
         echo "Done. Add to server/.env:"
@@ -90,7 +92,7 @@ case "$OS" in
         echo " LLM_MODEL=$MODEL"
         echo " LLM_API_KEY=not-needed"
         echo ""
-        echo "Then: docker compose --profile $PROFILE up -d"
+        echo "Then: $COMPOSE --profile $PROFILE up -d"
         ;;
 
     *)