mirror of
https://github.com/Monadical-SAS/reflector.git
synced 2026-03-30 19:06:46 +00:00
feat: local LLM via Ollama + structured output response_format
- Add setup script (scripts/setup-local-llm.sh) for one-command Ollama setup Mac: native Metal GPU, Linux: containerized via docker-compose profiles - Add ollama-gpu and ollama-cpu docker-compose profiles for Linux - Add extra_hosts to server/hatchet-worker-llm for host.docker.internal - Pass response_format JSON schema in StructuredOutputWorkflow.extract() enabling grammar-based constrained decoding on Ollama/llama.cpp/vLLM/OpenAI - Update .env.example with Ollama as default LLM option - Add Ollama PRD and local dev setup docs
This commit is contained in:
@@ -66,15 +66,22 @@ TRANSLATE_URL=https://monadical-sas--reflector-translator-web.modal.run
|
||||
## LLM backend (Required)
|
||||
##
|
||||
## Responsible for generating titles, summaries, and topic detection
|
||||
## Requires OpenAI API key
|
||||
## Supports any OpenAI-compatible endpoint.
|
||||
## =======================================================
|
||||
|
||||
## OpenAI API key - get from https://platform.openai.com/account/api-keys
|
||||
LLM_API_KEY=sk-your-openai-api-key
|
||||
LLM_MODEL=gpt-4o-mini
|
||||
## --- Option A: Local LLM via Ollama (recommended for dev) ---
|
||||
## Setup: ./scripts/setup-local-llm.sh
|
||||
## Mac: Ollama runs natively (Metal GPU). Containers reach it via host.docker.internal.
|
||||
## Linux: docker compose --profile ollama-gpu up -d (or ollama-cpu for no GPU)
|
||||
LLM_URL=http://host.docker.internal:11434/v1
|
||||
LLM_MODEL=qwen2.5:14b
|
||||
LLM_API_KEY=not-needed
|
||||
## Linux with containerized Ollama: LLM_URL=http://ollama:11434/v1
|
||||
|
||||
## Optional: Custom endpoint (defaults to OpenAI)
|
||||
# LLM_URL=https://api.openai.com/v1
|
||||
## --- Option B: Remote/cloud LLM ---
|
||||
#LLM_API_KEY=sk-your-openai-api-key
|
||||
#LLM_MODEL=gpt-4o-mini
|
||||
## LLM_URL defaults to OpenAI when unset
|
||||
|
||||
## Context size for summary generation (tokens)
|
||||
LLM_CONTEXT_WINDOW=16000
|
||||
|
||||
Reference in New Issue
Block a user