mirror of
https://github.com/Monadical-SAS/reflector.git
synced 2026-03-21 22:56:47 +00:00
feat: add custom S3 endpoint support + Garage standalone storage
Add TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL setting to enable S3-compatible backends (Garage, MinIO). When set, uses path-style addressing and routes all requests to the custom endpoint. When unset, AWS behavior is unchanged. - AwsStorage: accept aws_endpoint_url, pass to all 6 session.client() calls, configure path-style addressing and base_url - Fix 4 direct AwsStorage constructions in Hatchet workflows to pass endpoint_url (would have silently targeted wrong endpoint) - Standalone: add Garage service to docker-compose.standalone.yml, setup script initializes layout/bucket/key and writes credentials - Fix compose_cmd() bug: Mac path was missing standalone yml - garage.toml template with runtime secret generation via openssl
This commit is contained in:
@@ -6,6 +6,23 @@
|
||||
# On Mac: Ollama runs natively (Metal GPU) — no profile needed, services here unused.
|
||||
|
||||
services:
|
||||
garage:
|
||||
image: dxflrs/garage:v1.1.0
|
||||
ports:
|
||||
- "3900:3900" # S3 API
|
||||
- "3903:3903" # Admin API
|
||||
volumes:
|
||||
- garage_data:/var/lib/garage/data
|
||||
- garage_meta:/var/lib/garage/meta
|
||||
- ./data/garage.toml:/etc/garage.toml:ro
|
||||
restart: unless-stopped
|
||||
healthcheck:
|
||||
test: ["CMD", "curl", "-sf", "http://localhost:3903/health"]
|
||||
interval: 10s
|
||||
timeout: 5s
|
||||
retries: 5
|
||||
start_period: 5s
|
||||
|
||||
ollama:
|
||||
image: ollama/ollama:latest
|
||||
profiles: ["ollama-gpu"]
|
||||
@@ -42,4 +59,6 @@ services:
|
||||
retries: 5
|
||||
|
||||
volumes:
|
||||
garage_data:
|
||||
garage_meta:
|
||||
ollama_data:
|
||||
|
||||
@@ -59,20 +59,28 @@ Generates `server/.env` and `www/.env.local` with standalone defaults:
|
||||
|
||||
If env files already exist, the script only updates LLM vars — it won't overwrite your customizations.
|
||||
|
||||
### 3. Transcript storage (skip for standalone)
|
||||
### 3. Object storage (Garage)
|
||||
|
||||
Production uses AWS S3 to persist processed audio. **Not needed for standalone live/WebRTC mode.**
|
||||
Standalone uses [Garage](https://garagehq.deuxfleurs.fr/) — a lightweight S3-compatible object store running in Docker. The setup script starts Garage, initializes the layout, creates a bucket and access key, and writes the credentials to `server/.env`.
|
||||
|
||||
When `TRANSCRIPT_STORAGE_BACKEND` is unset (the default):
|
||||
- Audio stays on local disk at `DATA_DIR/{transcript_id}/audio.mp3`
|
||||
- The live pipeline skips the S3 upload step gracefully
|
||||
- Audio playback endpoint serves directly from disk
|
||||
- Post-processing (LLM summary, topics, title) works entirely from DB text
|
||||
- Diarization (speaker ID) is skipped — already disabled in standalone config (`DIARIZATION_ENABLED=false`)
|
||||
**`server/.env`** — storage settings added by the script:
|
||||
|
||||
> **Future**: if file upload or audio persistence across restarts is needed, implement a filesystem storage backend (`storage_local.py`) using the existing `Storage` plugin architecture in `reflector/storage/base.py`. No MinIO required.
|
||||
| Variable | Value | Why |
|
||||
|----------|-------|-----|
|
||||
| `TRANSCRIPT_STORAGE_BACKEND` | `aws` | Uses the S3-compatible storage driver |
|
||||
| `TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL` | `http://garage:3900` | Docker-internal Garage S3 API |
|
||||
| `TRANSCRIPT_STORAGE_AWS_BUCKET_NAME` | `reflector-media` | Created by the script |
|
||||
| `TRANSCRIPT_STORAGE_AWS_REGION` | `garage` | Must match Garage config |
|
||||
| `TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID` | *(auto-generated)* | Created by `garage key create` |
|
||||
| `TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY` | *(auto-generated)* | Created by `garage key create` |
|
||||
|
||||
### 4. Transcription and diarization
|
||||
The `TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL` setting enables S3-compatible backends. When set, the storage driver uses path-style addressing and routes all requests to the custom endpoint. When unset (production AWS), behavior is unchanged.
|
||||
|
||||
Garage config lives at `scripts/garage.toml` (mounted read-only into the container). Single-node, `replication_factor=1`.
|
||||
|
||||
> **Note**: Presigned URLs embed the Garage Docker hostname (`http://garage:3900`). This is fine — the server proxies S3 responses to the browser. Modal GPU workers cannot reach internal Garage, but standalone doesn't use Modal.
|
||||
|
||||
### 4. Transcription and diarization (NOT YET IMPLEMENTED)
|
||||
|
||||
Standalone uses `TRANSCRIPT_BACKEND=whisper` for local CPU-based transcription. Diarization is disabled.
|
||||
|
||||
@@ -81,10 +89,10 @@ Standalone uses `TRANSCRIPT_BACKEND=whisper` for local CPU-based transcription.
|
||||
### 5. Docker services
|
||||
|
||||
```bash
|
||||
docker compose up -d postgres redis server worker beat web
|
||||
docker compose up -d postgres redis garage server worker beat web
|
||||
```
|
||||
|
||||
All services start in a single command. No Hatchet in standalone mode — LLM processing (summaries, topics, titles) runs via Celery tasks.
|
||||
All services start in a single command. Garage is already started by step 3 but is included for idempotency. No Hatchet in standalone mode — LLM processing (summaries, topics, titles) runs via Celery tasks.
|
||||
|
||||
### 6. Database migrations
|
||||
|
||||
@@ -105,6 +113,7 @@ Verifies:
|
||||
| `web` | 3000 | Next.js frontend |
|
||||
| `postgres` | 5432 | PostgreSQL database |
|
||||
| `redis` | 6379 | Cache + Celery broker |
|
||||
| `garage` | 3900, 3903 | S3-compatible object storage (S3 API + admin API) |
|
||||
| `worker` | — | Celery worker (live pipeline post-processing) |
|
||||
| `beat` | — | Celery beat (scheduled tasks) |
|
||||
|
||||
@@ -121,7 +130,7 @@ These require external accounts and infrastructure that can't be scripted:
|
||||
|
||||
- Step 1 (Ollama/LLM) — implemented
|
||||
- Step 2 (environment files) — implemented
|
||||
- Step 3 (transcript storage) — resolved: skip for live-only mode
|
||||
- Step 3 (object storage / Garage) — implemented (`docker-compose.standalone.yml` + `setup-standalone.sh`)
|
||||
- Step 4 (transcription/diarization) — in progress by another developer
|
||||
- Steps 5-7 (Docker, migrations, health) — implemented
|
||||
- **Unified script**: `scripts/setup-standalone.sh`
|
||||
|
||||
18
scripts/garage.toml
Normal file
18
scripts/garage.toml
Normal file
@@ -0,0 +1,18 @@
|
||||
metadata_dir = "/var/lib/garage/meta"
|
||||
data_dir = "/var/lib/garage/data"
|
||||
replication_factor = 1
|
||||
|
||||
[s3_api]
|
||||
api_bind_addr = "[::]:3900"
|
||||
s3_region = "garage"
|
||||
root_domain = ".s3.garage.localhost"
|
||||
|
||||
[s3_web]
|
||||
bind_addr = "[::]:3902"
|
||||
|
||||
[admin]
|
||||
api_bind_addr = "[::]:3903"
|
||||
|
||||
[rpc]
|
||||
bind_addr = "[::]:3901"
|
||||
secret_transmitter = "__GARAGE_RPC_SECRET__"
|
||||
@@ -69,13 +69,11 @@ env_set() {
|
||||
}
|
||||
|
||||
compose_cmd() {
|
||||
local compose_files="-f $ROOT_DIR/docker-compose.yml -f $ROOT_DIR/docker-compose.standalone.yml"
|
||||
if [[ "$OS" == "Linux" ]] && [[ -n "${OLLAMA_PROFILE:-}" ]]; then
|
||||
docker compose -f "$ROOT_DIR/docker-compose.yml" \
|
||||
-f "$ROOT_DIR/docker-compose.standalone.yml" \
|
||||
--profile "$OLLAMA_PROFILE" \
|
||||
"$@"
|
||||
docker compose $compose_files --profile "$OLLAMA_PROFILE" "$@"
|
||||
else
|
||||
docker compose -f "$ROOT_DIR/docker-compose.yml" "$@"
|
||||
docker compose $compose_files "$@"
|
||||
fi
|
||||
}
|
||||
|
||||
@@ -177,8 +175,7 @@ AUTH_BACKEND=none
|
||||
# --- Transcription (local whisper) ---
|
||||
TRANSCRIPT_BACKEND=whisper
|
||||
|
||||
# --- Storage (local disk, no S3) ---
|
||||
# TRANSCRIPT_STORAGE_BACKEND is intentionally unset — audio stays on local disk
|
||||
# --- Storage (set by step_storage, Garage S3-compatible) ---
|
||||
|
||||
# --- Diarization (disabled, no backend available) ---
|
||||
DIARIZATION_ENABLED=false
|
||||
@@ -201,10 +198,67 @@ ENVEOF
|
||||
}
|
||||
|
||||
# =========================================================
|
||||
# Step 3: Generate www/.env.local
|
||||
# Step 3: Object storage (Garage)
|
||||
# =========================================================
|
||||
step_storage() {
|
||||
info "Step 3: Object storage (Garage)"
|
||||
|
||||
# Generate garage.toml from template (fill in RPC secret)
|
||||
GARAGE_TOML="$ROOT_DIR/scripts/garage.toml"
|
||||
GARAGE_TOML_RUNTIME="$ROOT_DIR/data/garage.toml"
|
||||
if [[ ! -f "$GARAGE_TOML_RUNTIME" ]]; then
|
||||
mkdir -p "$ROOT_DIR/data"
|
||||
RPC_SECRET=$(openssl rand -hex 32)
|
||||
sed "s|__GARAGE_RPC_SECRET__|${RPC_SECRET}|" "$GARAGE_TOML" > "$GARAGE_TOML_RUNTIME"
|
||||
fi
|
||||
|
||||
compose_cmd up -d garage
|
||||
|
||||
wait_for_url "http://localhost:3903/health" "Garage admin API"
|
||||
echo ""
|
||||
|
||||
# Layout: get node ID, assign, apply (skip if already applied)
|
||||
NODE_ID=$(compose_cmd exec -T garage /garage node id -q 2>/dev/null | tr -d '[:space:]')
|
||||
LAYOUT_STATUS=$(compose_cmd exec -T garage /garage layout show 2>&1 || true)
|
||||
if echo "$LAYOUT_STATUS" | grep -q "No nodes"; then
|
||||
compose_cmd exec -T garage /garage layout assign "$NODE_ID" -c 1G -z dc1
|
||||
compose_cmd exec -T garage /garage layout apply --version 1
|
||||
fi
|
||||
|
||||
# Create bucket (idempotent — skip if exists)
|
||||
if ! compose_cmd exec -T garage /garage bucket info reflector-media &>/dev/null; then
|
||||
compose_cmd exec -T garage /garage bucket create reflector-media
|
||||
fi
|
||||
|
||||
# Create key (idempotent — skip if exists)
|
||||
KEY_OUTPUT=$(compose_cmd exec -T garage /garage key info --name reflector 2>&1 || true)
|
||||
if echo "$KEY_OUTPUT" | grep -q "not found"; then
|
||||
KEY_OUTPUT=$(compose_cmd exec -T garage /garage key create --name reflector)
|
||||
fi
|
||||
|
||||
# Parse key ID and secret from output
|
||||
KEY_ID=$(echo "$KEY_OUTPUT" | grep -i "key id" | awk '{print $NF}')
|
||||
KEY_SECRET=$(echo "$KEY_OUTPUT" | grep -i "secret" | awk '{print $NF}')
|
||||
|
||||
# Grant bucket permissions (idempotent)
|
||||
compose_cmd exec -T garage /garage bucket allow reflector-media --read --write --key reflector
|
||||
|
||||
# Set env vars
|
||||
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_BACKEND" "aws"
|
||||
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL" "http://garage:3900"
|
||||
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_BUCKET_NAME" "reflector-media"
|
||||
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_REGION" "garage"
|
||||
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID" "$KEY_ID"
|
||||
env_set "$SERVER_ENV" "TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY" "$KEY_SECRET"
|
||||
|
||||
ok "Object storage ready (Garage)"
|
||||
}
|
||||
|
||||
# =========================================================
|
||||
# Step 4: Generate www/.env.local
|
||||
# =========================================================
|
||||
step_www_env() {
|
||||
info "Step 3: Generating www/.env.local"
|
||||
info "Step 4: Generating www/.env.local"
|
||||
|
||||
if [[ -f "$WWW_ENV" ]]; then
|
||||
ok "www/.env.local already exists — skipping"
|
||||
@@ -232,22 +286,22 @@ ENVEOF
|
||||
}
|
||||
|
||||
# =========================================================
|
||||
# Step 4: Start all services
|
||||
# Step 5: Start all services
|
||||
# =========================================================
|
||||
step_services() {
|
||||
info "Step 4: Starting Docker services"
|
||||
info "Step 5: Starting Docker services"
|
||||
|
||||
# server runs alembic migrations on startup automatically (see runserver.sh)
|
||||
compose_cmd up -d postgres redis server worker beat web
|
||||
compose_cmd up -d postgres redis garage server worker beat web
|
||||
ok "Containers started"
|
||||
info "Server is running migrations (alembic upgrade head)..."
|
||||
}
|
||||
|
||||
# =========================================================
|
||||
# Step 5: Health checks
|
||||
# Step 6: Health checks
|
||||
# =========================================================
|
||||
step_health() {
|
||||
info "Step 5: Health checks"
|
||||
info "Step 6: Health checks"
|
||||
|
||||
wait_for_url "http://localhost:1250/health" "Server API" 60 3
|
||||
echo ""
|
||||
@@ -292,6 +346,8 @@ main() {
|
||||
echo ""
|
||||
step_server_env
|
||||
echo ""
|
||||
step_storage
|
||||
echo ""
|
||||
step_www_env
|
||||
echo ""
|
||||
step_services
|
||||
|
||||
@@ -171,11 +171,13 @@ async def set_workflow_error_status(transcript_id: NonEmptyString) -> bool:
|
||||
|
||||
def _spawn_storage():
|
||||
"""Create fresh storage instance."""
|
||||
# TODO: replace direct AwsStorage construction with get_transcripts_storage() factory
|
||||
return AwsStorage(
|
||||
aws_bucket_name=settings.TRANSCRIPT_STORAGE_AWS_BUCKET_NAME,
|
||||
aws_region=settings.TRANSCRIPT_STORAGE_AWS_REGION,
|
||||
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
|
||||
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
|
||||
aws_endpoint_url=settings.TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL,
|
||||
)
|
||||
|
||||
|
||||
|
||||
@@ -49,11 +49,13 @@ async def pad_track(input: PaddingInput, ctx: Context) -> PadTrackResult:
|
||||
from reflector.settings import settings # noqa: PLC0415
|
||||
from reflector.storage.storage_aws import AwsStorage # noqa: PLC0415
|
||||
|
||||
# TODO: replace direct AwsStorage construction with get_transcripts_storage() factory
|
||||
storage = AwsStorage(
|
||||
aws_bucket_name=settings.TRANSCRIPT_STORAGE_AWS_BUCKET_NAME,
|
||||
aws_region=settings.TRANSCRIPT_STORAGE_AWS_REGION,
|
||||
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
|
||||
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
|
||||
aws_endpoint_url=settings.TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL,
|
||||
)
|
||||
|
||||
source_url = await storage.get_file_url(
|
||||
|
||||
@@ -60,6 +60,7 @@ async def pad_track(input: TrackInput, ctx: Context) -> PadTrackResult:
|
||||
|
||||
try:
|
||||
# Create fresh storage instance to avoid aioboto3 fork issues
|
||||
# TODO: replace direct AwsStorage construction with get_transcripts_storage() factory
|
||||
from reflector.settings import settings # noqa: PLC0415
|
||||
from reflector.storage.storage_aws import AwsStorage # noqa: PLC0415
|
||||
|
||||
@@ -68,6 +69,7 @@ async def pad_track(input: TrackInput, ctx: Context) -> PadTrackResult:
|
||||
aws_region=settings.TRANSCRIPT_STORAGE_AWS_REGION,
|
||||
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
|
||||
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
|
||||
aws_endpoint_url=settings.TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL,
|
||||
)
|
||||
|
||||
source_url = await storage.get_file_url(
|
||||
@@ -159,6 +161,7 @@ async def transcribe_track(input: TrackInput, ctx: Context) -> TranscribeTrackRe
|
||||
raise ValueError("Missing padded_key from pad_track")
|
||||
|
||||
# Presign URL on demand (avoids stale URLs on workflow replay)
|
||||
# TODO: replace direct AwsStorage construction with get_transcripts_storage() factory
|
||||
from reflector.settings import settings # noqa: PLC0415
|
||||
from reflector.storage.storage_aws import AwsStorage # noqa: PLC0415
|
||||
|
||||
@@ -167,6 +170,7 @@ async def transcribe_track(input: TrackInput, ctx: Context) -> TranscribeTrackRe
|
||||
aws_region=settings.TRANSCRIPT_STORAGE_AWS_REGION,
|
||||
aws_access_key_id=settings.TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID,
|
||||
aws_secret_access_key=settings.TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY,
|
||||
aws_endpoint_url=settings.TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL,
|
||||
)
|
||||
|
||||
audio_url = await storage.get_file_url(
|
||||
|
||||
@@ -49,6 +49,7 @@ class Settings(BaseSettings):
|
||||
TRANSCRIPT_STORAGE_AWS_REGION: str = "us-east-1"
|
||||
TRANSCRIPT_STORAGE_AWS_ACCESS_KEY_ID: str | None = None
|
||||
TRANSCRIPT_STORAGE_AWS_SECRET_ACCESS_KEY: str | None = None
|
||||
TRANSCRIPT_STORAGE_AWS_ENDPOINT_URL: str | None = None
|
||||
|
||||
# Platform-specific recording storage (follows {PREFIX}_STORAGE_AWS_{CREDENTIAL} pattern)
|
||||
# Whereby storage configuration
|
||||
|
||||
@@ -53,6 +53,7 @@ class AwsStorage(Storage):
|
||||
aws_access_key_id: str | None = None,
|
||||
aws_secret_access_key: str | None = None,
|
||||
aws_role_arn: str | None = None,
|
||||
aws_endpoint_url: str | None = None,
|
||||
):
|
||||
if not aws_bucket_name:
|
||||
raise ValueError("Storage `aws_storage` require `aws_bucket_name`")
|
||||
@@ -73,17 +74,26 @@ class AwsStorage(Storage):
|
||||
self._access_key_id = aws_access_key_id
|
||||
self._secret_access_key = aws_secret_access_key
|
||||
self._role_arn = aws_role_arn
|
||||
self._endpoint_url = aws_endpoint_url
|
||||
|
||||
self.aws_folder = ""
|
||||
if "/" in aws_bucket_name:
|
||||
self._bucket_name, self.aws_folder = aws_bucket_name.split("/", 1)
|
||||
self.boto_config = Config(retries={"max_attempts": 3, "mode": "adaptive"})
|
||||
|
||||
config_kwargs: dict = {"retries": {"max_attempts": 3, "mode": "adaptive"}}
|
||||
if aws_endpoint_url:
|
||||
config_kwargs["s3"] = {"addressing_style": "path"}
|
||||
self.boto_config = Config(**config_kwargs)
|
||||
|
||||
self.session = aioboto3.Session(
|
||||
aws_access_key_id=aws_access_key_id,
|
||||
aws_secret_access_key=aws_secret_access_key,
|
||||
region_name=aws_region,
|
||||
)
|
||||
self.base_url = f"https://{self._bucket_name}.s3.amazonaws.com/"
|
||||
if aws_endpoint_url:
|
||||
self.base_url = f"{aws_endpoint_url}/{self._bucket_name}/"
|
||||
else:
|
||||
self.base_url = f"https://{self._bucket_name}.s3.amazonaws.com/"
|
||||
|
||||
# Implement credential properties
|
||||
@property
|
||||
@@ -139,7 +149,9 @@ class AwsStorage(Storage):
|
||||
s3filename = f"{folder}/{filename}" if folder else filename
|
||||
logger.info(f"Uploading {filename} to S3 {actual_bucket}/{folder}")
|
||||
|
||||
async with self.session.client("s3", config=self.boto_config) as client:
|
||||
async with self.session.client(
|
||||
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
|
||||
) as client:
|
||||
if isinstance(data, bytes):
|
||||
await client.put_object(Bucket=actual_bucket, Key=s3filename, Body=data)
|
||||
else:
|
||||
@@ -162,7 +174,9 @@ class AwsStorage(Storage):
|
||||
actual_bucket = bucket or self._bucket_name
|
||||
folder = self.aws_folder
|
||||
s3filename = f"{folder}/{filename}" if folder else filename
|
||||
async with self.session.client("s3", config=self.boto_config) as client:
|
||||
async with self.session.client(
|
||||
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
|
||||
) as client:
|
||||
presigned_url = await client.generate_presigned_url(
|
||||
operation,
|
||||
Params={"Bucket": actual_bucket, "Key": s3filename},
|
||||
@@ -177,7 +191,9 @@ class AwsStorage(Storage):
|
||||
folder = self.aws_folder
|
||||
logger.info(f"Deleting {filename} from S3 {actual_bucket}/{folder}")
|
||||
s3filename = f"{folder}/{filename}" if folder else filename
|
||||
async with self.session.client("s3", config=self.boto_config) as client:
|
||||
async with self.session.client(
|
||||
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
|
||||
) as client:
|
||||
await client.delete_object(Bucket=actual_bucket, Key=s3filename)
|
||||
|
||||
@handle_s3_client_errors("download")
|
||||
@@ -186,7 +202,9 @@ class AwsStorage(Storage):
|
||||
folder = self.aws_folder
|
||||
logger.info(f"Downloading {filename} from S3 {actual_bucket}/{folder}")
|
||||
s3filename = f"{folder}/{filename}" if folder else filename
|
||||
async with self.session.client("s3", config=self.boto_config) as client:
|
||||
async with self.session.client(
|
||||
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
|
||||
) as client:
|
||||
response = await client.get_object(Bucket=actual_bucket, Key=s3filename)
|
||||
return await response["Body"].read()
|
||||
|
||||
@@ -201,7 +219,9 @@ class AwsStorage(Storage):
|
||||
logger.info(f"Listing objects from S3 {actual_bucket} with prefix '{s3prefix}'")
|
||||
|
||||
keys = []
|
||||
async with self.session.client("s3", config=self.boto_config) as client:
|
||||
async with self.session.client(
|
||||
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
|
||||
) as client:
|
||||
paginator = client.get_paginator("list_objects_v2")
|
||||
async for page in paginator.paginate(Bucket=actual_bucket, Prefix=s3prefix):
|
||||
if "Contents" in page:
|
||||
@@ -227,7 +247,9 @@ class AwsStorage(Storage):
|
||||
folder = self.aws_folder
|
||||
logger.info(f"Streaming {filename} from S3 {actual_bucket}/{folder}")
|
||||
s3filename = f"{folder}/{filename}" if folder else filename
|
||||
async with self.session.client("s3", config=self.boto_config) as client:
|
||||
async with self.session.client(
|
||||
"s3", config=self.boto_config, endpoint_url=self._endpoint_url
|
||||
) as client:
|
||||
await client.download_fileobj(
|
||||
Bucket=actual_bucket, Key=s3filename, Fileobj=fileobj
|
||||
)
|
||||
|
||||
@@ -319,3 +319,51 @@ def test_aws_storage_constructor_rejects_mixed_auth():
|
||||
aws_secret_access_key="test-secret",
|
||||
aws_role_arn="arn:aws:iam::123456789012:role/test-role",
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_aws_storage_custom_endpoint_url():
|
||||
"""Test that custom endpoint_url configures path-style addressing and passes endpoint to client."""
|
||||
storage = AwsStorage(
|
||||
aws_bucket_name="reflector-media",
|
||||
aws_region="garage",
|
||||
aws_access_key_id="GKtest",
|
||||
aws_secret_access_key="secret",
|
||||
aws_endpoint_url="http://garage:3900",
|
||||
)
|
||||
assert storage._endpoint_url == "http://garage:3900"
|
||||
assert storage.boto_config.s3["addressing_style"] == "path"
|
||||
assert storage.base_url == "http://garage:3900/reflector-media/"
|
||||
# retries config preserved (merge, not replace)
|
||||
assert storage.boto_config.retries["max_attempts"] == 3
|
||||
|
||||
mock_client = AsyncMock()
|
||||
mock_client.put_object = AsyncMock()
|
||||
mock_client.__aenter__ = AsyncMock(return_value=mock_client)
|
||||
mock_client.__aexit__ = AsyncMock(return_value=None)
|
||||
mock_client.generate_presigned_url = AsyncMock(
|
||||
return_value="http://garage:3900/reflector-media/test.txt"
|
||||
)
|
||||
|
||||
with patch.object(
|
||||
storage.session, "client", return_value=mock_client
|
||||
) as mock_session_client:
|
||||
await storage.put_file("test.txt", b"data")
|
||||
mock_session_client.assert_called_with(
|
||||
"s3", config=storage.boto_config, endpoint_url="http://garage:3900"
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_aws_storage_none_endpoint_url():
|
||||
"""Test that None endpoint preserves current AWS behavior."""
|
||||
storage = AwsStorage(
|
||||
aws_bucket_name="reflector-bucket",
|
||||
aws_region="us-east-1",
|
||||
aws_access_key_id="AKIAtest",
|
||||
aws_secret_access_key="secret",
|
||||
)
|
||||
assert storage._endpoint_url is None
|
||||
assert storage.base_url == "https://reflector-bucket.s3.amazonaws.com/"
|
||||
# No s3 addressing_style override — boto_config should only have retries
|
||||
assert not hasattr(storage.boto_config, "s3") or storage.boto_config.s3 is None
|
||||
|
||||
Reference in New Issue
Block a user