Compare commits

...

5 Commits

Author SHA1 Message Date
end c0ff6ad635 don't use S3 CreateBucket and clean up
Docker server image / build-and-push (push) Successful in 1m6s
2026-06-17 16:58:02 -07:00
end 26cd0ef071 fix AWS S3 startup for production deployments
Docker server image / build-and-push (push) Successful in 1m7s
Pass S3_REGION and S3_PUBLIC_USE_SSL through compose, treat blank public
SSL as unset, and skip CreateBucket when IAM only allows access to an
existing bucket.
2026-06-17 11:55:53 -07:00
end 1756a4a0f8 fix protobuf version mismatch
Docker server image / build-and-push (push) Successful in 1m9s
2026-06-17 11:41:58 -07:00
end 4d8da4c910 add S3 policy docs to README
Docker server image / build-and-push (push) Successful in 1m31s
2026-06-17 11:14:44 -07:00
end 42b9ae85a8 use a single bucket rather than one per conversion 2026-06-17 11:09:52 -07:00
16 changed files with 257 additions and 146 deletions
+1
View File
@@ -1,5 +1,6 @@
S3_ENDPOINT=seaweedfs:8333
S3_PUBLIC_ENDPOINT=localhost:8333
S3_BUCKET=officeconvert
S3_USE_SSL=false
# Presigned URLs; omit to match S3_USE_SSL (internal client uses S3_ENDPOINT).
S3_PUBLIC_USE_SSL=false
+17 -2
View File
@@ -2,7 +2,7 @@ SHELL := /bin/sh
BUF ?= buf
.PHONY: buf-lint buf-generate py-sync py-test go-test test compose-up compose-up-dev run-server
.PHONY: buf-lint buf-generate py-sync py-test go-test test compose-up compose-up-dev s3-init run-server
buf-lint:
$(BUF) lint
@@ -32,7 +32,21 @@ compose-up:
compose-up-dev:
docker compose --env-file .env.example -f deploy/docker-compose.dev.yml up
run-server:
s3-init:
@set -a; \
if [ -f .env ]; then . ./.env; fi; \
set +a; \
endpoint="$${S3_ENDPOINT:-localhost:8333}"; \
case "$$endpoint" in seaweedfs:8333) endpoint=localhost:8333 ;; esac; \
bucket="$${S3_BUCKET:-officeconvert}"; \
access_key="$${S3_ACCESS_KEY:-minioadmin}"; \
secret_key="$${S3_SECRET_KEY:-minioadmin}"; \
port="$${endpoint#*:}"; \
docker run --rm --add-host=host.docker.internal:host-gateway minio/mc:latest /bin/sh -c " \
mc alias set local http://host.docker.internal:$$port '$$access_key' '$$secret_key' && \
mc mb local/$$bucket --ignore-existing"
run-server: s3-init
@set -a; \
if [ -f .env ]; then . ./.env; fi; \
set +a; \
@@ -41,6 +55,7 @@ run-server:
if [ "$${S3_PUBLIC_ENDPOINT:-}" = "seaweedfs:8333" ]; then S3_PUBLIC_ENDPOINT=localhost:8333; fi; \
export S3_ENDPOINT="$${S3_ENDPOINT:-localhost:8333}"; \
export S3_PUBLIC_ENDPOINT="$${S3_PUBLIC_ENDPOINT:-localhost:8333}"; \
export S3_BUCKET="$${S3_BUCKET:-officeconvert}"; \
export S3_USE_SSL="$${S3_USE_SSL:-false}"; \
export S3_ACCESS_KEY="$${S3_ACCESS_KEY:-minioadmin}"; \
export S3_SECRET_KEY="$${S3_SECRET_KEY:-minioadmin}"; \
+68 -2
View File
@@ -135,9 +135,75 @@ Use `.env.example` as your baseline env configuration.
## Storage Backend Notes
- This project defaults to **SeaweedFS S3 API** for object transit in development and compose deployments.
- The Python server uses the `minio` Python SDK, which is intentional because SeaweedFS is S3-compatible.
- Local development defaults to **SeaweedFS** (S3-compatible) via Docker Compose. Compose runs an `s3-init` step that creates the dev bucket before the server starts.
- Production can use any S3-compatible provider; **AWS S3** is the expected choice.
- The Python server uses the `minio` Python SDK against the S3 API.
- Runtime configuration uses `S3_*` environment variables.
- All conversions share one bucket (`S3_BUCKET`, required). Each conversion's objects live under a `{conversion_id}/` key prefix (for example `{conversion_id}/input/source.pptx` and `{conversion_id}/output/slide-0001.jpg`).
### AWS setup
**Bucket**
1. Create one bucket (for example `officeconvert-prod`) in the region where the server runs.
2. Leave **Block Public Access** enabled. Presigned URLs work without a public bucket.
3. Optional: add a lifecycle rule to expire objects after a few days as a safety net if cleanup fails.
**Server environment**
Set at minimum:
```bash
S3_BUCKET=officeconvert-prod
S3_ENDPOINT=s3.us-east-1.amazonaws.com
S3_PUBLIC_ENDPOINT=s3.us-east-1.amazonaws.com
S3_REGION=us-east-1
S3_USE_SSL=true
S3_PUBLIC_USE_SSL=true
S3_ACCESS_KEY=...
S3_SECRET_KEY=...
```
Use your bucket's regional hostname for both endpoints unless you deliberately split internal vs client-facing access. `S3_PUBLIC_ENDPOINT` must be reachable by whatever uploads and downloads via presigned URLs (clients, not just the server).
On startup the server verifies the bucket exists via HeadBucket and fails fast if it is missing. **Pre-create the bucket** before deploying (see IAM below).
**IAM permissions**
Scope access to the single bucket. Object keys are per-conversion prefixes, so list/delete can target the whole bucket. Startup verification uses HeadBucket, which is satisfied by `s3:ListBucket` on the bucket ARN:
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:ListBucket", "s3:HeadBucket"],
"Resource": "arn:aws:s3:::officeconvert-prod"
},
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
"Resource": "arn:aws:s3:::officeconvert-prod/*"
}
]
}
```
**CORS**
Required only if uploads or downloads go **directly from a browser** to presigned URLs. Server-side clients (`curl`, the Go client) do not need CORS. Allow `PUT` and `GET` for your web origin on the bucket.
**IAM roles vs IAM users**
AWS recommends **roles** over long-lived **IAM user** access keys when the server runs on AWS compute (ECS, EC2, Lambda): a role grants **temporary** credentials that rotate automatically, with no static keys to store or leak.
For this project today, the server reads explicit `S3_ACCESS_KEY` and `S3_SECRET_KEY` via the MinIO SDK. That maps cleanly to:
| Where you run | Practical choice |
|---------------|------------------|
| Docker on a VPS, bare metal, or outside AWS | IAM **user** with the policy above; store keys in env or a secrets manager. Fine for a single service at low volume. |
| ECS / EC2 / EKS on AWS | Prefer an IAM **role** attached to the task or instance. Your orchestrator injects short-lived credentials; you still pass them into `S3_ACCESS_KEY` / `S3_SECRET_KEY` (and a session token if your runtime provides one — the server does not yet read a dedicated `S3_SESSION_TOKEN` env var). |
## Conversion Tuning Notes
+19 -1
View File
@@ -14,16 +14,34 @@ services:
volumes:
- seaweedfs_data:/data
s3-init:
image: minio/mc:latest
depends_on:
- seaweedfs
environment:
AWS_ACCESS_KEY_ID: ${S3_ACCESS_KEY:-minioadmin}
AWS_SECRET_ACCESS_KEY: ${S3_SECRET_KEY:-minioadmin}
S3_BUCKET: ${S3_BUCKET:-officeconvert}
entrypoint: >
/bin/sh -c "
until mc alias set local http://seaweedfs:8333 $$AWS_ACCESS_KEY_ID $$AWS_SECRET_ACCESS_KEY; do sleep 1; done &&
mc mb local/$$S3_BUCKET --ignore-existing
"
server:
build:
context: ..
dockerfile: Dockerfile.server
depends_on:
- seaweedfs
s3-init:
condition: service_completed_successfully
environment:
S3_ENDPOINT: ${S3_ENDPOINT:-seaweedfs:8333}
S3_PUBLIC_ENDPOINT: ${S3_PUBLIC_ENDPOINT:-localhost:8333}
S3_BUCKET: ${S3_BUCKET:-officeconvert}
S3_REGION: ${S3_REGION:-}
S3_USE_SSL: ${S3_USE_SSL:-false}
S3_PUBLIC_USE_SSL: ${S3_PUBLIC_USE_SSL:-}
S3_ACCESS_KEY: ${S3_ACCESS_KEY:-minioadmin}
S3_SECRET_KEY: ${S3_SECRET_KEY:-minioadmin}
S3_SESSION_TTL_SECONDS: ${S3_SESSION_TTL_SECONDS:-3600}
+20 -2
View File
@@ -15,14 +15,32 @@ services:
volumes:
- seaweedfs_data:/data
server:
image: gitea.auvem.com/end/officeconvert-server:${OFFICECONVERT_IMAGE_TAG:-latest}
s3-init:
image: minio/mc:latest
depends_on:
- seaweedfs
environment:
AWS_ACCESS_KEY_ID: ${S3_ACCESS_KEY:-minioadmin}
AWS_SECRET_ACCESS_KEY: ${S3_SECRET_KEY:-minioadmin}
S3_BUCKET: ${S3_BUCKET:-officeconvert}
entrypoint: >
/bin/sh -c "
until mc alias set local http://seaweedfs:8333 $$AWS_ACCESS_KEY_ID $$AWS_SECRET_ACCESS_KEY; do sleep 1; done &&
mc mb local/$$S3_BUCKET --ignore-existing
"
server:
image: gitea.auvem.com/end/officeconvert-server:${OFFICECONVERT_IMAGE_TAG:-latest}
depends_on:
s3-init:
condition: service_completed_successfully
environment:
S3_ENDPOINT: ${S3_ENDPOINT:-seaweedfs:8333}
S3_PUBLIC_ENDPOINT: ${S3_PUBLIC_ENDPOINT:-localhost:8333}
S3_BUCKET: ${S3_BUCKET:-officeconvert}
S3_REGION: ${S3_REGION:-}
S3_USE_SSL: ${S3_USE_SSL:-false}
S3_PUBLIC_USE_SSL: ${S3_PUBLIC_USE_SSL:-}
S3_ACCESS_KEY: ${S3_ACCESS_KEY:-minioadmin}
S3_SECRET_KEY: ${S3_SECRET_KEY:-minioadmin}
S3_SESSION_TTL_SECONDS: ${S3_SESSION_TTL_SECONDS:-3600}
+3 -12
View File
@@ -777,7 +777,6 @@ type CreateConversionResponse struct {
state protoimpl.MessageState `protogen:"open.v1"`
// Session identifier: KSUID in standard base62 text form. Well-formed values are at most 27 characters (see https://github.com/segmentio/ksuid).
ConversionId string `protobuf:"bytes,1,opt,name=conversion_id,json=conversionId,proto3" json:"conversion_id,omitempty"`
UploadBucket string `protobuf:"bytes,2,opt,name=upload_bucket,json=uploadBucket,proto3" json:"upload_bucket,omitempty"`
UploadObjectKey string `protobuf:"bytes,3,opt,name=upload_object_key,json=uploadObjectKey,proto3" json:"upload_object_key,omitempty"`
UploadUrl string `protobuf:"bytes,4,opt,name=upload_url,json=uploadUrl,proto3" json:"upload_url,omitempty"`
ExpiresAt *timestamppb.Timestamp `protobuf:"bytes,5,opt,name=expires_at,json=expiresAt,proto3" json:"expires_at,omitempty"`
@@ -822,13 +821,6 @@ func (x *CreateConversionResponse) GetConversionId() string {
return ""
}
func (x *CreateConversionResponse) GetUploadBucket() string {
if x != nil {
return x.UploadBucket
}
return ""
}
func (x *CreateConversionResponse) GetUploadObjectKey() string {
if x != nil {
return x.UploadObjectKey
@@ -1330,15 +1322,14 @@ const file_officeconvertapi_v1_conversion_proto_rawDesc = "" +
"\x0fsource_filename\x18\x01 \x01(\tR\x0esourceFilename\x12;\n" +
"\x04full\x18\x02 \x01(\v2'.officeconvertapi.v1.SlideRasterOptionsR\x04full\x12E\n" +
"\tthumbnail\x18\x03 \x01(\v2'.officeconvertapi.v1.SlideRasterOptionsR\tthumbnail\x127\n" +
"\x05notes\x18\x04 \x01(\v2!.officeconvertapi.v1.NotesOptionsR\x05notes\"\xea\x01\n" +
"\x05notes\x18\x04 \x01(\v2!.officeconvertapi.v1.NotesOptionsR\x05notes\"\xda\x01\n" +
"\x18CreateConversionResponse\x12#\n" +
"\rconversion_id\x18\x01 \x01(\tR\fconversionId\x12#\n" +
"\rupload_bucket\x18\x02 \x01(\tR\fuploadBucket\x12*\n" +
"\rconversion_id\x18\x01 \x01(\tR\fconversionId\x12*\n" +
"\x11upload_object_key\x18\x03 \x01(\tR\x0fuploadObjectKey\x12\x1d\n" +
"\n" +
"upload_url\x18\x04 \x01(\tR\tuploadUrl\x129\n" +
"\n" +
"expires_at\x18\x05 \x01(\v2\x1a.google.protobuf.TimestampR\texpiresAt\"=\n" +
"expires_at\x18\x05 \x01(\v2\x1a.google.protobuf.TimestampR\texpiresAtJ\x04\b\x02\x10\x03R\rupload_bucket\"=\n" +
"\x16StartConversionRequest\x12#\n" +
"\rconversion_id\x18\x01 \x01(\tR\fconversionId\"}\n" +
"\x17StartConversionResponse\x12#\n" +
File diff suppressed because one or more lines are too long
@@ -149,18 +149,16 @@ class CreateConversionRequest(_message.Message):
def __init__(self, source_filename: _Optional[str] = ..., full: _Optional[_Union[SlideRasterOptions, _Mapping]] = ..., thumbnail: _Optional[_Union[SlideRasterOptions, _Mapping]] = ..., notes: _Optional[_Union[NotesOptions, _Mapping]] = ...) -> None: ...
class CreateConversionResponse(_message.Message):
__slots__ = ("conversion_id", "upload_bucket", "upload_object_key", "upload_url", "expires_at")
__slots__ = ("conversion_id", "upload_object_key", "upload_url", "expires_at")
CONVERSION_ID_FIELD_NUMBER: _ClassVar[int]
UPLOAD_BUCKET_FIELD_NUMBER: _ClassVar[int]
UPLOAD_OBJECT_KEY_FIELD_NUMBER: _ClassVar[int]
UPLOAD_URL_FIELD_NUMBER: _ClassVar[int]
EXPIRES_AT_FIELD_NUMBER: _ClassVar[int]
conversion_id: str
upload_bucket: str
upload_object_key: str
upload_url: str
expires_at: _timestamp_pb2.Timestamp
def __init__(self, conversion_id: _Optional[str] = ..., upload_bucket: _Optional[str] = ..., upload_object_key: _Optional[str] = ..., upload_url: _Optional[str] = ..., expires_at: _Optional[_Union[datetime.datetime, _timestamp_pb2.Timestamp, _Mapping]] = ...) -> None: ...
def __init__(self, conversion_id: _Optional[str] = ..., upload_object_key: _Optional[str] = ..., upload_url: _Optional[str] = ..., expires_at: _Optional[_Union[datetime.datetime, _timestamp_pb2.Timestamp, _Mapping]] = ...) -> None: ...
class StartConversionRequest(_message.Message):
__slots__ = ("conversion_id",)
+2 -1
View File
@@ -136,7 +136,8 @@ message CreateConversionRequest {
message CreateConversionResponse {
// Session identifier: KSUID in standard base62 text form. Well-formed values are at most 27 characters (see https://github.com/segmentio/ksuid).
string conversion_id = 1;
string upload_bucket = 2;
reserved 2;
reserved "upload_bucket";
string upload_object_key = 3;
string upload_url = 4;
google.protobuf.Timestamp expires_at = 5;
+1
View File
@@ -8,6 +8,7 @@ dependencies = [
"connectrpc>=0.10.0",
"minio>=7.2.18",
"officeconvert",
"protobuf>=7.35.0",
"svix-ksuid>=0.7.0",
"uvicorn>=0.35.0",
]
@@ -10,7 +10,9 @@ from officeconvertapi.v1.conversion_connect import ConversionServiceASGIApplicat
from officeconvert_server.config import load_server_config
from officeconvert_server.service import ConversionServiceImpl
from officeconvert_server.storage import S3Store
from officeconvert_server.storage import S3Store, log_s3_error
from minio.error import S3Error
logger = logging.getLogger(__name__)
@@ -44,6 +46,7 @@ def create_app() -> ConversionServiceASGIApplication:
_configure_application_logging()
config = load_server_config()
store = S3Store(
bucket=config.s3_bucket,
endpoint=config.s3_endpoint,
access_key=config.s3_access_key,
secret_key=config.s3_secret_key,
@@ -55,6 +58,16 @@ def create_app() -> ConversionServiceASGIApplication:
if os.getenv("OFFICECONVERT_S3_TRACE", "").lower() in ("1", "true", "yes"):
store.enable_http_trace(sys.stderr)
logger.warning("OFFICECONVERT_S3_TRACE enabled: S3 HTTP dumps on stderr")
try:
store.require_bucket()
except S3Error as exc:
log_s3_error(
"require_bucket",
endpoint=config.s3_endpoint,
secure=config.s3_secure,
exc=exc,
)
raise
service = ConversionServiceImpl(config=config, store=store)
return ConversionServiceASGIApplication(service)
@@ -10,6 +10,7 @@ import os
class ServerConfig:
"""Defines environment-driven settings for server orchestration."""
s3_bucket: str
s3_endpoint: str
s3_access_key: str
s3_secret_key: str
@@ -30,14 +31,18 @@ class ServerConfig:
def load_server_config() -> ServerConfig:
"""Load server configuration from environment variables."""
s3_secure = os.getenv("S3_USE_SSL", "false").lower() == "true"
public_ssl_env = os.getenv("S3_PUBLIC_USE_SSL")
public_ssl_env = os.getenv("S3_PUBLIC_USE_SSL", "").strip()
s3_public_secure = (
public_ssl_env.lower() == "true"
if public_ssl_env is not None
if public_ssl_env
else s3_secure
)
region_env = os.getenv("S3_REGION", "").strip()
s3_bucket = os.getenv("S3_BUCKET", "").strip()
if not s3_bucket:
raise ValueError("S3_BUCKET is required")
return ServerConfig(
s3_bucket=s3_bucket,
s3_endpoint=os.getenv("S3_ENDPOINT", "localhost:8333"),
s3_access_key=os.getenv("S3_ACCESS_KEY", "minioadmin"),
s3_secret_key=os.getenv("S3_SECRET_KEY", "minioadmin"),
@@ -23,7 +23,7 @@ class ConversionSession:
thumbnail_resolution: conversion_pb2.ConversionResolution
full_jpeg_quality: int
thumbnail_jpeg_quality: int
bucket_name: str
object_prefix: str
upload_object_key: str
status: conversion_pb2.ConversionStatus
notes: conversion_pb2.NotesOptions | None = None
@@ -122,23 +122,12 @@ class ConversionServiceImpl(conversion_connect.ConversionService):
ksuid = Ksuid()
conversion_id = str(ksuid)
bucket_name = f"oc-{bytes(ksuid).hex()}"
upload_key = "input/source.pptx"
object_prefix = f"{conversion_id}/"
upload_key = f"{object_prefix}input/source.pptx"
expires_at = utc_now() + timedelta(seconds=self._config.s3_session_ttl_seconds)
try:
self._store.ensure_bucket(bucket_name)
except S3Error as exc:
log_s3_error(
"ensure_bucket",
endpoint=self._config.s3_endpoint,
secure=self._config.s3_secure,
exc=exc,
)
raise
try:
upload_url = self._store.presigned_put_url(
bucket_name,
upload_key,
ttl_seconds=self._config.s3_session_ttl_seconds,
)
@@ -159,7 +148,7 @@ class ConversionServiceImpl(conversion_connect.ConversionService):
full_jpeg_quality=full_jpeg_quality,
thumbnail_jpeg_quality=thumbnail_jpeg_quality,
notes=request.notes if request.HasField("notes") else None,
bucket_name=bucket_name,
object_prefix=object_prefix,
upload_object_key=upload_key,
status=conversion_pb2.CONVERSION_STATUS_PENDING,
)
@@ -168,7 +157,6 @@ class ConversionServiceImpl(conversion_connect.ConversionService):
return conversion_pb2.CreateConversionResponse(
conversion_id=conversion_id,
upload_bucket=bucket_name,
upload_object_key=upload_key,
upload_url=upload_url,
expires_at=_to_timestamp(expires_at),
@@ -265,7 +253,10 @@ class ConversionServiceImpl(conversion_connect.ConversionService):
if session.conversion_task is not None and not session.conversion_task.done():
session.conversion_task.cancel()
await self._cleanup_local_artifacts(session)
await asyncio.to_thread(self._store.remove_bucket_tree, session.bucket_name)
await asyncio.to_thread(
self._store.remove_prefix,
session.object_prefix,
)
return conversion_pb2.DeleteConversionResponse(
conversion_id=session.conversion_id,
deleted=True,
@@ -295,7 +286,6 @@ class ConversionServiceImpl(conversion_connect.ConversionService):
try:
await asyncio.to_thread(
self._store.fget_object,
session.bucket_name,
session.upload_object_key,
source_path,
)
@@ -436,10 +426,11 @@ class ConversionServiceImpl(conversion_connect.ConversionService):
upload_total = slide_total * 2
upload_index = 0
for slide in slides:
object_key = f"output/slide-{slide.index:04d}{slide.image_path.suffix}"
self._store.fput_object(session.bucket_name, object_key, slide.image_path)
object_key = (
f"{session.object_prefix}output/slide-{slide.index:04d}{slide.image_path.suffix}"
)
self._store.fput_object(object_key, slide.image_path)
image_url = self._store.presigned_get_url(
session.bucket_name,
object_key,
ttl_seconds=self._config.s3_session_ttl_seconds,
)
@@ -447,15 +438,14 @@ class ConversionServiceImpl(conversion_connect.ConversionService):
if progress_callback is not None:
progress_callback(upload_index, upload_total)
thumbnail_object_key = (
f"output/thumb/slide-{slide.index:04d}{slide.thumbnail_path.suffix}"
f"{session.object_prefix}output/thumb/slide-{slide.index:04d}"
f"{slide.thumbnail_path.suffix}"
)
self._store.fput_object(
session.bucket_name,
thumbnail_object_key,
slide.thumbnail_path,
)
thumbnail_image_url = self._store.presigned_get_url(
session.bucket_name,
thumbnail_object_key,
ttl_seconds=self._config.s3_session_ttl_seconds,
)
@@ -515,7 +505,10 @@ class ConversionServiceImpl(conversion_connect.ConversionService):
"""Delete storage resources after the configured session retention period."""
try:
await asyncio.sleep(self._config.conversion_cleanup_delay_seconds)
await asyncio.to_thread(self._store.remove_bucket_tree, session.bucket_name)
await asyncio.to_thread(
self._store.remove_prefix,
session.object_prefix,
)
except asyncio.CancelledError:
return
finally:
@@ -6,7 +6,6 @@ import logging
from datetime import timedelta
from pathlib import Path
from typing import TextIO
from urllib.parse import urlparse
from minio import Minio
from minio.deleteobjects import DeleteObject
@@ -54,6 +53,7 @@ class S3Store:
def __init__(
self,
*,
bucket: str,
endpoint: str,
access_key: str,
secret_key: str,
@@ -63,6 +63,7 @@ class S3Store:
public_secure: bool,
) -> None:
"""Initialize S3 clients for internal and public URL generation."""
self._bucket = bucket
self._client = Minio(
endpoint,
access_key=access_key,
@@ -83,79 +84,67 @@ class S3Store:
self._client.trace_on(stream)
self._public_client.trace_on(stream)
def ensure_bucket(self, bucket_name: str) -> None:
"""Create a bucket if it does not already exist.
def require_bucket(self) -> None:
"""Verify the configured bucket exists before serving traffic."""
if not self._client.bucket_exists(self._bucket):
raise RuntimeError(
f"S3 bucket {self._bucket!r} does not exist; create it before starting the server"
)
Uses CreateBucket only, not HeadBucket. Some S3-compatible stores
(including SeaweedFS) mishandle or over-restrict HeadBucket; the MinIO
client's bucket_exists() maps non-NoSuchBucket errors to failures.
Idempotent create covers the same contract with fewer round trips.
"""
try:
self._client.make_bucket(bucket_name)
except S3Error as exc:
if exc.code in ("BucketAlreadyOwnedByYou", "BucketAlreadyExists"):
return
raise
def presigned_put_url(self, bucket_name: str, object_key: str, *, ttl_seconds: int) -> str:
def presigned_put_url(self, object_key: str, *, ttl_seconds: int) -> str:
"""Generate a presigned PUT URL for a single object upload."""
return self._public_client.presigned_put_object(
bucket_name,
self._bucket,
object_key,
expires=timedelta(seconds=ttl_seconds),
)
def presigned_get_url(self, bucket_name: str, object_key: str, *, ttl_seconds: int) -> str:
def presigned_get_url(self, object_key: str, *, ttl_seconds: int) -> str:
"""Generate a presigned GET URL for downloading one object."""
return self._public_client.presigned_get_object(
bucket_name,
self._bucket,
object_key,
expires=timedelta(seconds=ttl_seconds),
)
def fget_object(self, bucket_name: str, object_key: str, output_path: Path) -> None:
def fget_object(self, object_key: str, output_path: Path) -> None:
"""Download one object from storage to a local filesystem path."""
output_path.parent.mkdir(parents=True, exist_ok=True)
self._client.fget_object(bucket_name, object_key, str(output_path))
self._client.fget_object(self._bucket, object_key, str(output_path))
def fput_object(self, bucket_name: str, object_key: str, source_path: Path) -> None:
def fput_object(self, object_key: str, source_path: Path) -> None:
"""Upload one local filesystem object to storage."""
self._client.fput_object(bucket_name, object_key, str(source_path))
self._client.fput_object(self._bucket, object_key, str(source_path))
def remove_bucket_tree(self, bucket_name: str) -> None:
"""Remove all objects in a bucket and then delete the bucket."""
objects = list(self._client.list_objects(bucket_name, recursive=True))
if objects:
delete_requests: list[DeleteObject] = []
for obj in objects:
object_name = obj.object_name
if object_name is None:
raise RuntimeError(
"encountered unnamed object while removing bucket contents"
)
delete_requests.append(DeleteObject(object_name))
errors = self._client.remove_objects(
bucket_name,
delete_requests,
def remove_prefix(self, prefix: str) -> None:
"""Remove all objects under a key prefix within the configured bucket."""
normalized_prefix = prefix if prefix.endswith("/") else f"{prefix}/"
objects = list(
self._client.list_objects(
self._bucket,
prefix=normalized_prefix,
recursive=True,
)
for err in errors:
object_name = err.name or "<unknown>"
message = err.message or err.code
)
if not objects:
return
delete_requests: list[DeleteObject] = []
for obj in objects:
object_name = obj.object_name
if object_name is None:
raise RuntimeError(
f"failed to delete object {object_name}: {message}"
"encountered unnamed object while removing prefix contents"
)
try:
self._client.remove_bucket(bucket_name)
except S3Error as exc:
# Concurrent cleanup paths may race to remove the same bucket.
if exc.code != "NoSuchBucket":
raise
delete_requests.append(DeleteObject(object_name))
def object_key_from_presigned_url(url: str) -> str:
"""Extract object key from a presigned URL path for diagnostics."""
path = urlparse(url).path
path_parts = [part for part in path.split("/") if part]
return "/".join(path_parts[1:]) if len(path_parts) >= 2 else ""
errors = self._client.remove_objects(
self._bucket,
delete_requests,
)
for err in errors:
object_name = err.name or "<unknown>"
message = err.message or err.code
raise RuntimeError(
f"failed to delete object {object_name}: {message}"
)
+11 -9
View File
@@ -292,6 +292,7 @@ dependencies = [
{ name = "connectrpc" },
{ name = "minio" },
{ name = "officeconvert" },
{ name = "protobuf" },
{ name = "svix-ksuid" },
{ name = "uvicorn" },
]
@@ -301,6 +302,7 @@ requires-dist = [
{ name = "connectrpc", specifier = ">=0.10.0" },
{ name = "minio", specifier = ">=7.2.18" },
{ name = "officeconvert", editable = "packages/officeconvert" },
{ name = "protobuf", specifier = ">=7.35.0" },
{ name = "svix-ksuid", specifier = ">=0.7.0" },
{ name = "uvicorn", specifier = ">=0.35.0" },
]
@@ -394,17 +396,17 @@ wheels = [
[[package]]
name = "protobuf"
version = "7.34.1"
version = "7.35.1"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/6b/6b/a0e95cad1ad7cc3f2c6821fcab91671bd5b78bd42afb357bb4765f29bc41/protobuf-7.34.1.tar.gz", hash = "sha256:9ce42245e704cc5027be797c1db1eb93184d44d1cdd71811fb2d9b25ad541280", size = 454708, upload-time = "2026-03-20T17:34:47.036Z" }
sdist = { url = "https://files.pythonhosted.org/packages/da/01/9ef0afd7999eb9badb3a768b4aedd78c86d4c65cfaf1958ab276199e76b4/protobuf-7.35.1.tar.gz", hash = "sha256:ce115a26fe0c39a2c29973d914d327e516a6455464489fe3cd1e51a1b354f81a", size = 458717, upload-time = "2026-06-11T21:55:40.257Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/ec/11/3325d41e6ee15bf1125654301211247b042563bcc898784351252549a8ad/protobuf-7.34.1-cp310-abi3-macosx_10_9_universal2.whl", hash = "sha256:d8b2cc79c4d8f62b293ad9b11ec3aebce9af481fa73e64556969f7345ebf9fc7", size = 429247, upload-time = "2026-03-20T17:34:37.024Z" },
{ url = "https://files.pythonhosted.org/packages/eb/9d/aa69df2724ff63efa6f72307b483ce0827f4347cc6d6df24b59e26659fef/protobuf-7.34.1-cp310-abi3-manylinux2014_aarch64.whl", hash = "sha256:5185e0e948d07abe94bb76ec9b8416b604cfe5da6f871d67aad30cbf24c3110b", size = 325753, upload-time = "2026-03-20T17:34:38.751Z" },
{ url = "https://files.pythonhosted.org/packages/92/e8/d174c91fd48e50101943f042b09af9029064810b734e4160bbe282fa1caa/protobuf-7.34.1-cp310-abi3-manylinux2014_s390x.whl", hash = "sha256:403b093a6e28a960372b44e5eb081775c9b056e816a8029c61231743d63f881a", size = 340198, upload-time = "2026-03-20T17:34:39.871Z" },
{ url = "https://files.pythonhosted.org/packages/53/1b/3b431694a4dc6d37b9f653f0c64b0a0d9ec074ee810710c0c3da21d67ba7/protobuf-7.34.1-cp310-abi3-manylinux2014_x86_64.whl", hash = "sha256:8ff40ce8cd688f7265326b38d5a1bed9bfdf5e6723d49961432f83e21d5713e4", size = 324267, upload-time = "2026-03-20T17:34:41.1Z" },
{ url = "https://files.pythonhosted.org/packages/85/29/64de04a0ac142fb685fd09999bc3d337943fb386f3a0ec57f92fd8203f97/protobuf-7.34.1-cp310-abi3-win32.whl", hash = "sha256:34b84ce27680df7cca9f231043ada0daa55d0c44a2ddfaa58ec1d0d89d8bf60a", size = 426628, upload-time = "2026-03-20T17:34:42.536Z" },
{ url = "https://files.pythonhosted.org/packages/4d/87/cb5e585192a22b8bd457df5a2c16a75ea0db9674c3a0a39fc9347d84e075/protobuf-7.34.1-cp310-abi3-win_amd64.whl", hash = "sha256:e97b55646e6ce5cbb0954a8c28cd39a5869b59090dfaa7df4598a7fba869468c", size = 437901, upload-time = "2026-03-20T17:34:44.112Z" },
{ url = "https://files.pythonhosted.org/packages/88/95/608f665226bca68b736b79e457fded9a2a38c4f4379a4a7614303d9db3bc/protobuf-7.34.1-py3-none-any.whl", hash = "sha256:bb3812cd53aefea2b028ef42bd780f5b96407247f20c6ef7c679807e9d188f11", size = 170715, upload-time = "2026-03-20T17:34:45.384Z" },
{ url = "https://files.pythonhosted.org/packages/10/03/8aeeb7458d22546bf64b5250ca1daeb5ff757d900e8e4a7476c6f0db843e/protobuf-7.35.1-cp310-abi3-macosx_10_9_universal2.whl", hash = "sha256:24f857477359a85c0c235261b8ba905fd51b2562f4a64ca1df5473f29850cbf6", size = 433226, upload-time = "2026-06-11T21:55:31.719Z" },
{ url = "https://files.pythonhosted.org/packages/37/4b/dfb89eb0e652a1ff073c39a59fb5e3a83cfe9b57a2c83fa6d78270101767/protobuf-7.35.1-cp310-abi3-manylinux2014_aarch64.whl", hash = "sha256:11d6b0ec246892d85215b0a13ca6e0233cf5284b68f0ac02646427f4ff88a799", size = 328847, upload-time = "2026-06-11T21:55:34.035Z" },
{ url = "https://files.pythonhosted.org/packages/0f/58/dc12f2cd484951524af6e3382c785869b9b3fb5e52ee95ae23add53ee8f9/protobuf-7.35.1-cp310-abi3-manylinux2014_s390x.whl", hash = "sha256:b73f9489a4b8b1c9cb1f8ed951c736392592edb24b9d6819f36d2e10b171d5b4", size = 344030, upload-time = "2026-06-11T21:55:34.941Z" },
{ url = "https://files.pythonhosted.org/packages/e4/be/5b3cfe508bfab6761414ff944e3366eb13be4fd71efcd69450f89ba39f43/protobuf-7.35.1-cp310-abi3-manylinux2014_x86_64.whl", hash = "sha256:74758715c53d7158fb76caf4f0cfdacc5329a4b1bb994f865d6cf302d413a1c4", size = 327130, upload-time = "2026-06-11T21:55:35.921Z" },
{ url = "https://files.pythonhosted.org/packages/d8/bc/6d6c7ba8709c85f8f2c390b2b118d6fb08a783676a572271851bf45a7d22/protobuf-7.35.1-cp310-abi3-win32.whl", hash = "sha256:353652e4efd0bca5b5fc2656abf8307ef351f0cf938c9eba09f0e09c20a25c30", size = 428945, upload-time = "2026-06-11T21:55:37.034Z" },
{ url = "https://files.pythonhosted.org/packages/0a/19/8d0cb6f20a1ef7b18f1c8986ad5783f22f84cce39c6ce9a6e645ea55192e/protobuf-7.35.1-cp310-abi3-win_amd64.whl", hash = "sha256:230a75ddfc2de4806e56696ce9640c1cdfdb6543b7cfce98d42a4c0a0e7bdb87", size = 439996, upload-time = "2026-06-11T21:55:38.123Z" },
{ url = "https://files.pythonhosted.org/packages/19/c7/5f7c636ec43e0c545e28d1f1db71990108306f7bdcb89f069ba97e428e7f/protobuf-7.35.1-py3-none-any.whl", hash = "sha256:4bc97768d8fe4ad6743c8a19403e314511ed9f6d13205b687e52421c023ac1b9", size = 171659, upload-time = "2026-06-11T21:55:39.155Z" },
]
[[package]]