
officeconvert

officeconvert is a multi-module conversion toolkit that turns presentation files into typed SlideDeck artifacts with rendered slide images and notes. The repository is organized around Protocol Buffer schemas, with ConnectRPC code generation keeping the server and clients compatible.

Modules

  • proto/ contains protobuf schemas and RPC definitions.
  • gen/python and gen/go contain generated protocol and Connect code.
  • python/packages/officeconvert is the core conversion library (PPTX -> PDF -> images + notes).
  • python/packages/server is the ConnectRPC Python server with SeaweedFS (S3-compatible) orchestration.
  • clients/go is the first client library with layered orchestration helpers.
  • deploy/ contains production-ish and dev Docker Compose files.

Supported Document Types

The MVP currently supports PPTX only and produces a SlideDeck result containing:

  • ordered slide image URLs
  • plain-text notes per slide
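
As a rough mental model of that result, the sketch below mirrors the two parts listed above in plain Python. The class and field names (Slide, index, image_url, notes) are hypothetical and chosen for illustration only; the authoritative shape is whatever the protobuf schemas in proto/ define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slide:
    """One converted slide (hypothetical field names, not the generated type)."""

    index: int       # zero-based position of the slide in the deck
    image_url: str   # URL of the rendered slide image
    notes: str = ""  # plain-text notes, empty when the slide has none


@dataclass(frozen=True)
class SlideDeck:
    """Ordered collection of slides, mirroring the result described above."""

    slides: tuple = ()

    def image_urls(self) -> list:
        # Sort by index so the ordering holds even if slides arrive shuffled.
        return [s.image_url for s in sorted(self.slides, key=lambda s: s.index)]


deck = SlideDeck(slides=(Slide(1, "slide-2.png", "closing remarks"), Slide(0, "slide-1.png")))
print(deck.image_urls())
```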

Quick Commands

Use the root Makefile:

  • make buf-lint to lint protobufs
  • make buf-generate to regenerate Go and Python types
  • make py-sync to sync Python workspace dependencies with uv
  • make go-test to run Go client tests
  • make compose-up to run server + SeaweedFS
  • make compose-up-dev to run SeaweedFS only
  • make run-server to start uvicorn on the host, loading .env (if present) and applying defaults

Development Server Workflow

This is the recommended local workflow for iterating on the Python server and conversion library while keeping SeaweedFS in Docker.

1) Prerequisites

  • buf on your PATH
  • uv on your PATH
  • Docker + Docker Compose
  • Local tools, needed only when running the server on the host rather than in a container:
    • LibreOffice (soffice)
    • Poppler (pdftoppm)
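
To show why both host tools are needed, here is a minimal sketch of how a PPTX -> PDF -> images pipeline might invoke them. The helper names are hypothetical and this is not the library's actual code; the flags used (soffice --headless --convert-to pdf --outdir, pdftoppm -png -r) are standard LibreOffice and Poppler options.

```python
import subprocess
from pathlib import Path


def pptx_to_pdf_cmd(pptx: Path, out_dir: Path) -> list:
    """Build the LibreOffice command that exports a PPTX to PDF in out_dir."""
    return [
        "soffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(pptx),
    ]


def pdf_to_png_cmd(pdf: Path, out_prefix: Path, dpi: int) -> list:
    """Build the Poppler command that rasterizes each PDF page to a PNG."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(out_prefix)]


def run(cmd: list, timeout_seconds: float) -> None:
    """Run one step; raises on a non-zero exit code or when the timeout expires."""
    subprocess.run(cmd, check=True, timeout=timeout_seconds)


# Printing the commands is a quick way to check them before running anything.
print(pptx_to_pdf_cmd(Path("deck.pptx"), Path("out")))
print(pdf_to_png_cmd(Path("out/deck.pdf"), Path("out/slide"), 144))
```

If either tool is missing from PATH, run() raises FileNotFoundError, which is a quick way to confirm the prerequisites are installed.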

2) Generate typed API code

From repo root:

make buf-lint
make buf-generate

3) Sync Python workspace dependencies

From repo root:

make py-sync

4) Start SeaweedFS dependency stack (dev compose)

From repo root:

make compose-up-dev

SeaweedFS endpoints:

  • S3 API: http://localhost:8333
  • Master API: http://localhost:9333
  • Filer API: http://localhost:8888
  • Default S3 creds: minioadmin / minioadmin

5) Start Connect server (host process)

In a separate terminal, from repo root:

make run-server

make run-server behavior:

  • loads .env automatically if present
  • applies reasonable defaults when values are not set
  • defaults S3 endpoint to localhost:8333 for host-based development
  • auto-normalizes seaweedfs:8333 to localhost:8333 for host runs
  • supports optional UVICORN_HOST and UVICORN_PORT overrides
  • exposes conversion timeout tuning vars (CONVERSION_PPTX_TO_PDF_TIMEOUT_SECONDS, CONVERSION_PDF_TO_IMAGES_TIMEOUT_SECONDS)
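
The defaulting and normalization rules above can be sketched in a few lines of Python. This is an illustration, not the Makefile's actual logic: the variable name S3_ENDPOINT is an assumption (the README only says S3_*), the 127.0.0.1 host default is an assumption, and the 8080 port default is taken from the server base URL below.

```python
from typing import Mapping

DEFAULT_S3_ENDPOINT = "localhost:8333"  # host-based development default
COMPOSE_S3_ENDPOINT = "seaweedfs:8333"  # service name that only resolves inside compose


def resolve_run_settings(env: Mapping[str, str]) -> dict:
    """Apply defaults and the seaweedfs -> localhost rewrite for host runs."""
    # S3_ENDPOINT is a hypothetical variable name used for illustration.
    endpoint = env.get("S3_ENDPOINT", "").strip() or DEFAULT_S3_ENDPOINT
    if endpoint == COMPOSE_S3_ENDPOINT:
        # The compose hostname is not reachable from a host process.
        endpoint = DEFAULT_S3_ENDPOINT
    return {
        "s3_endpoint": endpoint,
        "host": env.get("UVICORN_HOST", "127.0.0.1"),  # assumed default
        "port": int(env.get("UVICORN_PORT", "8080")),
    }


print(resolve_run_settings({"S3_ENDPOINT": "seaweedfs:8333", "UVICORN_PORT": "9000"}))
```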

Server endpoint base URL:

  • http://localhost:8080

6) Quick smoke test

Create a conversion request:

curl \
  --header "Content-Type: application/json" \
  --data '{"sourceFilename":"example.pptx","resolution":"CONVERSION_RESOLUTION_FHD"}' \
  http://localhost:8080/officeconvertapi.v1.ConversionService/CreateConversion

Then:

  1. Upload the PPTX to the returned uploadUrl using HTTP PUT.
  2. Call StartConversion with the returned conversionId.
  3. Poll GetConversionStatus until CONVERSION_STATUS_SUCCEEDED.
  4. Call GetSlideDeck and download each imageUrl.
  5. Optionally call DeleteConversion for early cleanup.
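
The same flow can be scripted. The sketch below uses only the Python standard library and posts JSON to each method path in the same way as the curl call above. Several details are assumptions: that follow-up requests take a conversionId field, that the status response carries a status field, and that no authentication is required. A real client would also stop polling on a failure status, which is left out here because the README does not name one.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8080"
SERVICE = "officeconvertapi.v1.ConversionService"


def rpc(method: str, payload: dict) -> dict:
    """POST one unary call as JSON and decode the JSON reply."""
    req = urllib.request.Request(
        f"{BASE_URL}/{SERVICE}/{method}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def upload(url: str, body: bytes) -> None:
    """PUT the raw PPTX bytes to the returned upload URL."""
    req = urllib.request.Request(url, data=body, method="PUT")
    urllib.request.urlopen(req).close()


def convert(pptx: bytes, filename: str, call=rpc, put=upload,
            poll_seconds: float = 2.0, max_polls: int = 900) -> dict:
    """Create, upload, start, poll, then fetch the slide deck."""
    created = call("CreateConversion", {
        "sourceFilename": filename,
        "resolution": "CONVERSION_RESOLUTION_FHD",
    })
    put(created["uploadUrl"], pptx)
    ref = {"conversionId": created["conversionId"]}  # assumed request shape
    call("StartConversion", ref)
    for _ in range(max_polls):
        status = call("GetConversionStatus", ref).get("status")  # assumed field
        if status == "CONVERSION_STATUS_SUCCEEDED":
            return call("GetSlideDeck", ref)
        time.sleep(poll_seconds)
    raise TimeoutError("conversion did not succeed within the polling budget")
```

With the server running, usage would look like convert(open("example.pptx", "rb").read(), "example.pptx"). The call and put parameters exist so the flow can be exercised with fakes, without a server.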

7) Full container workflow (optional)

If you want to run both server and SeaweedFS in Docker:

make compose-up

Use .env.example as your baseline env configuration.

Storage Backend Notes

  • This project defaults to the SeaweedFS S3 API for object transit in development and compose deployments.
  • The Python server intentionally uses the minio Python SDK; this works because SeaweedFS exposes an S3-compatible API.
  • Runtime configuration uses S3_* environment variables.

Conversion Tuning Notes

If conversion fails on larger decks, adjust the request resolution and the timeout environment variables described below:

  • CreateConversionRequest.resolution controls output dimensions via presets: SD, HD, FHD, QHD, UHD.
  • Omitting resolution (or sending CONVERSION_RESOLUTION_UNSPECIFIED) defaults to FHD.
  • Rasterization DPI is inferred automatically from source slide size and selected output dimensions.
  • CONVERSION_PPTX_TO_PDF_TIMEOUT_SECONDS (default 180): timeout for LibreOffice export.
  • CONVERSION_PDF_TO_IMAGES_TIMEOUT_SECONDS (default 1800): timeout for Poppler rasterization.
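
To make the DPI inference and timeout defaults concrete, here is a small sketch under stated assumptions. The pixel widths per preset are assumed values, not taken from the server code, and the function names are hypothetical. The EMU conversion (914,400 EMU per inch) is the unit PPTX files use for slide dimensions.

```python
import os
from typing import Mapping

EMU_PER_INCH = 914_400  # PPTX stores slide dimensions in English Metric Units

# Assumed output widths in pixels for each preset (illustrative only).
PRESET_WIDTH_PX = {"SD": 640, "HD": 1280, "FHD": 1920, "QHD": 2560, "UHD": 3840}


def infer_dpi(slide_width_emu: int, preset: str = "") -> int:
    """Smallest integer DPI at which the slide is at least as wide as the preset."""
    width_px = PRESET_WIDTH_PX[preset or "FHD"]  # unspecified falls back to FHD
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-width_px * EMU_PER_INCH // slide_width_emu)


def timeout_from_env(name: str, default: int, env: Mapping[str, str] = os.environ) -> int:
    """Read a timeout in seconds, using the documented default when unset."""
    raw = env.get(name, "").strip()
    return int(raw) if raw else default


# A standard 16:9 slide is 12,192,000 EMU wide (13.333 inches).
print(infer_dpi(12_192_000, "FHD"))
print(timeout_from_env("CONVERSION_PPTX_TO_PDF_TIMEOUT_SECONDS", 180, {}))
```

Under these assumed widths, a higher preset raises the rasterization DPI proportionally, so large decks at UHD are the most likely to need a longer CONVERSION_PDF_TO_IMAGES_TIMEOUT_SECONDS.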