# officeconvert

officeconvert is a multimodule conversion toolkit for turning presentation files into typed `SlideDeck` artifacts with rendered slide images and notes. The repository is organized around Protocol Buffer schemas with ConnectRPC code generation for both server and client compatibility.

## Modules

- `proto/` contains protobuf schemas and RPC definitions.
- `gen/python` and `gen/go` contain generated protocol and Connect code.
- `python/packages/officeconvert` is the core conversion library (PPTX -> PDF -> images + notes).
- `python/packages/server` is the ConnectRPC Python server with SeaweedFS (S3-compatible) orchestration.
- `clients/go` is the first client library with layered orchestration helpers.
- `deploy/` contains production-ish and dev Docker Compose files.

## Supported Document Types

MVP currently supports **PPTX only** and produces a `SlideDeck` result containing:

- ordered slide image URLs
- plain-text notes per slide

## Quick Commands

Use the root `Makefile`:

- `make buf-lint` to lint protobufs
- `make buf-generate` to regenerate Go and Python types
- `make py-sync` to sync Python workspace dependencies with uv
- `make go-test` to run Go client tests
- `make compose-up` to run server + SeaweedFS
- `make compose-up-dev` to run SeaweedFS only
- `make run-server` to start host `uvicorn` with `.env` (if present) plus defaults

## Development Server Workflow

This is the recommended local workflow for iterating on the Python server and conversion library while keeping SeaweedFS in Docker.

### 1) Prerequisites

- `buf` on your `PATH`
- `uv` on your `PATH`
- Docker + Docker Compose
- Local tools if running server on host (not in container):
  - LibreOffice (`soffice`)
  - Poppler (`pdftoppm`)

### 2) Generate typed API code

From repo root:

```bash
make buf-lint
make buf-generate
```

### 3) Sync Python workspace dependencies

From repo root:

```bash
make py-sync
```

### 4) Start SeaweedFS dependency stack (dev compose)

From repo root:

```bash
make compose-up-dev
```

SeaweedFS endpoints:

- S3 API: `http://localhost:8333`
- Master API: `http://localhost:9333`
- Filer API: `http://localhost:8888`
- Default S3 creds: `minioadmin` / `minioadmin`

### 5) Start Connect server (host process)

In a separate terminal, from repo root:

```bash
make run-server
```

`make run-server` behavior:

- loads `.env` automatically if present
- applies reasonable defaults when values are not set
- defaults S3 endpoint to `localhost:8333` for host-based development
- auto-normalizes `seaweedfs:8333` to `localhost:8333` for host runs
- supports optional `UVICORN_HOST` and `UVICORN_PORT` overrides
- exposes conversion timeout tuning vars (`CONVERSION_PPTX_TO_PDF_TIMEOUT_SECONDS`, `CONVERSION_PDF_TO_IMAGES_TIMEOUT_SECONDS`)

Server endpoint base URL:

- `http://localhost:8080`

### 6) Quick smoke test

Create a conversion request:

```bash
curl \
  --header "Content-Type: application/json" \
  --data '{
    "sourceFilename":"example.pptx",
    "full":{"resolution":"CONVERSION_RESOLUTION_FHD","jpeg":{"quality":85}},
    "thumbnail":{"resolution":"CONVERSION_RESOLUTION_SD","jpeg":{"quality":75}}
  }' \
  http://localhost:8080/officeconvertapi.v1.ConversionService/CreateConversion
```

Then:

1. Upload the PPTX to the returned `uploadUrl` using HTTP `PUT`.
2. Call `StartConversion` with the returned `conversionId`.
3. Poll `GetConversionStatus` until `CONVERSION_STATUS_SUCCEEDED`.
4. Call `GetSlideDeck` and download each `imageUrl`.
5. Optionally call `DeleteConversion` for early cleanup.

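
The smoke-test steps above can also be scripted. The following is a minimal sketch in Python using only the standard library, not the project's client code. It assumes each RPC accepts a Connect-style JSON `POST` like the `curl` example, that `StartConversion`, `GetConversionStatus`, and `GetSlideDeck` take a `conversionId` field, that the status response exposes a `status` field, and that a failed run reports `CONVERSION_STATUS_FAILED`. Those shapes, and the helper names `call` and `convert`, are illustrative assumptions; check `proto/` for the real message definitions.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8080"
SERVICE = "officeconvertapi.v1.ConversionService"


def http_post_json(url, payload):
    """POST a JSON body and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def http_put_bytes(url, data):
    """Upload raw bytes with HTTP PUT (assumes no extra headers are required)."""
    request = urllib.request.Request(url, data=data, method="PUT")
    with urllib.request.urlopen(request) as response:
        response.read()


def call(method, payload, post=http_post_json):
    """Invoke one RPC on the conversion service by method name."""
    return post(f"{BASE_URL}/{SERVICE}/{method}", payload)


def convert(pptx_bytes, create_request, post=http_post_json, put=http_put_bytes,
            poll_seconds=2.0, max_polls=900, sleep=time.sleep):
    """Run create -> upload -> start -> poll -> fetch and return the deck reply."""
    created = call("CreateConversion", create_request, post)
    conversion_id = created["conversionId"]
    put(created["uploadUrl"], pptx_bytes)
    call("StartConversion", {"conversionId": conversion_id}, post)

    for _ in range(max_polls):
        reply = call("GetConversionStatus", {"conversionId": conversion_id}, post)
        state = reply.get("status")  # assumed field name
        if state == "CONVERSION_STATUS_SUCCEEDED":
            return call("GetSlideDeck", {"conversionId": conversion_id}, post)
        if state == "CONVERSION_STATUS_FAILED":  # assumed enum value
            raise RuntimeError(f"conversion {conversion_id} failed")
        sleep(poll_seconds)
    raise TimeoutError(f"conversion {conversion_id} did not finish in time")
```

The `post`, `put`, and `sleep` parameters are injectable so the flow can be exercised without a running server. Against the dev stack, passing the bytes of `example.pptx` and the same JSON body as the `curl` request should return the `GetSlideDeck` reply, from which each `imageUrl` can be downloaded.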
### 7) Full container workflow (optional)

If you want to run both server and SeaweedFS in Docker:

```bash
make compose-up
```

Use `.env.example` as your baseline env configuration.

## Storage Backend Notes

- This project defaults to **SeaweedFS S3 API** for object transit in development and compose deployments.
- The Python server uses the `minio` Python SDK, which is intentional because SeaweedFS is S3-compatible.
- Runtime configuration uses `S3_*` environment variables.
- All conversions share one bucket (`S3_BUCKET`, required). Each conversion's objects live under a `{conversion_id}/` key prefix (for example `{conversion_id}/input/source.pptx` and `{conversion_id}/output/slide-0001.jpg`).

## Conversion Tuning Notes

If conversion fails on larger decks, tune these request fields and environment variables:

- `CreateConversionRequest.full.resolution` controls full-size output dimensions via presets: `SD`, `HD`, `FHD`, `QHD`, `UHD`.
- `CreateConversionRequest.thumbnail.resolution` controls thumbnail output dimensions with the same presets.
- Omitting full/thumbnail resolution (or sending `CONVERSION_RESOLUTION_UNSPECIFIED`) defaults to `FHD` for full and `SD` for thumbnail.
- Output is JPEG-only for now; set `CreateConversionRequest.full.jpeg.quality` and `CreateConversionRequest.thumbnail.jpeg.quality` to `1..100` (`0` or omitted uses server defaults: full `85`, thumbnail `75`).
- Rasterization DPI is inferred automatically from source slide size and selected full/thumbnail output dimensions.
- `CONVERSION_PPTX_TO_PDF_TIMEOUT_SECONDS` (default `180`): timeout for LibreOffice export.
- `CONVERSION_PDF_TO_IMAGES_TIMEOUT_SECONDS` (default `1800`): timeout for Poppler rasterization.
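
The defaulting rules above can be mirrored on the client side when building a request. This is a minimal sketch of those documented rules, not the server's implementation: the helper name `effective_output` is hypothetical, and the enum names for `HD`, `QHD`, and `UHD` are assumed to follow the same `CONVERSION_RESOLUTION_` prefix as the `FHD` and `SD` values shown in the smoke test.

```python
# Documented server defaults for each output kind.
DEFAULTS = {
    "full": {"resolution": "CONVERSION_RESOLUTION_FHD", "quality": 85},
    "thumbnail": {"resolution": "CONVERSION_RESOLUTION_SD", "quality": 75},
}

# Assumed enum spelling: prefix + preset name.
ALLOWED_RESOLUTIONS = {
    f"CONVERSION_RESOLUTION_{preset}" for preset in ("SD", "HD", "FHD", "QHD", "UHD")
}


def effective_output(kind, resolution=None, quality=0):
    """Return the settings that would apply for "full" or "thumbnail" output."""
    defaults = DEFAULTS[kind]

    # Omitted or UNSPECIFIED resolution falls back to the per-kind default.
    if resolution in (None, "", "CONVERSION_RESOLUTION_UNSPECIFIED"):
        resolution = defaults["resolution"]
    elif resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unknown resolution preset: {resolution}")

    # Quality 0 or omitted falls back to the default; otherwise it must be 1..100.
    if quality in (None, 0):
        quality = defaults["quality"]
    elif not 1 <= quality <= 100:
        raise ValueError("jpeg quality must be in 1..100")

    return {"resolution": resolution, "jpeg": {"quality": quality}}
```

For example, `effective_output("thumbnail")` yields the `SD` preset at quality `75`, which matches the `thumbnail` object in the smoke-test request. Lowering the full resolution or quality this way is one lever for large decks; the two timeout variables are the other.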