officeconvert
officeconvert is a multi-module conversion toolkit that turns presentation files into
typed SlideDeck artifacts with rendered slide images and notes. The repository is
organized around Protocol Buffer schemas, with ConnectRPC code generated for both the
server and its clients.
Modules
- proto/ contains protobuf schemas and RPC definitions.
- gen/python and gen/go contain generated protocol and Connect code.
- python/packages/officeconvert is the core conversion library (PPTX -> PDF -> images + notes).
- python/packages/server is the ConnectRPC Python server with SeaweedFS (S3-compatible) orchestration.
- clients/go is the first client library with layered orchestration helpers.
- deploy/ contains production-ish and dev Docker Compose files.
Supported Document Types
The MVP currently supports PPTX only and produces a SlideDeck result containing:
- ordered slide image URLs
- plain-text notes per slide
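As a rough illustration of that result, the sketch below models a SlideDeck as an ordered list of slides. The class names, the `slides` and `notes` keys, and the JSON layout are assumptions made for illustration; only `imageUrl` appears in this README, and the authoritative shape is whatever the schemas under proto/ define.

```python
from dataclasses import dataclass, field


@dataclass
class Slide:
    """One slide: its rendered image URL and plain-text notes."""
    image_url: str
    notes: str = ""


@dataclass
class SlideDeck:
    """Slides in presentation order; list order is slide order."""
    slides: list[Slide] = field(default_factory=list)

    @classmethod
    def from_json(cls, payload: dict) -> "SlideDeck":
        # Assumed layout: {"slides": [{"imageUrl": "...", "notes": "..."}]}
        return cls([
            Slide(image_url=item["imageUrl"], notes=item.get("notes", ""))
            for item in payload.get("slides", [])
        ])
```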
Quick Commands
Use the root Makefile:
- make buf-lint to lint protobufs
- make buf-generate to regenerate Go and Python types
- make py-sync to sync Python workspace dependencies with uv
- make go-test to run Go client tests
- make compose-up to run server + SeaweedFS
- make compose-up-dev to run SeaweedFS only
- make run-server to start host uvicorn with .env (if present) plus defaults
Development Server Workflow
This is the recommended local workflow for iterating on the Python server and conversion library while keeping SeaweedFS in Docker.
1) Prerequisites
- buf on your PATH
- uv on your PATH
- Docker + Docker Compose
- Local tools if running server on host (not in container):
  - LibreOffice (soffice)
  - Poppler (pdftoppm)
2) Generate typed API code
From repo root:
make buf-lint
make buf-generate
3) Sync Python workspace dependencies
From repo root:
make py-sync
4) Start SeaweedFS dependency stack (dev compose)
From repo root:
make compose-up-dev
SeaweedFS endpoints:
- S3 API: http://localhost:8333
- Master API: http://localhost:9333
- Filer API: http://localhost:8888
- Default S3 creds: minioadmin/minioadmin
5) Start Connect server (host process)
In a separate terminal, from repo root:
make run-server
make run-server behavior:
- loads .env automatically if present
- applies reasonable defaults when values are not set
- defaults S3 endpoint to localhost:8333 for host-based development
- auto-normalizes seaweedfs:8333 to localhost:8333 for host runs
- supports optional UVICORN_HOST and UVICORN_PORT overrides
- exposes conversion timeout tuning vars (CONVERSION_PPTX_TO_PDF_TIMEOUT_SECONDS, CONVERSION_PDF_TO_IMAGES_TIMEOUT_SECONDS)
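The defaulting and endpoint normalization described above could look roughly like the following sketch. It is not the Makefile's actual logic: the variable name `S3_ENDPOINT` and the `127.0.0.1` host default are assumptions, and the port default is inferred from the server base URL given in this README.

```python
def resolve_server_settings(env: dict[str, str]) -> dict[str, str]:
    """Apply run-server style defaults to a mapping of environment variables."""
    # S3_ENDPOINT is an assumed variable name; the README only says S3_*.
    endpoint = env.get("S3_ENDPOINT") or "localhost:8333"
    # Inside Compose the service is reachable as seaweedfs:8333. On the host
    # that name does not resolve, so map it to the published local port.
    if endpoint == "seaweedfs:8333":
        endpoint = "localhost:8333"
    return {
        "S3_ENDPOINT": endpoint,
        "UVICORN_HOST": env.get("UVICORN_HOST") or "127.0.0.1",  # assumed default
        "UVICORN_PORT": env.get("UVICORN_PORT") or "8080",  # matches the base URL
    }
```

Calling `resolve_server_settings(dict(os.environ))` after loading .env would give the effective values for a host run.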
Server endpoint base URL:
http://localhost:8080
6) Quick smoke test
Create a conversion request:
curl \
--header "Content-Type: application/json" \
--data '{
"sourceFilename":"example.pptx",
"full":{"resolution":"CONVERSION_RESOLUTION_FHD","jpeg":{"quality":85}},
"thumbnail":{"resolution":"CONVERSION_RESOLUTION_SD","jpeg":{"quality":75}}
}' \
http://localhost:8080/officeconvertapi.v1.ConversionService/CreateConversion
Then:
- Upload the PPTX to the returned uploadUrl using HTTP PUT.
- Call StartConversion with the returned conversionId.
- Poll GetConversionStatus until CONVERSION_STATUS_SUCCEEDED.
- Call GetSlideDeck and download each imageUrl.
- Optionally call DeleteConversion for early cleanup.
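The same sequence can be scripted with the Python standard library. The sketch below assumes that every RPC accepts a JSON POST like the curl example, that follow-up requests take a `conversionId` field, and that the status response carries a `status` field. Those request and response shapes are assumptions; check them against the schemas in proto/.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8080"
SERVICE = "officeconvertapi.v1.ConversionService"


def call(method, body, base_url=BASE_URL):
    """POST a JSON body to a unary RPC and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}/{SERVICE}/{method}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


def upload(upload_url, pptx_bytes):
    """PUT the PPTX bytes to the returned upload URL."""
    request = urllib.request.Request(upload_url, data=pptx_bytes, method="PUT")
    with urllib.request.urlopen(request):
        pass


def convert(pptx_bytes, filename, rpc=call, put=upload, poll_seconds=2.0, max_polls=900):
    """Create, upload, start, poll, and fetch the slide deck for one PPTX."""
    created = rpc("CreateConversion", {"sourceFilename": filename})
    conversion_id = created["conversionId"]
    put(created["uploadUrl"], pptx_bytes)
    rpc("StartConversion", {"conversionId": conversion_id})  # assumed request field
    for _ in range(max_polls):
        reply = rpc("GetConversionStatus", {"conversionId": conversion_id})
        if reply.get("status") == "CONVERSION_STATUS_SUCCEEDED":  # assumed field name
            return rpc("GetSlideDeck", {"conversionId": conversion_id})
        time.sleep(poll_seconds)
    raise TimeoutError(f"conversion {conversion_id} did not succeed in time")
```

The `rpc` and `put` parameters exist so the flow can be exercised without a running server. A real client would also stop polling as soon as the status reaches whatever terminal failure value the schema defines, rather than waiting for the poll limit.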
7) Full container workflow (optional)
If you want to run both server and SeaweedFS in Docker:
make compose-up
Use .env.example as your baseline env configuration.
Storage Backend Notes
- This project defaults to SeaweedFS S3 API for object transit in development and compose deployments.
- The Python server uses the minio Python SDK, which is intentional because SeaweedFS is S3-compatible.
- Runtime configuration uses S3_* environment variables.
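To illustrate why the minio SDK works here, the sketch below builds a client pointed at the SeaweedFS S3 API. The variable names `S3_ENDPOINT`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`, and `S3_SECURE` are assumptions (this README only states the `S3_*` prefix), and the server's real wiring may differ.

```python
def minio_kwargs_from_env(env):
    """Translate assumed S3_* variables into keyword arguments for Minio()."""
    return {
        "endpoint": env.get("S3_ENDPOINT") or "localhost:8333",
        "access_key": env.get("S3_ACCESS_KEY") or "minioadmin",
        "secret_key": env.get("S3_SECRET_KEY") or "minioadmin",
        # Local SeaweedFS serves plain HTTP, so TLS is off unless requested.
        "secure": (env.get("S3_SECURE") or "false").lower() == "true",
    }


def make_client(env):
    """Build a client; needs the third-party minio package to be installed."""
    from minio import Minio  # imported lazily so the helper above has no dependency

    return Minio(**minio_kwargs_from_env(env))
```

With the dev compose stack running, `make_client(dict(os.environ)).list_buckets()` is one quick way to confirm that the endpoint and credentials line up.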
Conversion Tuning Notes
If conversion fails on larger decks, tune these request fields and environment variables:
- CreateConversionRequest.full.resolution controls full-size output dimensions via presets: SD, HD, FHD, QHD, UHD.
- CreateConversionRequest.thumbnail.resolution controls thumbnail output dimensions with the same presets.
- Omitting full/thumbnail resolution (or sending CONVERSION_RESOLUTION_UNSPECIFIED) defaults to FHD for full and SD for thumbnail.
- Output is JPEG-only for now; set CreateConversionRequest.full.jpeg.quality and CreateConversionRequest.thumbnail.jpeg.quality to 1..100 (0 or omitted uses server defaults: full 85, thumbnail 75).
- Rasterization DPI is inferred automatically from source slide size and selected full/thumbnail output dimensions.
- CONVERSION_PPTX_TO_PDF_TIMEOUT_SECONDS (default 180): timeout for LibreOffice export.
- CONVERSION_PDF_TO_IMAGES_TIMEOUT_SECONDS (default 1800): timeout for Poppler rasterization.
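The defaulting rules above can be summarized in a short sketch. It mirrors the documented defaults (FHD/SD, quality 85/75, timeouts 180/1800) but is not the server's code: the function names are hypothetical, and rejecting out-of-range quality values is an assumption about how such input would be handled.

```python
DEFAULT_RESOLUTION = {
    "full": "CONVERSION_RESOLUTION_FHD",
    "thumbnail": "CONVERSION_RESOLUTION_SD",
}
DEFAULT_QUALITY = {"full": 85, "thumbnail": 75}
DEFAULT_TIMEOUTS = {
    "CONVERSION_PPTX_TO_PDF_TIMEOUT_SECONDS": 180,
    "CONVERSION_PDF_TO_IMAGES_TIMEOUT_SECONDS": 1800,
}


def resolve_output(kind, options):
    """Fill in resolution and JPEG quality for kind 'full' or 'thumbnail'."""
    options = options or {}
    resolution = options.get("resolution") or "CONVERSION_RESOLUTION_UNSPECIFIED"
    if resolution == "CONVERSION_RESOLUTION_UNSPECIFIED":
        resolution = DEFAULT_RESOLUTION[kind]
    # A quality of 0, or no jpeg block at all, means "use the default".
    quality = (options.get("jpeg") or {}).get("quality") or 0
    if quality == 0:
        quality = DEFAULT_QUALITY[kind]
    if not 1 <= quality <= 100:
        # Assumed behavior: values outside 1..100 are rejected.
        raise ValueError(f"jpeg quality must be 1..100, got {quality}")
    return {"resolution": resolution, "jpeg": {"quality": quality}}


def resolve_timeouts(env):
    """Read both timeout variables, falling back to the documented defaults."""
    return {
        name: int(env.get(name) or default)
        for name, default in DEFAULT_TIMEOUTS.items()
    }
```

For a large deck that times out during rasterization, raising CONVERSION_PDF_TO_IMAGES_TIMEOUT_SECONDS or choosing a smaller full-size preset are the two knobs this section points to.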