
STARRY Python ML Services

Unified ML inference services for the STARRY music notation recognition system.

Architecture

This module provides lightweight wrappers around serialized ML models:

  • TorchScript (.pt) for PyTorch models (layout, mask, semantic, gauge, loc)
  • SavedModel for TensorFlow models (ocr, brackets)
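Since the two formats differ on disk (a TorchScript model is a single `.pt` file, a SavedModel is a directory containing `saved_model.pb`), the wrapper layer only needs to route each model path to the right loader. A hypothetical helper illustrating that split (not part of the module; `loader_kind` is an illustrative name):

```python
from pathlib import Path

def loader_kind(model_path: str) -> str:
    """Route a model path to the matching loader (hypothetical helper).

    TorchScript models ship as single .pt files (loaded with torch.jit.load);
    SavedModels are directories holding saved_model.pb (tf.saved_model.load).
    """
    return "torchscript" if Path(model_path).suffix == ".pt" else "savedmodel"
```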

Services

Service    Port    Framework    Description
layout     12022   PyTorch      Page layout detection (intervals, rotation)
gauge      12023   PyTorch      Staff gauge prediction (height, slope)
mask       12024   PyTorch      Staff foreground/background mask
semantic   12025   PyTorch      Symbol semantic detection (77 classes)
loc        12026   PyTorch      Text location detection (13 categories)
ocr        12027   TensorFlow   Text recognition (DenseNet-CTC)
brackets   12028   TensorFlow   Bracket sequence recognition

Installation

pip install -r requirements.txt

Model Export

Before using the services, the models must be exported from their original projects.

PyTorch Models (TorchScript)

Run in the deep-starry environment:

cd /home/camus/work/deep-starry
python /path/to/scripts/export_torchscript.py \
    --mode layout \
    --config configs/your-config \
    --output models/layout.pt
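The core of such an export is compiling the model and saving the result; a minimal sketch (hypothetical — the real export_torchscript.py presumably also rebuilds the model from --config first):

```python
import torch
import torch.nn as nn

def export_torchscript(model: nn.Module, output_path: str) -> None:
    """Compile an eval-mode model to TorchScript and save it as a .pt file."""
    model.eval()                        # freeze dropout / batch-norm behavior
    scripted = torch.jit.script(model)  # or torch.jit.trace(model, example_input)
    scripted.save(output_path)
```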

TensorFlow Models (SavedModel)

Run in the starry-ocr environment:

cd /home/camus/work/starry-ocr
python /path/to/scripts/export_tensorflow.py \
    --mode ocr \
    --config pretrained/OCR_Test/config.yaml \
    --output models/ocr_savedmodel
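On the TensorFlow side, the export amounts to writing the model in SavedModel format; a minimal sketch (hypothetical — the real export_tensorflow.py likely reconstructs the model from the config first):

```python
import tensorflow as tf

def export_savedmodel(model: tf.Module, output_dir: str) -> None:
    """Write a model to the SavedModel directory format.

    output_dir will contain saved_model.pb plus a variables/ folder,
    which can later be restored with tf.saved_model.load().
    """
    tf.saved_model.save(model, output_dir)
```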

Usage

Start a Service

python main.py -m layout -w models/layout.pt -p 12022 -dv cuda

With Configuration File

python main.py -m semantic -w models/semantic.pt -p 12025 --config config/semantic.yaml

Client Example (Python)

import zmq
from msgpack import packb, unpackb

# Connect
ctx = zmq.Context()
sock = ctx.socket(zmq.REQ)
sock.connect("tcp://localhost:12022")

# Send request
with open('image.png', 'rb') as f:
    image_buffer = f.read()

sock.send(packb({
    'method': 'predict',
    'args': [[image_buffer]],
    'kwargs': {}
}))

# Receive response
result = unpackb(sock.recv())
print(result)

Directory Structure

python-services/
├── common/
│   ├── __init__.py
│   ├── zero_server.py      # ZeroMQ server
│   ├── image_utils.py      # Image processing utilities
│   └── transform.py        # Data transformation pipeline
├── predictors/
│   ├── __init__.py
│   ├── torchscript_predictor.py   # PyTorch loader
│   └── tensorflow_predictor.py    # TensorFlow loader
├── services/
│   ├── __init__.py
│   ├── layout_service.py
│   ├── mask_service.py
│   ├── semantic_service.py
│   ├── gauge_service.py
│   ├── loc_service.py
│   ├── ocr_service.py
│   └── brackets_service.py
├── config/
│   └── semantic.yaml       # Example configuration
├── scripts/
│   ├── export_torchscript.py
│   └── export_tensorflow.py
├── main.py                 # Unified entry point
├── requirements.txt
└── README.md

Docker Deployment

Prerequisites

  1. Docker with NVIDIA GPU support (nvidia-docker2 / nvidia-container-toolkit)
  2. User must be in the docker group:
    sudo usermod -aG docker $USER
    # Re-login to apply group membership

Quick Start

cd backend/python-services

# Build all-in-one image
docker build -f Dockerfile --target all-in-one -t starry-ml:latest ../../..

# Run layout service (example)
docker run --gpus all -p 12022:12022 \
  -v /path/to/models/starry-dist:/models/starry-dist:ro \
  -v /path/to/deep-starry:/app/deep-starry:ro \
  starry-ml:latest \
  python /app/deep-starry/streamPredictor.py \
  /models/starry-dist/20221125-scorelayout-1121-residue-u-d4-w64-d4-w64 \
  -p 12022 -dv cuda -m layout

Using Docker Compose

# Test single service
docker compose -f docker-compose.test.yml up layout

# Production: all services
docker compose up -d

Model Volumes

Mount the following directories into the container:

  • starry-dist/ - PyTorch model weights (layout, mask, semantic, gauge)
  • ocr-dist/ - TensorFlow/PyTorch weights (loc, ocr, brackets)

Build Targets

The Dockerfile provides multiple build targets:

Target                Description                                    Size
pytorch-services      PyTorch only (layout, mask, semantic, gauge)   ~5GB
tensorflow-services   TensorFlow only (ocr, brackets)                ~4GB
all-in-one            Both frameworks (all services)                 ~9GB


Protocol

Communication uses ZeroMQ REP/REQ pattern with MessagePack serialization.

Request Format

{
    'method': 'predict',
    'args': [[buffer1, buffer2, ...]],  # List of image byte buffers
    'kwargs': {}                         # Optional keyword arguments
}

Response Format

{
    'code': 0,           # 0 for success, -1 for error
    'msg': 'success',
    'data': [...]        # List of prediction results
}
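Given these formats, a client's encoding helpers might look like this (a sketch assuming the msgpack-python package; `make_request` and `parse_response` are illustrative names, not part of the module):

```python
from msgpack import packb, unpackb

def make_request(image_buffers, **kwargs) -> bytes:
    """Serialize a predict call in the service's request format."""
    return packb({
        'method': 'predict',
        'args': [list(image_buffers)],  # one batch of image byte buffers
        'kwargs': kwargs,
    })

def parse_response(raw: bytes):
    """Deserialize a response, raising on a non-zero error code."""
    response = unpackb(raw)
    if response['code'] != 0:
        raise RuntimeError(response['msg'])
    return response['data']
```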

Notes

  1. TorchScript Compatibility: Some dynamic operations may not be supported. Test models after export.

  2. Preprocessing Consistency: Ensure preprocessing matches the original implementation exactly.

  3. TensorFlow Version: loading a SavedModel requires a compatible TensorFlow version.

  4. GPU Memory: TensorFlow models use memory growth to prevent OOM. Configure as needed.
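Note 1 can be automated with a small smoke test run right after export (a sketch; the default input shape is a placeholder you would set per model):

```python
import torch

def smoke_test(model_path: str, input_shape=(1, 1, 256, 256)) -> bool:
    """Load an exported TorchScript model and run one dummy forward pass.

    The default input_shape is a placeholder; use each model's real
    expected input dimensions.
    """
    model = torch.jit.load(model_path, map_location="cpu")
    model.eval()
    with torch.no_grad():
        output = model(torch.zeros(input_shape))
    return output is not None
```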