# MinerU RunPod Serverless Deployment

## Overview

This deployment bakes the MinerU models directly into the Docker image, so RunPod Serverless workers do not have to download them at cold start.
## Build and Deploy

### 1. Build Docker Image

```bash
./build_runpod.sh
```

The script:

- Builds the Docker image with all MinerU models included
- Downloads the models during the build (about 10-15 minutes)
- Produces a Docker image of approximately 5-10 GB
### 2. Push to Docker Hub

```bash
docker login
docker push marcosremar2/mineru-runpod:latest
```
### 3. Deploy on RunPod

1. Go to [RunPod Serverless](https://www.runpod.io/console/serverless)
2. Click "New Template"
3. Configure:
   - **Container Image**: `marcosremar2/mineru-runpod:latest`
   - **Container Disk**: 20 GB (headroom above the 5-10 GB image)
   - **Volume Size**: 0 GB (not needed; the models are in the image)
   - **GPU**: Any GPU with 8 GB+ VRAM
   - **Max Workers**: Based on your needs
   - **Idle Timeout**: 5 seconds
   - **Execution Timeout**: 120 seconds
### 4. Test the Deployment

```bash
python test_runpod.py test.pdf https://api.runpod.ai/v2/YOUR_ENDPOINT_ID YOUR_API_KEY
```
## API Usage

### Request Format

```json
{
  "input": {
    "pdf_base64": "base64_encoded_pdf_content",
    "filename": "document.pdf"
  }
}
```
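As a minimal sketch of how a client might assemble this request, the snippet below base64-encodes the PDF bytes and wraps them in the `input` object shown above. The helper names `build_payload` and `build_request` are hypothetical, and the `/runsync` route with a `Bearer` authorization header is an assumption about how the endpoint is called; it is not taken from `test_runpod.py`, which may work differently.

```python
import base64
import json
import urllib.request


def build_payload(pdf_bytes: bytes, filename: str) -> str:
    """Return the JSON request body for one PDF (hypothetical helper)."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return json.dumps({"input": {"pdf_base64": encoded, "filename": filename}})


def build_request(endpoint_url: str, api_key: str, body: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request.

    Assumption: the endpoint accepts synchronous jobs at <endpoint_url>/runsync
    and authenticates with an "Authorization: Bearer <key>" header.
    """
    return urllib.request.Request(
        endpoint_url.rstrip("/") + "/runsync",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder bytes stand in for a real PDF read with open(path, "rb").read()
body = build_payload(b"%PDF-1.4 placeholder", "document.pdf")
request = build_request("https://api.runpod.ai/v2/YOUR_ENDPOINT_ID", "YOUR_API_KEY", body)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access or credentials.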
### Response Format

```json
{
  "output": {
    "markdown": "# Converted Document\n\nContent here...",
    "filename": "document.pdf",
    "status": "success",
    "pages": 5
  }
}
```
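A client would typically check `status` before using `markdown`. The sketch below does that for a response shaped like the example above. The function name `extract_markdown` is hypothetical, and because the shape of a failed response is not documented here, anything other than `"status": "success"` is simply treated as an error.

```python
import json


def extract_markdown(response_text: str) -> str:
    """Return the converted Markdown, or raise if the job did not succeed."""
    result = json.loads(response_text)
    output = result.get("output") or {}
    # Assumption: any status other than "success" means the conversion failed.
    if output.get("status") != "success":
        raise RuntimeError(f"conversion failed: {output!r}")
    return output["markdown"]


# Example using the documented response shape
sample = json.dumps({
    "output": {
        "markdown": "# Converted Document\n\nContent here...",
        "filename": "document.pdf",
        "status": "success",
        "pages": 5,
    }
})
markdown = extract_markdown(sample)
```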
## Cost Estimation

- **Cold Start**: ~5-10 seconds (models already in image)
- **Processing**: ~10-30 seconds per PDF
- **GPU Cost**: ~$0.00024/second
- **Total per PDF**: ~$0.01-0.02
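The per-PDF figure can be sanity-checked by multiplying billed seconds by the per-second rate. The sketch below uses only the numbers listed above and assumes cold-start seconds are billed at the same rate as processing seconds; the function name `cost_per_pdf` is illustrative.

```python
GPU_COST_PER_SECOND = 0.00024  # approximate rate from the list above


def cost_per_pdf(processing_s: float, cold_start_s: float = 0.0,
                 rate: float = GPU_COST_PER_SECOND) -> float:
    """Estimated cost in dollars for one PDF on one worker."""
    return (processing_s + cold_start_s) * rate


low = cost_per_pdf(10)        # warm worker, 10 s of processing
high = cost_per_pdf(30, 10)   # cold worker: 10 s start plus 30 s of processing
print(f"${low:.4f} to ${high:.4f}")
```

Under these assumptions the estimate runs from about $0.0024 to about $0.0096 per PDF, so the quoted total of ~$0.01-0.02 presumably leaves headroom for idle-timeout seconds and larger documents.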
## Optimization Tips

1. **Reduce Image Size**: Remove unnecessary models from the Dockerfile
2. **Use Active Workers**: For consistent load, keep 1-2 active workers
3. **Adjust Timeout**: Increase the execution timeout for larger PDFs
4. **Monitor Usage**: Use the RunPod dashboard to track costs
## Troubleshooting

1. **Out of Memory**: Use a larger GPU (16 GB+ VRAM)
2. **Timeout**: Increase the execution timeout in the template
3. **Model Loading**: Check the `MINERU_MODEL_PATH` environment variable
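For the model-loading case, a quick check run inside the container can show whether the variable is set and points somewhere usable. This is a rough sketch that assumes `MINERU_MODEL_PATH` names a directory containing the model files; the expected layout inside that directory is not specified here, so the check only looks for a non-empty directory. The function name `check_model_path` is hypothetical.

```python
import os
from pathlib import Path


def check_model_path(env=os.environ) -> str:
    """Describe the state of MINERU_MODEL_PATH in the given environment."""
    value = env.get("MINERU_MODEL_PATH")
    if not value:
        return "MINERU_MODEL_PATH is not set"
    path = Path(value)
    # Assumption: the variable should point at a directory of model files.
    if not path.is_dir():
        return f"MINERU_MODEL_PATH={value} is not a directory"
    if not any(path.iterdir()):
        return f"MINERU_MODEL_PATH={value} is empty"
    return f"MINERU_MODEL_PATH={value} looks populated"


print(check_model_path())
```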