✅ Multi-model T2V ✅ GPU offload & BF16 ✅ Parallel batch processing ✅ Prometheus metrics ✅ Docker-based deployment ✅ Pydantic-based config ✅ S3 integration for MP4s ✅ Minimal code, easy to extend
- Introduction
- Quick Start
- Usage Examples
- Features
- Prompt Engineering
- Docker Support
- Monitoring & Logging
- License
Introduction
Daifuku is a versatile framework for serving multiple Text-to-Video (T2V) models such as Mochi and LTX. It simplifies T2V model deployment by providing:
- A unified API for multiple models.
- Support for parallel batch processing.
- GPU optimizations for efficiency.
- Easy Docker-based deployment.
- Integrated monitoring and logging.
Quick Start
Follow these steps to set up Daifuku locally:
git clone https://github.com/YourUserName/Daifuku.git
cd Daifuku
# Install uv and create a virtual environment
pip install uv
uv venv .venv
source .venv/bin/activate
# Install dependencies
uv pip install -r requirements.txt
uv pip install -e . --no-build-isolation
Optional: download the Mochi weights ahead of time for a faster first run:
python scripts/download_weights.py
Note: LTX weights download automatically on first usage.
Usage Examples
Daifuku can serve models individually or combine them behind one endpoint:
Mochi-Only Server
python api/mochi_serve.py
# Endpoint: http://127.0.0.1:8000/api/v1/video/mochi
LTX-Only Server
python api/ltx_serve.py
# Endpoint: http://127.0.0.1:8000/api/v1/video/ltx
Combined Server
python api/serve.py
# Endpoint: http://127.0.0.1:8000/predict
# Specify "model_name": "mochi" or "model_name": "ltx" in the request payload.
Mochi Example
import requests
url = "http://127.0.0.1:8000/api/v1/video/mochi"
payload = {
"prompt": "A serene beach at dusk, gentle waves, dreamy pastel colors",
"num_inference_steps": 40,
"guidance_scale": 4.0,
"height": 480,
"width": 848,
"num_frames": 120,
"fps": 10
}
response = requests.post(url, json=payload)
print(response.json())
LTX Example
import requests
url = "http://127.0.0.1:8000/api/v1/video/ltx"
payload = {
"prompt": "A cinematic scene of autumn leaves swirling around the forest floor",
"negative_prompt": "blurry, worst quality",
"num_inference_steps": 40,
"guidance_scale": 3.0,
"height": 480,
"width": 704,
"num_frames": 121,
"frame_rate": 25
}
response = requests.post(url, json=payload)
print(response.json())
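With S3 integration enabled, the response is expected to include a signed URL for the generated .mp4. Continuing the example above, a minimal download sketch, assuming a hypothetical video_url field (check the actual response schema):
result = response.json()
# "video_url" is a hypothetical key; inspect the real response for the exact field.
video_url = result.get("video_url")
if video_url:
    with open("output.mp4", "wb") as f:
        f.write(requests.get(video_url).content)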
Process multiple requests simultaneously with Daifuku's parallel capabilities:
curl -X POST http://127.0.0.1:8000/predict \
-H "Content-Type: application/json" \
-d '{
"batch": [
{
"model_name": "mochi",
"prompt": "A calm ocean scene, sunrise, realistic",
"num_inference_steps": 40
},
{
"model_name": "ltx",
"prompt": "A vintage film style shot of the Eiffel Tower",
"height": 480,
"width": 704
}
]
}'
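The batch endpoint handles fan-out server-side; for comparison, individual requests can also be parallelized client-side with a thread pool. A sketch reusing the payloads above:
import requests
from concurrent.futures import ThreadPoolExecutor

# Fire several single-model requests concurrently from the client side.
jobs = [
    ("http://127.0.0.1:8000/api/v1/video/mochi",
     {"prompt": "A calm ocean scene, sunrise, realistic", "num_inference_steps": 40}),
    ("http://127.0.0.1:8000/api/v1/video/ltx",
     {"prompt": "A vintage film style shot of the Eiffel Tower", "height": 480, "width": 704}),
]
with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    futures = [pool.submit(requests.post, url, json=payload) for url, payload in jobs]
    for future in futures:
        print(future.result().json())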
Features
- Multi-Model T2V: Serve Mochi or LTX individually or unify them under one endpoint.
- Parallel Batch Processing: Handle multiple requests concurrently for high throughput.
- GPU Optimizations: BF16 precision, attention slicing, and VAE tiling for efficient GPU use.
- Prometheus Metrics: Monitor request latency, GPU usage, and more.
- S3 Integration: Automatically upload .mp4 files to Amazon S3 and return signed URLs.
- Pydantic Config: Configurable schemas for Mochi (mochi_settings.py), LTX (ltx_settings.py), and combined setups; a rough sketch follows this list.
- Advanced Logging: Uses Loguru for detailed, structured logging.
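As an illustration of the Pydantic-based config, a settings schema might look like the following. The field names and defaults here are hypothetical (borrowed from the Mochi example payload); the real schemas live in mochi_settings.py and ltx_settings.py:
from pydantic import BaseModel

# Hypothetical schema for illustration only; see mochi_settings.py for the real one.
class MochiSettings(BaseModel):
    num_inference_steps: int = 40
    guidance_scale: float = 4.0
    height: int = 480
    width: int = 848
    num_frames: int = 120
    fps: int = 10

print(MochiSettings())  # defaults, overridable per field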
Prompt Engineering
- Mochi:
  - Optimized for creative and artistic prompts.
  - Recommended: ~50 steps, guidance scale 4.0–7.5, resolution up to 768×768.
- LTX:
  - Ideal for cinematic or photo-realistic scenes.
  - Recommended: height and width in multiples of 32, frame counts of the form 8n+1 (e.g., 121, 161), guidance scale ~3.0; a quick client-side check is sketched below.
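A small helper can verify the LTX constraints before a request is sent. This is a client-side convenience sketch, not part of Daifuku's API:
def check_ltx_params(height: int, width: int, num_frames: int) -> None:
    # LTX prefers spatial dims in multiples of 32 and frame counts of the form 8n + 1.
    if height % 32 or width % 32:
        raise ValueError("height and width must be multiples of 32")
    if num_frames % 8 != 1:
        raise ValueError("num_frames must be of the form 8n + 1 (e.g., 121, 161)")

check_ltx_params(height=480, width=704, num_frames=121)  # matches the LTX example above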
Docker Support
Daifuku provides a Dockerfile for streamlined deployment:
docker build -t daifuku -f DockerFileFolder/Dockerfile .
docker run --gpus all -p 8000:8000 daifuku
Modify the CMD in the Dockerfile to switch between Mochi, LTX, or combined server modes.
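Alternatively, if the image starts the server via a plain python CMD, the mode can be switched at run time by overriding the command (a sketch; adjust to the actual entrypoint):
docker run --gpus all -p 8000:8000 daifuku python api/ltx_serve.py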
Monitoring & Logging
Key metrics include:
- GPU memory usage (allocated & peak).
- Inference duration (histogram).
- Request throughput.
Metrics endpoints:
- Mochi: /api/v1/metrics
- LTX: /api/v1/metrics
- Combined: /metrics
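To spot-check the exporter, the endpoint can be scraped directly. A minimal sketch; point it at the endpoint for your server mode, and note the actual metric names may differ:
import requests

# Prometheus exposition format is plain text, one sample per line.
text = requests.get("http://127.0.0.1:8000/metrics").text
for line in text.splitlines():
    if not line.startswith("#"):  # skip HELP/TYPE comment lines
        print(line)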
- Logs rotate at 100 MB and are retained for up to 1 week.
- Find logs in:
  - logs/api.log (Mochi)
  - logs/ltx_api.log (LTX)
  - logs/combined_api.log (Combined)
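The rotation and retention policy corresponds to a Loguru configuration along these lines (a sketch, not Daifuku's exact code):
from loguru import logger

# Rotate at 100 MB; keep rotated files for one week (matches the policy above).
logger.add("logs/api.log", rotation="100 MB", retention="1 week")
logger.info("Mochi server started")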
License
Daifuku is licensed under the MIT License.