Setting up from scratch? Complete "Run your first orchestrator"
first, then come back here. AI pipelines require Linux.
1. Check your available VRAM
AI inference runs in a separate Docker container. If it shares a GPU with transcoding, VRAM is split between them. Check what’s free and compare it with the minimum each pipeline needs:

| Pipeline | Min VRAM |
|---|---|
| image-to-text | 4 GB |
| segment-anything-2 | 6 GB |
| llm (quantized 7–8B) | 8 GB |
| audio-to-text (Whisper) | 12 GB |
| image-to-video | 16 GB+ |
| image-to-image | 20 GB |
| text-to-image (SD/SDXL) | 24 GB |
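As a rough sketch of this check, the script below asks `nvidia-smi` for free memory per GPU and lists the pipelines from the table that would fit. The function names are illustrative and not part of any Livepeer tooling. It assumes the NVIDIA driver is installed, and it treats the MiB values that `nvidia-smi` reports as close enough to the table's GB figures.

```python
import subprocess

# Minimum VRAM per pipeline in GB, copied from the table above.
MIN_VRAM_GB = {
    "image-to-text": 4,
    "segment-anything-2": 6,
    "llm": 8,              # quantized 7-8B models
    "audio-to-text": 12,   # Whisper
    "image-to-video": 16,  # listed as "16 GB+", so treat this as a lower bound
    "image-to-image": 20,
    "text-to-image": 24,   # SD/SDXL
}


def pipelines_that_fit(free_vram_gb: float) -> list[str]:
    """Return the pipelines whose minimum VRAM fits in the free amount."""
    return [name for name, need in MIN_VRAM_GB.items() if need <= free_vram_gb]


def parse_free_mib(output: str) -> list[float]:
    """Convert nvidia-smi memory.free values (MiB, one line per GPU) to GiB."""
    return [int(line) / 1024 for line in output.splitlines() if line.strip()]


def query_free_vram_gb() -> list[float]:
    """Ask nvidia-smi for the free memory on each visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_free_mib(result.stdout)


if __name__ == "__main__":
    try:
        free_per_gpu = query_free_vram_gb()
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError):
        free_per_gpu = []  # no NVIDIA driver, no GPU visible, or unexpected output
    for index, free_gb in enumerate(free_per_gpu):
        print(f"GPU {index}: {free_gb:.1f} GB free -> {pipelines_that_fit(free_gb)}")
```

Run it while transcoding is active if the GPU is shared, so the free figure reflects what the AI runner will actually get.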
2. Pull the AI runner image
3. Configure aiModels.json
This file tells your node which pipelines and models to serve, what to charge, and what to keep warm
in VRAM. Create ~/.lpData/aiModels.json with at least one entry:
| Field | Required | Description |
|---|---|---|
| pipeline | Yes | Pipeline name (e.g. text-to-image, audio-to-text, llm) |
| model_id | Yes | Hugging Face model ID (must be on the Livepeer-verified list) |
| price_per_unit | Yes | Price in wei per unit |
| warm | No | If true, preload into VRAM on startup |
| capacity | No | Max concurrent requests (default 1) |
| optimization_flags | No | SFAST, DEEPCACHE |
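To make the field table concrete, here is a minimal sketch that builds one entry and renders it as JSON. It assumes the file is a JSON array of entry objects. The model ID and price are placeholders, not recommendations, so replace them with a verified model and your own price. `make_entry` and `render_config` are illustrative helpers, not part of any Livepeer tooling.

```python
import json


def make_entry(pipeline, model_id, price_per_unit, warm=None, capacity=None):
    """Build one aiModels.json entry from the fields in the table above."""
    entry = {
        "pipeline": pipeline,
        "model_id": model_id,
        "price_per_unit": price_per_unit,  # wei per unit
    }
    # Optional fields are written only when set, so node defaults still apply.
    if warm is not None:
        entry["warm"] = warm
    if capacity is not None:
        entry["capacity"] = capacity
    return entry


def render_config(entries):
    """Serialize the entries as a JSON array (assumed top-level layout)."""
    return json.dumps(entries, indent=2)


if __name__ == "__main__":
    entries = [
        make_entry(
            pipeline="text-to-image",
            model_id="your-org/your-model",  # placeholder: use a verified model ID
            price_per_unit=1000,             # placeholder price in wei
            warm=True,
        )
    ]
    # Print instead of writing, so an existing ~/.lpData/aiModels.json is not overwritten.
    print(render_config(entries))
```

Redirect the output to ~/.lpData/aiModels.json once the entry looks right.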
4. Enable the AI worker
Add three flags to your startup command:

| Flag | What it does |
|---|---|
| -aiWorker | Enables the AI worker; without it, all AI config is ignored |
| -aiModels | Path to aiModels.json |
| -aiModelsDir | Host directory holding cached model weights |
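The sketch below shows how the three flags might be appended to an existing startup command. The base command (`livepeer -orchestrator`) and both paths are assumptions for illustration only, so substitute whatever your node already runs. `with_ai_flags` is a hypothetical helper.

```python
import shlex


def with_ai_flags(base_cmd, models_json, models_dir):
    """Return base_cmd with the three AI flags from the table appended."""
    return [
        *base_cmd,
        "-aiWorker",                  # switch that enables the AI worker
        "-aiModels", models_json,     # path to aiModels.json
        "-aiModelsDir", models_dir,   # host directory for cached model weights
    ]


if __name__ == "__main__":
    command = with_ai_flags(
        ["livepeer", "-orchestrator"],         # assumed existing startup command
        "/home/user/.lpData/aiModels.json",    # hypothetical absolute path
        "/home/user/.lpData/models",           # hypothetical weights directory
    )
    # Print a shell-safe command line to paste into your service file or script.
    print(shlex.join(command))
```

Building the command as a list keeps each path a single argument even if it contains spaces.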
Use port 8936 to avoid clashing with the transcoding orchestrator on 8935.
5. Verify AI is active
Within seconds of startup you should see a managed-container log line for each warm model.
… images array. Finally, confirm your pipelines appear externally at
tools.livepeer.cloud/ai/network-capabilities
(search your orchestrator address; allow 2–5 minutes).
If jobs still don’t arrive, check that aiModels.json is valid, that the model_id matches a verified model,
and that the runner is reachable. See the AI troubleshooting entries in the FAQ.
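For the first of those checks, a small validator can catch the most common mistakes before you restart the node. This is a sketch under two assumptions: the file is a JSON array of entries, and the required fields are the ones marked "Yes" in step 3. It cannot tell whether a model_id is on the verified list. `check_ai_models` is an illustrative name.

```python
import json
import sys

REQUIRED_FIELDS = ("pipeline", "model_id", "price_per_unit")


def check_ai_models(text):
    """Return a list of problems found in aiModels.json content (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, list):
        return ["top level should be a JSON array of entries"]  # assumed layout

    problems = []
    for index, entry in enumerate(data):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: not a JSON object")
            continue
        for field in REQUIRED_FIELDS:
            if field not in entry:
                problems.append(f"entry {index}: missing required field '{field}'")
        if "warm" in entry and not isinstance(entry["warm"], bool):
            problems.append(f"entry {index}: 'warm' should be true or false")
        capacity = entry.get("capacity", 1)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(capacity, bool) or not isinstance(capacity, int) or capacity < 1:
            problems.append(f"entry {index}: 'capacity' should be a positive integer")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_ai_models.py ~/.lpData/aiModels.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        for problem in check_ai_models(handle.read()):
            print(problem)
```

No output means the file passed these structural checks; it does not confirm the runner is reachable.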
Next
Set pricing
Price each pipeline and model competitively.
Hardware reference
VRAM planning and GPU selection for AI.