Livepeer orchestrators use NVIDIA GPUs for video transcoding (NVENC/NVDEC) and AI inference (CUDA / Tensor cores). AMD and Intel GPUs are not supported.

Supported GPUs

| GPU family | Transcoding | AI inference | Notes |
| --- | --- | --- | --- |
| GeForce RTX 40xx (Ada) | Yes | Yes | Best consumer option; AV1 encode |
| GeForce RTX 30xx (Ampere) | Yes | Yes | Widely used; good price/performance |
| GeForce RTX 20xx (Turing) | Yes | Yes | Supported but older |
| GeForce GTX 16xx (Turing) | Yes | Limited | No Tensor cores; AI is slower or unsupported for some pipelines |
| GeForce GTX 10xx (Pascal) | Yes | Limited | Legacy; NVENC Gen 6; no Tensor cores |
| Tesla T4 | Yes | Yes | Data center, 16 GB, common in cloud |
| Tesla V100 | Yes | Yes | Data center, 16/32 GB |
| A100 | No (NVDEC only) | Yes | Data center, 40/80 GB, highest throughput; has no NVENC encoder |
| A10 / A10G | Yes | Yes | Cloud-optimized (AWS G5), 24 GB |
| L4 | Yes | Yes | Ada data center, 24 GB, good for AI |
| L40 / L40S | Yes | Yes | 48 GB, high-end AI + transcoding |
| H100 | No (NVDEC only) | Yes | 80 GB, primarily LLM / large-model inference; has no NVENC encoder |
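
To place a machine in the table above, the GPU model and total memory can be read from `nvidia-smi`. The following is a minimal sketch, not part of the Livepeer tooling: `summarize_gpus` is a hypothetical helper name, and treating every card with "GTX" in its name as lacking Tensor cores is a simplification based on the GTX rows of the table.

```shell
# Hypothetical helper: reads "name, memory.total" rows (CSV, no header, no units)
# on stdin and prints one summary line per GPU.
summarize_gpus() {
  while IFS=, read -r name mem_mib; do
    [ -n "$name" ] || continue
    mem_mib=$(printf '%s' "$mem_mib" | tr -d ' ')   # drop the space after the comma
    mem_gb=$(( (mem_mib + 512) / 1024 ))            # MiB -> GB, rounded to nearest
    case "$name" in
      *GTX*) ai="limited (no Tensor cores)" ;;      # assumption: GTX means no Tensor cores
      *)     ai="see table" ;;
    esac
    echo "$name: ${mem_gb} GB, AI inference: $ai"
  done
}

# On a host with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits | summarize_gpus

# Sample input in the same format, so the sketch runs without a GPU:
printf 'NVIDIA GeForce RTX 3090, 24576\nNVIDIA GeForce GTX 1660 SUPER, 6144\n' | summarize_gpus
```

The helper only parses text, so it can be checked against saved `nvidia-smi` output from another machine before it is used on a live host.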

NVENC session limits

Consumer GPUs cap concurrent NVENC encode sessions, which limits simultaneous transcode streams per GPU.

| GPU class | Default NVENC sessions | Notes |
| --- | --- | --- |
| GeForce GTX 10xx | 2 | Can be patched |
| GeForce GTX 16xx | 3 | Can be patched |
| GeForce RTX 20xx | 3 | Can be patched |
| GeForce RTX 30xx | 3–5 (by model) | Can be patched |
| GeForce RTX 40xx | 3–8 (by model) | Can be patched |
| Tesla / Quadro / A-series | Unlimited | No session limit |

The community nvidia-patch removes the limit on consumer GPUs and is widely used by orchestrators.
Patching modifies a system binary, is unsupported by NVIDIA, must be re-applied after driver updates, and may be disallowed on some managed cloud GPU instances.
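
The effect of the session cap on stream capacity can be estimated with simple integer division. This is a rough sketch under one stated assumption: each output rendition of a stream occupies one NVENC session, so a stream with three renditions uses three sessions. `max_streams` is a hypothetical helper name, and the rendition counts are illustrative.

```shell
# Hypothetical helper: full streams that fit under a per-GPU session cap,
# ASSUMING each output rendition occupies one NVENC session.
max_streams() {
  session_cap=$1
  renditions=$2
  if [ "$renditions" -le 0 ]; then
    echo "renditions must be positive" >&2
    return 1
  fi
  echo $(( session_cap / renditions ))
}

max_streams 3 3   # 3-session cap, 3 renditions per stream
max_streams 8 3   # 8-session cap, 3 renditions per stream

# Sessions currently open on each GPU (nvidia-smi query field):
#   nvidia-smi --query-gpu=name,encoder.stats.sessionCount --format=csv
```

Under that assumption an unpatched 3-session card handles a single three-rendition stream at a time, which is why the cap matters for transcoding throughput.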

VRAM by workload

| Workload | Minimum VRAM | Recommended | Notes |
| --- | --- | --- | --- |
| Video transcoding only | 4 GB | 8 GB | NVENC/NVDEC uses minimal VRAM |
| Batch AI (single warm model) | 8 GB | 16 GB | SDXL needs ~7 GB |
| Batch AI (multiple warm models) | 16 GB | 24 GB+ | Each warm model consumes VRAM simultaneously |
| LLM inference (quantized) | 8 GB | 16 GB | Via Ollama runner, quantized weights |
| LLM inference (full precision) | 24 GB+ | 48 GB+ | Large models at full precision |
| Real-time AI (ComfyStream) | 12 GB | 16 GB+ | Latency-sensitive; headroom improves stability |
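
Because warm models hold their VRAM at the same time, a quick budget check is to add up the per-model footprints plus some headroom and compare the sum with the card's total. The sketch below is illustrative only: `fits_in_vram` is a hypothetical helper, the 2 GB headroom is an assumed value, and the ~7 GB figure for SDXL from the table is reused for every model purely as an example.

```shell
# Hypothetical helper: does a set of warm models fit in VRAM?
# Usage: fits_in_vram TOTAL_GB HEADROOM_GB MODEL_GB...
fits_in_vram() {
  total=$1
  headroom=$2
  shift 2
  needed=$headroom
  for model_gb in "$@"; do
    needed=$(( needed + model_gb ))   # warm models occupy VRAM simultaneously
  done
  if [ "$needed" -le "$total" ]; then
    echo "fits: ${needed} GB needed of ${total} GB"
  else
    echo "over budget: ${needed} GB needed of ${total} GB"
    return 1
  fi
}

# Three ~7 GB models with 2 GB headroom (assumed) on a 24 GB card:
fits_in_vram 24 2 7 7 7

# The same set on a 16 GB card:
fits_in_vram 16 2 7 7 7 || echo "use fewer warm models or a larger GPU"

# Actual usage on a live host:
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv
```

Real footprints vary by model, precision, and pipeline, so measured `memory.used` values should replace the example numbers.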

Driver & toolkit versions

| Component | Minimum | Notes |
| --- | --- | --- |
| NVIDIA driver | 525+ | |
| CUDA toolkit | 12.0+ | |
| NVIDIA Container Toolkit | Latest | Required for Docker (AI runner, containerized orchestrator) |

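To check what is installed:
nvidia-smi          # driver version (also shows the highest CUDA version the driver supports)
nvcc --version      # installed CUDA toolkit version
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi   # Docker GPU access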
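
The driver minimum can also be checked in a script by comparing the major version number with 525. This is a minimal sketch: `driver_ok` is a hypothetical helper, and it assumes the version string has the usual `major.minor[.patch]` form that `nvidia-smi` reports.

```shell
# Hypothetical helper: succeeds if the driver's major version meets the minimum.
# Usage: driver_ok VERSION [MINIMUM]   (MINIMUM defaults to 525, per the table above)
driver_ok() {
  version=$1                 # e.g. "535.104.05"
  minimum=${2:-525}
  major=${version%%.*}       # keep only the part before the first dot
  [ "$major" -ge "$minimum" ]
}

# Example with a literal version string, so the sketch runs without a GPU:
if driver_ok "535.104.05"; then
  echo "driver meets the 525+ minimum"
fi

# On a live host, feed in the real value:
#   driver_ok "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
```

The function returns an exit status rather than printing, so it can gate a startup script with `driver_ok "$v" || exit 1`.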

GPU selection guidance

| Goal | Pick |
| --- | --- |
| Transcoding only, budget | GTX 1660 Super (6 GB); patch the NVENC limit for more sessions |
| Transcoding + AI | RTX 4070 Ti Super (16 GB) or RTX 3090 (24 GB); 24 GB runs 2–3 warm models plus transcoding |
| AI-heavy / LLM | RTX 4090 (24 GB), or A100 / L40S in a data center |

Configure your orchestrator

GPU selection and session-limit flags.

Add AI inference

Match a pipeline to your available VRAM.