HunyuanVideo-I2V is a customizable image-to-video generation framework from Tencent Hunyuan, built on the HunyuanVideo foundation model. Given a static reference image and an optional text prompt, it generates a video sequence that preserves the reference image's identity (especially in the first frame) and supports stylized effects via LoRA adapters. The repository includes pretrained weights, inference and sampling scripts, LoRA training code, and support for parallel inference via xDiT.
Features
- Generates videos from a single reference image by combining image tokens with video latent tokens in a full-attention architecture
- LoRA training / fine-tuning support to add special effects or customize generation
- Parallel inference support using xDiT for multi-GPU speedups
- Ensures "first-frame consistency," preserving the reference image's identity throughout the video
- Configurable inference options: resolution, video length, stability mode, flow shift, seed, CPU offload, and more
- Detailed scripts and examples (sample_image2video.py, train_image2video_lora, etc.) plus dependency-management info
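The token-combination idea in the first bullet can be illustrated with a minimal sketch. This is not HunyuanVideo-I2V's actual code (the real model operates on high-dimensional latents with multi-head attention); it is a toy single-head attention over a joint sequence, showing how prepending image tokens to the video latent tokens lets every video token attend directly to the reference image. All token shapes and values here are made up for illustration.

```python
import math

def full_attention(tokens, dim):
    # Toy single-head self-attention where Q = K = V = tokens.
    # Each output token is a softmax-weighted mix of all input tokens.
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in tokens]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        weights = [w / z for w in weights]
        out.append([sum(w * t[i] for w, t in zip(weights, tokens))
                    for i in range(dim)])
    return out

dim = 4
# Hypothetical values: image tokens carry a distinct signal (1.0),
# video latent tokens start uninformative (0.0).
image_tokens = [[1.0] * dim for _ in range(2)]  # tokens from the reference image
video_tokens = [[0.0] * dim for _ in range(6)]  # video latent tokens (e.g. T*H*W patches)

# Joint sequence: image tokens are prepended, so full attention lets every
# video token see the reference image in a single attention pass.
joint = image_tokens + video_tokens
attended = full_attention(joint, dim)
print(len(joint))  # → 8 tokens in the combined sequence
```

After attention, the video-token outputs are pulled toward the image-token values, which is the mechanism that lets the reference image condition the generated frames.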