UniVideo
UNDER CONSTRUCTION
Definition
xxxxx
French
UniVideo
English
UniVideo
UniVideo is a unified framework that combines video understanding, generation, and editing in a single model. Unlike existing approaches that handle these tasks separately, it interprets complex multimodal instructions and performs diverse video operations through a dual-stream architecture that pairs a Multimodal Large Language Model (MLLM) with a Multimodal Diffusion Transformer (DiT). In this way it extends unified modeling to video generation and editing. The system is reported to achieve state-of-the-art performance across multiple video tasks, and it supports capabilities such as visual prompt understanding, task composition, and generalization.
Sources
Contributors: wiki





