Microsoft Unveils NUWA-XL: A Multimodal AI Model Generating 16-Minute Videos from Text Descriptions

TLDR: Microsoft Research Asia has introduced NUWA-XL, an advanced multimodal generative AI model capable of producing 16-minute video content from just 11 descriptive sentences. This innovation leverages a ‘diffusion over diffusion’ architecture to ensure efficiency and continuity in content creation, marking a significant leap in AI-powered animation and video production.

Microsoft Research Asia has announced NUWA-XL, a multimodal generative AI model. The model can generate videos up to 16 minutes long from just 11 sets of descriptive sentences. This is a substantial step forward for AI-powered content creation, particularly for applications such as animation production.

The NUWA-XL model is built on a ‘diffusion over diffusion’ architecture. A global diffusion model first generates key frames spanning the entire duration of the video. A local diffusion model then fills in the content between adjacent key frames. This coarse-to-fine, two-level approach speeds up generation while keeping the resulting video continuous and coherent.
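The coarse-to-fine control flow described above can be illustrated with a toy sketch. This is not Microsoft's implementation: `global_keyframes` and `local_fill` are hypothetical placeholders (a frame is a single number, and the "local model" is plain linear interpolation rather than a diffusion model), and the `n_inner` and `depth` parameters are assumptions introduced only for illustration. The sketch shows the scheduling idea alone: key frames first, then in-between frames for each adjacent pair, applied repeatedly.

```python
from typing import List

Frame = float  # toy stand-in: one number per frame instead of an image


def global_keyframes(prompts: List[str]) -> List[Frame]:
    """Hypothetical stand-in for the global model: one key frame per prompt."""
    return [float(len(p)) for p in prompts]


def local_fill(first: Frame, last: Frame, n_inner: int) -> List[Frame]:
    """Hypothetical stand-in for the local model.

    Returns n_inner frames strictly between two key frames, using
    linear interpolation in place of a real diffusion model.
    """
    step = (last - first) / (n_inner + 1)
    return [first + step * (i + 1) for i in range(n_inner)]


def generate(prompts: List[str], n_inner: int, depth: int) -> List[Frame]:
    """Coarse-to-fine generation: key frames first, then repeated fill-in."""
    frames = global_keyframes(prompts)
    if len(frames) < 2:
        return frames  # nothing to fill between
    for _ in range(depth):
        refined: List[Frame] = []
        # Each adjacent pair is filled independently of every other pair.
        for a, b in zip(frames, frames[1:]):
            refined.append(a)
            refined.extend(local_fill(a, b, n_inner))
        refined.append(frames[-1])
        frames = refined
    return frames


if __name__ == "__main__":
    video = generate(["a", "bb", "ccc"], n_inner=1, depth=1)
    print(video)
```

In this sketch, k key frames grow to (k - 1) * (n_inner + 1)^depth + 1 frames after `depth` levels of fill-in. Because each adjacent pair is processed without reference to the others, a scheme of this shape lends itself to parallel execution, which is one plausible reading of the efficiency gain the article mentions.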

This latest iteration follows Microsoft Research Asia’s earlier successes in multimodal AI. In 2021, the original NUWA (Nuwa) model was introduced, demonstrating the ability to generate text, images, and video content from natural language descriptions. Subsequently, the NUWA-Infinity version further enhanced the resolution capabilities for generated images and videos. NUWA-XL builds on these foundations, pushing the boundaries of video length and coherence from textual input.

Industry observers anticipate that the NUWA-XL model will significantly impact various sectors, most notably by streamlining and accelerating the production of animation and other video-based content.

Ananya Rao
Ananya Rao is a tech journalist with a passion for dissecting the fast-moving world of Generative AI. With a background in computer science and a sharp editorial eye, she connects the dots between policy, innovation, and business. Ananya excels in real-time reporting and specializes in uncovering how startups and enterprises in India are navigating the GenAI boom. She brings urgency and clarity to every breaking news piece she writes. You can reach her at: [email protected]
