Kling AI Unveils ‘O1’ Unified Multimodal Model and ‘Video 2.6’ with Native Audio

News Update

3 hours ago

Kling AI, a China-based generative AI company, has introduced its ‘O1’ series and ‘Video 2.6’ model, a step toward unified multimedia production. O1 is a multimodal engine that interprets text, images, and existing footage together, which helps it keep characters and objects consistent. Complementing this, Video 2.6 introduces “Native Audio,” which generates dialogue, music, and ambient sound effects synchronized with visual motion in a single workflow.

Launched on December 9, 2025, the update includes ‘Avatar 2.0’ and the ‘Element Library’, which remembers specific items and characters for multi-shot stability. The model supports human voices ranging from whispers to dramatic shouts, and environmental sounds like fire or shattering glass. This technology is aimed at reducing production costs for advertisers and influencers by providing high-fidelity, 10-second outputs that are ready for social media and commercial use.

The system’s ability to process up to ten reference images at once makes complex image editing accessible to non-professionals. Kling AI’s rapid release cycle, including the 2.5 Turbo mode, underscores its strategy to dominate the creative production supply chain. By offering end-to-end audio-visual generation, the company aims to give global digital content creators and e-commerce sellers a faster, more cohesive creative workflow.

