Kling Video O1 is Kuaishou's unified multimodal video model, the successor to the Kling 2.x line. Built on a Multimodal Visual Language (MVL) framework, it consolidates text-to-video, image-to-video, reference-to-video, start/end-frame control, in-painting, style re-rendering, and shot extension into a single engine. It generates clips of 5 or 10 seconds at up to 1080p in 16:9, 9:16, or 1:1 aspect ratios, with strong identity retention for characters, props, and environments under dynamic camera motion. Edits are specified in natural language: an instruction such as removing an object or changing the lighting is applied directly as a prompt. This makes the model well suited to cinematic production, iterative creative workflows, and advertising.