Kling Video O1 is Kuaishou's unified multimodal video model and the successor to the Kling 2.x line. Built on a Multimodal Visual Language (MVL) framework, it consolidates text-to-video, image-to-video, reference-to-video, start/end-frame control, in-painting, style re-rendering, and shot extension into a single engine. It generates 5- or 10-second clips at up to 1080p in 16:9, 9:16, or 1:1 aspect ratios, with strong identity retention for characters, props, and environments under dynamic camera motion. Edits such as removing objects or changing lighting are issued as natural-language prompts, which makes the model well suited to cinematic production, advertising, and iterative creative workflows.
Vision|Proprietary Model
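
The output limits stated in the description (5- or 10-second clips, three aspect ratios, up to 1080p) can be checked on the client before a request is sent. The following is a minimal sketch of such a check, not Kling's actual API schema: `ClipRequest`, `validate`, and every field name are hypothetical, and treating "1080p" as a cap on the shorter side in pixels is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the model description above.
ALLOWED_DURATIONS_S = (5, 10)
ALLOWED_ASPECT_RATIOS = ("16:9", "9:16", "1:1")
# Assumption: "up to 1080p" is read as a cap on the shorter side, in pixels.
MAX_SHORT_SIDE_PX = 1080


@dataclass(frozen=True)
class ClipRequest:
    """Hypothetical request shape for illustration; not the real API schema."""
    prompt: str
    duration_s: int = 5
    aspect_ratio: str = "16:9"
    short_side_px: int = 1080


def validate(req: ClipRequest) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if req.duration_s not in ALLOWED_DURATIONS_S:
        errors.append(
            f"duration_s must be one of {ALLOWED_DURATIONS_S}, got {req.duration_s}"
        )
    if req.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        errors.append(
            f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}, got {req.aspect_ratio!r}"
        )
    if not 0 < req.short_side_px <= MAX_SHORT_SIDE_PX:
        errors.append(
            f"short_side_px must be in 1..{MAX_SHORT_SIDE_PX}, got {req.short_side_px}"
        )
    return errors


if __name__ == "__main__":
    ok = ClipRequest(prompt="Remove the lamp and warm up the lighting", duration_s=10)
    bad = ClipRequest(prompt="A fox running", duration_s=7, aspect_ratio="4:3")
    print(validate(ok))
    for problem in validate(bad):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid field at once, which fits the iterative prompt-and-revise workflow the description mentions.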