HappyHorse 1.0 is Alibaba Taotian Future Life Lab's flagship video generation model, holding the #1 rank on the Artificial Analysis Video Arena for both text-to-video (Elo 1381, +107 over second place) and image-to-video (Elo 1392). Its unified 15B-parameter, 40-layer self-attention Transformer generates video and audio jointly in a single forward pass with no cross-attention modules. The model supports text-to-video, image-to-video, video editing, and reference-to-video workflows at 720p or 1080p, with clip durations of 3–15 seconds and native multilingual lip-sync across English, Mandarin, Cantonese, Japanese, Korean, German, and French. Reported generation speed is about 38 seconds for a 1080p clip on a single NVIDIA H100 GPU. The model is available on fal.ai at $0.14/second (720p) and $0.28/second (1080p).
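The per-second pricing above translates into a per-clip cost with simple arithmetic. The following is a minimal sketch, assuming billing is strictly linear in clip length with no rounding, minimum charge, or extra fees (none of which the listing confirms). The function name `estimate_clip_cost` and the constants are hypothetical, introduced here only for illustration; they are not part of any fal.ai API.

```python
from decimal import Decimal

# Per-second prices (USD) as quoted in the listing above.
PRICE_PER_SECOND = {"720p": Decimal("0.14"), "1080p": Decimal("0.28")}

# Supported clip durations in seconds, as quoted in the listing above.
MIN_SECONDS, MAX_SECONDS = 3, 15


def estimate_clip_cost(resolution: str, seconds: int) -> Decimal:
    """Estimate the cost of one clip, assuming linear per-second billing."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        raise ValueError(
            f"duration must be between {MIN_SECONDS} and {MAX_SECONDS} seconds"
        )
    # Decimal avoids binary floating-point drift in currency arithmetic.
    return PRICE_PER_SECOND[resolution] * seconds


if __name__ == "__main__":
    for resolution in PRICE_PER_SECOND:
        low = estimate_clip_cost(resolution, MIN_SECONDS)
        high = estimate_clip_cost(resolution, MAX_SECONDS)
        print(f"{resolution}: ${low} to ${high} per clip")
```

Under these assumptions, a clip costs between $0.42 and $2.10 at 720p and between $0.84 and $4.20 at 1080p.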
Vision|Proprietary Model
Knowledge Cutoff: Unknown
Input → Output Format
Context Memory: N/A
Source: Official Docs; fal.ai launch deep-dive; CNBC (Alibaba revealed as creator); PR Newswire (fal launch)