MACE-Dance Framework Enables Music-Driven Dance Video Generation

HuggingFace
Tuesday, May 12, 2026
  • MACE-Dance generates music-driven dance videos using a cascaded Mixture-of-Experts architecture.
  • The framework separates tasks into a Motion Expert for 3D generation and an Appearance Expert for video synthesis.
  • MACE-Dance achieves state-of-the-art performance in 3D dance generation and pose-driven image animation.

On May 11, 2026, researchers introduced MACE-Dance, a framework for generating dance videos from music. The system uses a cascaded Mixture-of-Experts (MoE) architecture to decouple video synthesis into motion generation and appearance preservation, targeting the difficulty current methods have in delivering high visual quality and realistic human movement at the same time.

The framework divides processing between two specialized components. The Motion Expert handles music-to-3D motion generation, using a BiMamba-Transformer hybrid model combined with a Guidance-Free Training (GFT) strategy to keep the generated motion kinematically plausible. The Appearance Expert then handles video synthesis, preserving the dancer's identity and spatiotemporal coherence.
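The two-stage hand-off described above can be illustrated with a minimal sketch. It shows only the cascade (music features go to a motion stage, whose 3D poses then drive an appearance stage) and is not the authors' implementation. Every name here (`MotionExpert`, `AppearanceExpert`, `ToyMotionExpert`, `ToyAppearanceExpert`, `generate_dance_video`) is hypothetical, and the toy experts use trivial arithmetic as stand-ins for the BiMamba-Transformer model and the video synthesis model.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence

# Hypothetical data shapes, chosen only for illustration.
Pose = List[float]         # flattened 3D joint coordinates for one frame
Frame = List[List[float]]  # a tiny grayscale image as rows of pixel values


class MotionExpert(Protocol):
    """Stage 1 interface: music features in, one 3D pose per time step out."""

    def generate(self, music_features: Sequence[Sequence[float]]) -> List[Pose]: ...


class AppearanceExpert(Protocol):
    """Stage 2 interface: reference image plus poses in, video frames out."""

    def render(self, reference: Frame, poses: Sequence[Pose]) -> List[Frame]: ...


@dataclass
class ToyMotionExpert:
    """Placeholder for a learned motion model: scales joints by audio energy."""

    num_joints: int = 3

    def generate(self, music_features: Sequence[Sequence[float]]) -> List[Pose]:
        poses: List[Pose] = []
        for feat in music_features:
            # Mean feature value acts as a crude "energy" for this time step.
            energy = sum(feat) / len(feat) if feat else 0.0
            poses.append([energy * (j + 1) for j in range(self.num_joints * 3)])
        return poses


@dataclass
class ToyAppearanceExpert:
    """Placeholder for a video model: shifts the reference image per pose."""

    def render(self, reference: Frame, poses: Sequence[Pose]) -> List[Frame]:
        frames: List[Frame] = []
        for pose in poses:
            offset = sum(pose) / len(pose) if pose else 0.0
            # Every frame derives from the same reference, so "identity" is kept.
            frames.append([[pixel + offset for pixel in row] for row in reference])
        return frames


def generate_dance_video(
    music_features: Sequence[Sequence[float]],
    reference: Frame,
    motion_expert: MotionExpert,
    appearance_expert: AppearanceExpert,
) -> List[Frame]:
    """Run the cascade: motion first, then appearance conditioned on motion."""
    poses = motion_expert.generate(music_features)
    return appearance_expert.render(reference, poses)


if __name__ == "__main__":
    video = generate_dance_video(
        music_features=[[0.0, 0.0], [1.0, 1.0]],
        reference=[[0.0, 1.0]],
        motion_expert=ToyMotionExpert(),
        appearance_expert=ToyAppearanceExpert(),
    )
    print(len(video), "frames generated")
```

Because the second stage depends only on the poses and a reference image, either expert could in principle be swapped or evaluated on its own, which is the practical appeal of a decoupled design like the one the article describes.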

The authors report state-of-the-art (SOTA) performance in both 3D dance generation and pose-driven image animation. To support these results, they curated a new large-scale dataset and established a dedicated motion-appearance evaluation protocol.

Read original (English)·May 12, 2026
#mace dance #video generation #mixture of experts #bimamba #motion synthesis