LongCat Flash Chat is a large-scale Mixture-of-Experts model from Meituan with 560 billion total parameters, dynamically activating 18.6B to 31.3B (averaging ~27B) based on contextual demands. Its shortcut-connected MoE design achieves over 100 tokens per second during inference while supporting a 128K-token context window. The model delivers highly competitive performance in reasoning, coding, and instruction following, with exceptional strengths in agentic tasks and complex multi-step tool-use interactions.
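The variable activation budget comes from the router choosing, per token, among experts that may or may not carry compute. As a minimal illustration (not LongCat's actual implementation), the sketch below assumes a pool of "real" experts plus zero-computation identity experts: tokens whose top-k picks land on identities skip FFN work, so the number of activated parameters varies with the input.

```python
import numpy as np

# Hypothetical sketch: top-k routing over real + zero-computation experts.
# Expert ids below num_real do real FFN work; the rest are identities,
# so each token's activated-parameter count depends on its routing.

def route_tokens(hidden, router_w, num_real, k=2):
    """Return each token's top-k expert ids and its count of real experts."""
    logits = hidden @ router_w                   # (tokens, total_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]   # top-k expert ids per token
    real_hits = (topk < num_real).sum(axis=-1)   # identity experts cost ~0
    return topk, real_hits

rng = np.random.default_rng(0)
hidden = rng.standard_normal((8, 16))            # 8 tokens, hidden dim 16
router_w = rng.standard_normal((16, 12))         # 8 real + 4 zero experts
topk, real_hits = route_tokens(hidden, router_w, num_real=8, k=2)
print(real_hits)  # per-token count of compute-bearing experts, 0..k
```

Because `real_hits` differs across tokens, average activated compute sits between the all-identity floor and the all-real ceiling, which is the mechanism behind an average budget (~27B here) lying strictly inside the 18.6B to 31.3B range.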