DeepSeek V3.2 is a large-scale Mixture-of-Experts language model that combines high computational efficiency with frontier-level reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fine-grained mechanism that reduces attention complexity from quadratic to linear, significantly cutting training and inference costs in long-context scenarios. Through scalable reinforcement learning post-training, it achieves performance comparable to GPT-5, with gold-medal results on the 2025 International Mathematical Olympiad and the International Olympiad in Informatics. The model also features a large-scale agentic task synthesis pipeline that improves instruction following and tool use in complex interactive environments.
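To make the quadratic-to-linear claim concrete, the following is a minimal conceptual sketch of fine-grained sparse attention: a lightweight indexer scores every key for each query, and only the top-k keys per query enter the expensive attention computation, so that step costs O(L·k) rather than O(L²) in sequence length L. The function name, the ReLU-scored indexer, and the omission of causal masking are illustrative assumptions for brevity, not DeepSeek's released implementation.

```python
import torch

def topk_sparse_attention(q, k, v, idx_q, idx_k, top_k=64):
    """Conceptual sketch: restrict full attention to the top-k keys
    chosen per query by a cheap low-dimensional indexer (assumed form).

    q, k, v:      (L, d)      full-dimension query/key/value states
    idx_q, idx_k: (L, d_idx)  small indexer projections, d_idx << d
    """
    L, d = q.shape
    top_k = min(top_k, L)

    # Cheap indexer pass in a small dimension (assumption: ReLU-scored
    # dot products); this is the lightweight selection stage.
    index_scores = torch.relu(idx_q @ idx_k.T)        # (L, L), low-cost
    sel = index_scores.topk(top_k, dim=-1).indices    # (L, top_k) kept key ids

    # Gather only the selected keys/values for each query.
    k_sel = k[sel]                                    # (L, top_k, d)
    v_sel = v[sel]                                    # (L, top_k, d)

    # Full-precision attention restricted to the selected subset:
    # O(L * top_k) instead of O(L^2).
    logits = (q.unsqueeze(1) * k_sel).sum(-1) / d**0.5  # (L, top_k)
    attn = torch.softmax(logits, dim=-1)
    return (attn.unsqueeze(-1) * v_sel).sum(1)          # (L, d)

# Toy usage with hypothetical sizes.
L, d, d_idx = 128, 32, 8
q, k, v = torch.randn(L, d), torch.randn(L, d), torch.randn(L, d)
idx_q, idx_k = torch.randn(L, d_idx), torch.randn(L, d_idx)
out = topk_sparse_attention(q, k, v, idx_q, idx_k, top_k=16)
print(out.shape)  # torch.Size([128, 32])
```

The key design point this illustrates is the split into a cheap selection stage and an expensive-but-sparse attention stage: because top_k is a fixed constant, total attention cost grows linearly with context length, which is what drives the long-context training and inference savings described above.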