LoopCoder-v2 Introduces Efficient Two-Loop Transformer Scaling

• LoopCoder-v2, a 7B-parameter parallel loop Transformer (PLT), improves code generation by applying its shared blocks in two compute loops.
• The two-loop model scored 64.4 on SWE-bench Verified, up from 43.0 for the non-looped baseline, and 31.0 on Multi-SWE, up from 14.0.
• The researchers identified a non-monotonic loop-count effect: performance regresses beyond the second loop as positional mismatches from cross-loop position offsets outweigh the refinement gains.

On June 16, 2026, Jian Yang and a team of researchers released LoopCoder-v2, a series of 7B-parameter parallel loop Transformer (PLT) models designed for code generation. The researchers trained the models from scratch on 18T tokens and then instruction-tuned them. The study examines how to choose the loop count in PLT architectures, which scale latent computation by repeatedly applying the same shared blocks. Two-loop variants improve performance, but the study finds that three or more loops yield diminishing returns and performance regression.
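The idea of scaling computation by reapplying shared blocks can be illustrated with a minimal sketch. This is not the authors' implementation: the toy residual block and the names `make_block`, `apply_block`, `looped_forward`, and `parameter_count` are hypothetical, and the sketch leaves out attention, the parallel scheduling of loops, and cross-loop position offsets. It only shows that each extra loop adds computation while the parameter count stays fixed.

```python
import math
import random


def make_block(dim, seed=0):
    """Build one toy block: a single dim x dim weight matrix.

    Hypothetical stand-in for a shared Transformer block.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def apply_block(weights, hidden):
    """Residual update: hidden + tanh(W @ hidden)."""
    return [
        h + math.tanh(sum(w * x for w, x in zip(row, hidden)))
        for h, row in zip(hidden, weights)
    ]


def looped_forward(weights, hidden, num_loops):
    """Apply the same shared block num_loops times.

    The weights are reused on every pass, so compute grows with
    num_loops while the number of parameters does not.
    """
    for _ in range(num_loops):
        hidden = apply_block(weights, hidden)
    return hidden


def parameter_count(weights):
    """Count the scalar weights in the toy block."""
    return sum(len(row) for row in weights)


if __name__ == "__main__":
    block = make_block(dim=4)
    x = [0.1, 0.2, 0.3, 0.4]
    for loops in (1, 2, 3):
        out = looped_forward(block, x, loops)
        print(loops, parameter_count(block), [round(v, 3) for v in out])
```

In this toy setting a second loop is simply one more application of the same block to the first loop's output. Whether the extra pass helps or hurts is an empirical question, which is what the study measures.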

Empirical results show that the two-loop version substantially outperforms the non-looped baseline. On SWE-bench Verified, the score rose from 43.0 to 64.4; on Multi-SWE, it rose from 14.0 to 31.0. The authors attribute these gains to productive representation refinement in the second loop, whereas subsequent loops introduce oscillatory updates and reduce representational diversity. Cross-loop position offsets, while necessary, also introduce positional mismatches that eventually outweigh the refinement benefits, which makes the effect of loop count non-monotonic.
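To put the reported scores in perspective, the short sketch below computes the absolute and relative gains of the two-loop model over the non-looped baseline. The four scores are the ones quoted in the article; the helper name `score_gains` and the rounding to one decimal place are assumptions made for illustration.

```python
# Scores as reported in the article: (non-looped baseline, two-loop model).
REPORTED_SCORES = {
    "SWE-bench Verified": (43.0, 64.4),
    "Multi-SWE": (14.0, 31.0),
}


def score_gains(baseline, looped):
    """Return (absolute gain in points, relative gain in percent).

    Both values are rounded to one decimal place.
    """
    absolute = looped - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 1), round(relative, 1)


if __name__ == "__main__":
    for name, (baseline, looped) in REPORTED_SCORES.items():
        absolute, relative = score_gains(baseline, looped)
        print(f"{name}: +{absolute} points ({relative}% over baseline)")
```

By this arithmetic, the gain is 21.4 points on SWE-bench Verified (about 49.8% over the baseline) and 17.0 points on Multi-SWE (about 121.4% over the baseline).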
