SciCode

About This Benchmark

SciCode is a benchmark of research-level scientific coding problems in math, physics, chemistry, and biology. It evaluates code generation at the level of real scientific research. The score is reported as accuracy (%), where higher is better.
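As a rough illustration of how an accuracy percentage can be computed, the sketch below assumes each problem yields a single pass/fail outcome. The function name `accuracy_percent` and the sample outcomes are hypothetical. The description above does not say which unit is scored or how a pass is decided, so this shows only the arithmetic and not the benchmark's actual grading harness.

```python
def accuracy_percent(results):
    """Return the share of passed problems as a percentage.

    `results` is a sequence of booleans, one per problem (assumed unit),
    where True means the generated code passed that problem's checks.
    """
    if not results:
        raise ValueError("results must contain at least one problem")
    passed = sum(1 for ok in results if ok)
    return 100.0 * passed / len(results)


# Hypothetical outcomes for four problems: three passed, one failed.
example_results = [True, False, True, True]
print(f"{accuracy_percent(example_results):.1f}%")  # prints "75.0%"
```

Under this assumption, a model that passes 3 of 4 problems scores 75.0%.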

Source: Artificial Analysis