New Benchmark Challenges AI Capability to Use Tools | aib vote