New Evaluation Framework Exposes Hidden Autonomous Agent Risks | aib vote