Optimizing AI Reasoning With Verifiable Reinforcement Learning | aib vote