Benchmarking Reliability: Testing LLM Consistency in Structured Output | aib vote