On the Power of (Approximate) Reward Models for Inference-Time Scaling