Predicting and improving test-time scaling laws via reward tail-guided search