Debiasing Reward Models by Representation Learning with Guarantees