E-Scores for (In)Correctness Assessment of Generative Model Outputs