On-Policy Robot Imitation Learning from a Converging Supervisor