Learning Personalized Ad Impact via Contextual Reinforcement Learning under Delayed Rewards