On Instrumental Variable Regression for Deep Offline Policy Evaluation

Yutian Chen, Liyuan Xu, Caglar Gulcehre, Tom Le Paine, Arthur Gretton

paper | 2021-05-21 | English

machine learning | finance | arxiv

Description

We show that the popular reinforcement learning (RL) strategy of estimating the state-action value (Q-function) by minimizing the mean squared Bellman error leads to a regression problem with confounding, the inputs and output noise being correlated. Hence, direct minimization of the Bellman error can result in significantly biased Q-function estimates. We explain why fixing the target Q-network in Deep Q-Networks and Fitted Q Evaluation provides a way of overcoming this confounding, thus sheddi...
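The confounding described above can be illustrated with a toy sketch. It is not the paper's method or experimental setup: it assumes a hypothetical two-state Markov reward process, state values instead of Q-values, one-hot features, and a plain linear instrumental variable estimator that uses the current-state features as the instrument (this coincides with the classical LSTD solution). All names (`simulate`, `w_ols`, `w_iv`) are illustrative.

```python
import numpy as np

def simulate(n, seed=0):
    """Sample n transitions from a hypothetical 2-state Markov reward process.

    Assumptions for illustration only: the current and next state are both
    uniform over {0, 1}, and the reward is 1 in state 0 and 0 in state 1.
    """
    rng = np.random.default_rng(seed)
    s = rng.integers(0, 2, size=n)
    s_next = rng.integers(0, 2, size=n)
    phi = np.eye(2)[s]            # one-hot features of the current state
    phi_next = np.eye(2)[s_next]  # one-hot features of the next state
    r = (s == 0).astype(float)
    return phi, phi_next, r

gamma = 0.9
phi, phi_next, r = simulate(50_000)

# Bellman equation written as a regression: r = (phi - gamma * phi_next) @ w + noise.
# The noise depends on the sampled next state, so it is correlated with the input x.
x = phi - gamma * phi_next

# Minimizing the mean squared Bellman error is ordinary least squares of r on x.
w_ols = np.linalg.lstsq(x, r, rcond=None)[0]

# Linear IV estimate with instrument phi: solve (phi^T x) w = phi^T r.
w_iv = np.linalg.solve(phi.T @ x, phi.T @ r)

# Ground truth from the known model: v = (I - gamma * P)^{-1} r, which is [5.5, 4.5].
P = np.full((2, 2), 0.5)
v_true = np.linalg.solve(np.eye(2) - gamma * P, np.array([1.0, 0.0]))

print("true values:", v_true)
print("OLS (Bellman residual) estimate:", w_ols)
print("IV estimate:", w_iv)
```

In this sketch the least-squares estimate does not approach the true values as the sample grows, while the instrumented estimate does, because the current-state features are uncorrelated with the noise introduced by the random next state.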

Similar Books

Quantitative mode stability for the wave equation on the Kerr-Newman spacetime

Risk-Aware Objective-Based Forecasting in Inertia Management

Haipeng Zhang, Ran Li, Yan Chen, Zhongda Chu, Mingyang Sun

Chainalysis: Geography of Cryptocurrency 2023

Periodicity in Cryptocurrency Volatility and Liquidity

Peter Reinhard Hansen, Chan Kim, Wade Kimbrough

Impact of Geometric Uncertainty on the Computation of Abdominal Aortic Aneurysm Wall Strain

Saeideh Sekhavat, Mostafa Jamshidian, Adam Wittek, Karol Miller

Simulation-based Bayesian inference with ameliorative learned summary statistics -- Part I

Getachew K. Befekadu
