Sampling-based Exploration for Reinforcement Learning of Dexterous Manipulation

Gagan Khandate

Columbia University

Siqi Shang

Columbia University

Eric T Chang

Columbia University

Tristan L Saidi

Columbia University

Johnson Adams

Columbia University

Matei Ciocarlie

Columbia University

Paper ID 20

Session 3. Self-supervision and RL for Manipulation

Poster Session Tuesday, July 11

Poster 20

Abstract: In this paper, we present a novel method for achieving dexterous manipulation of complex objects while simultaneously securing the object without the use of passive support surfaces. We posit that a key difficulty in training such policies in a Reinforcement Learning framework is exploring the problem state space, as the accessible regions of this space form a complex structure along manifolds of a high-dimensional space. To address this challenge, we use two versions of the non-holonomic Rapidly-Exploring Random Trees algorithm: one version is more general but requires explicit use of the environment’s transition function, while the second version uses manipulation-specific kinematic constraints to attain better sample efficiency. In both cases, we use states found via sampling-based exploration to generate reset distributions that enable training control policies under full dynamic constraints via model-free Reinforcement Learning. We show that these policies are effective at manipulation problems of higher difficulty than previously shown, and that they also transfer effectively to real robots.
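
To illustrate the general idea in the abstract (tree-based exploration through a transition function, followed by drawing reset states from the explored set), here is a minimal toy sketch. It is not the authors' implementation: the 2D point "environment", the `step` function, and all names and parameters (`explore`, `sample_reset`, `n_actions`, `dt`) are hypothetical and chosen only for illustration. The real problem involves a high-dimensional hand-object state and a physics simulator.

```python
import math
import random


def step(state, action, dt=0.1):
    """Hypothetical transition function: a point moving inside the unit square."""
    x, y = state
    nx = min(1.0, max(0.0, x + dt * action[0]))
    ny = min(1.0, max(0.0, y + dt * action[1]))
    return (nx, ny)


def explore(start, n_iters=500, n_actions=8, seed=0):
    """Grow an RRT-style tree by rolling random actions through `step`."""
    rng = random.Random(seed)
    nodes = [start]      # explored states
    parents = {0: None}  # child index -> parent index
    for _ in range(n_iters):
        # Sample a random target in the state space.
        target = (rng.random(), rng.random())
        # Find the tree node closest to the target.
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], target))
        # Try several random actions from that node and keep the
        # successor that ends up closest to the target.
        best = None
        for _ in range(n_actions):
            action = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
            candidate = step(nodes[i_near], action)
            if best is None or math.dist(candidate, target) < math.dist(best, target):
                best = candidate
        parents[len(nodes)] = i_near
        nodes.append(best)
    return nodes, parents


def sample_reset(nodes, rng):
    """Draw an episode start state uniformly from the explored states."""
    return rng.choice(nodes)


if __name__ == "__main__":
    nodes, parents = explore((0.5, 0.5))
    rng = random.Random(1)
    print("explored states:", len(nodes))
    print("example reset state:", sample_reset(nodes, rng))
```

In a training loop, a model-free RL algorithm would call something like `sample_reset` at the start of each episode instead of always starting from a fixed initial state, so that the policy experiences states spread across the explored region. Uniform sampling over tree nodes is an assumption of this sketch; the abstract does not specify how the reset distribution is constructed.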