Adaptive Procedural Task Generation for Hard-Exploration Problems


We introduce Adaptive Procedural Task Generation (APT-Gen), an approach for progressively generating a sequence of tasks as curricula to facilitate reinforcement learning in hard-exploration problems. At the heart of our approach, a task generator learns to create tasks via a black-box procedural generation module by adaptively sampling from the parameterized task space. To enable curriculum learning in the absence of a direct indicator of learning progress, the task generator is trained by balancing the agent’s expected return in the generated tasks and their similarities to the target task. Through adversarial training, the similarity between the generated tasks and the target task is adaptively estimated by a task discriminator defined on the agent’s behaviors. In this way, our approach can efficiently generate tasks of rich variations for target tasks of unknown parameterization or not covered by the predefined task space. Experiments demonstrate the effectiveness of our approach through quantitative and qualitative analysis in various scenarios.