Publications
For the up-to-date publication list, please visit the Google Scholar page.
* Equal contribution. † Equal advising.
2024

Few-View Object Reconstruction with Unknown Categories and Camera Poses
International Conference on 3D Vision (3DV), March 2024
Oral Presentation
2023

Deep Imitation Learning for Humanoid Loco-manipulation through Human Teleoperation
International Conference on Humanoid Robots (Humanoids), December 2023
Oral Presentation

LIBERO: Benchmarking Knowledge Transfer in Lifelong Robot Learning
NeurIPS 2023 Datasets and Benchmarks Track, December 2023

Cross-Episodic Curriculum for Transformer Agents
Conference on Neural Information Processing Systems (NeurIPS), December 2023

Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), December 2023

LOTUS: Continual Imitation Learning for Robot Manipulation Through Unsupervised Skill Discovery
Technical Report arXiv:2311.02058, November 2023

Learning Generalizable Manipulation Policies with Object-Centric 3D Representations
Conference on Robot Learning (CoRL), November 2023

MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations
Conference on Robot Learning (CoRL), November 2023

MUTEX: Learning Unified Policies from Multimodal Task Specifications
Conference on Robot Learning (CoRL), November 2023

MimicPlay: Long-Horizon Imitation Learning by Watching Human Play
Conference on Robot Learning (CoRL), November 2023
Best Paper Award Finalist

Interactive Robot Learning from Verbal Correction
Technical Report arXiv:2310.17555, October 2023

Model-Based Runtime Monitoring with Interactive Imitation Learning
Technical Report arXiv:2310.17552, October 2023

Eureka: Human-Level Reward Design via Coding Large Language Models
Technical Report arXiv:2310.12931, October 2023

AMAGO: Scalable In-Context Reinforcement Learning for Adaptive Agents
Technical Report arXiv:2310.09971, October 2023

Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Technical Report arXiv:2310.08864, October 2023

Symbolic State Space Optimization for Long Horizon Mobile Manipulation Planning
International Conference on Intelligent Robots and Systems (IROS), October 2023

Doduo: Dense Visual Correspondence from Unsupervised Semantic-Aware Flow
Technical Report arXiv:2309.15110, September 2023

ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation
International Journal of Robotics Research (IJRR), July 2023

VIMA: General Robot Manipulation with Multimodal Prompts
International Conference on Machine Learning (ICML), July 2023

Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment
Robotics: Science and Systems (RSS), July 2023
Best Paper Award Finalist

Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2023

Ditto in the House: Building Articulated Models of Indoor Scenes through Interactive Perception
IEEE International Conference on Robotics and Automation (ICRA), May 2023

Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments
IEEE International Conference on Robotics and Automation (ICRA), May 2023

Voyager: An Open-Ended Embodied Agent with Large Language Models
Technical Report arXiv:2305.16291, May 2023
2022

Learning and Retrieval from Prior Data for Skill-based Imitation Learning
Conference on Robot Learning (CoRL), December 2022

VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors
Conference on Robot Learning (CoRL), December 2022

MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
NeurIPS 2022 Datasets and Benchmarks Track, November 2022
Outstanding Paper Award

Pre-Trained Language Models for Interactive Decision-Making
Conference on Neural Information Processing Systems (NeurIPS), November 2022
Oral Presentation

Causal Dynamics Learning for Task-Independent State Abstraction
International Conference on Machine Learning (ICML), July 2022
Long Presentation

ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation
Robotics: Science and Systems (RSS), June 2022
Best Student Paper Award Finalist

COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2022

Ditto: Building Digital Twins of Articulated Objects from Interaction
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2022
Oral Presentation

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2022
Oral Presentation

Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks
IEEE International Conference on Robotics and Automation (ICRA), May 2022
Outstanding Learning Paper Award

OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation
IEEE International Conference on Robotics and Automation (ICRA), May 2022

Visually Grounded Task and Motion Planning for Mobile Manipulation
IEEE International Conference on Robotics and Automation (ICRA), May 2022

RelViT: Concept-Guided Vision Transformer for Visual Relational Reasoning
International Conference on Learning Representations (ICLR), April 2022

Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation
IEEE Robotics and Automation Letters (RA-L), January 2022
2021

Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization
Conference on Robot Learning (CoRL), November 2021

What Matters in Learning from Offline Human Demonstrations for Robot Manipulation
Conference on Robot Learning (CoRL), November 2021
Oral Presentation

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision
International Conference on Computer Vision (ICCV), October 2021

Learning Generalizable Skills via Automated Generation of Diverse Tasks
Robotics: Science and Systems (RSS), July 2021

Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations
Robotics: Science and Systems (RSS), July 2021

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
International Conference on Machine Learning (ICML), July 2021

Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition
International Conference on Machine Learning (ICML), July 2021
Long Talk

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
International Conference on Machine Learning (ICML), July 2021

MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
NeurIPS 2021 Datasets and Benchmarks Track, July 2021

Fast Uncertainty Quantification for Deep Object Pose Estimation
IEEE International Conference on Robotics and Automation (ICRA), May 2021

Hierarchical Planning for Long-Horizon Manipulation with Geometric and Symbolic Scene Graphs
IEEE International Conference on Robotics and Automation (ICRA), May 2021

Deep Affordance Foresight: Planning Through What Can Be Done in the Future
IEEE International Conference on Robotics and Automation (ICRA), May 2021

Detect, Reject, Correct: Crossmodal Compensation of Corrupted Sensors
IEEE International Conference on Robotics and Automation (ICRA), May 2021

Emergent Hand Morphology and Control from Optimizing Robust Grasps of Diverse Objects
IEEE International Conference on Robotics and Automation (ICRA), May 2021

Learning Multi-Arm Manipulation Through Collaborative Teleoperation
IEEE International Conference on Robotics and Automation (ICRA), May 2021
Best Multi-Robotic Systems Paper Award Finalist

Adaptive Procedural Task Generation for Hard-Exploration Problems
International Conference on Learning Representations (ICLR), May 2021
2020

Human-in-the-Loop Imitation Learning using Remote Teleoperation
Technical Report arXiv:2012.06733, December 2020

Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning
Conference on Neural Information Processing Systems (NeurIPS), December 2020
Spotlight Presentation

Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion
Conference on Robot Learning (CoRL), November 2020

robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
Technical Report arXiv:2009.12293, September 2020

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition
European Conference on Computer Vision (ECCV), August 2020

OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation
Conference on Uncertainty in Artificial Intelligence (UAI), August 2020

DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs
International Joint Conference on Artificial Intelligence (IJCAI), July 2020

KETO: Learning Keypoint Representations for Tool Manipulation
IEEE International Conference on Robotics and Automation (ICRA), May 2020

6-PACK: Category-Level 6D Pose Tracker with Anchor-Based Keypoints
IEEE International Conference on Robotics and Automation (ICRA), May 2020

Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks
IEEE Transactions on Robotics (T-RO), March 2020