Publications
For the up-to-date publication list, please visit the Google Scholar page.
* Equal contribution. † Equal advising.
2024
AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers
Conference on Neural Information Processing Systems (NeurIPS), December 2024
Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions
Conference on Robot Learning (CoRL), November 2024
OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation
Conference on Robot Learning (CoRL), November 2024
Oral Presentation
Multi-Task Interactive Robot Fleet Learning with Visual World Models
Conference on Robot Learning (CoRL), November 2024
DexMimicGen: Automated Data Generation for Bimanual Dexterous Manipulation via Imitation Learning
Technical report arXiv:2410.24185, October 2024
HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots
Technical report arXiv:2410.21229, October 2024
Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment
International Journal of Robotics Research (IJRR), October 2024
ARDuP: Active Region Video Diffusion for Universal Policies
International Conference on Intelligent Robots and Systems (IROS), October 2024
BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation
Technical report arXiv:2410.06237, October 2024
PRIME: Scaffolding Manipulation Tasks with Behavior Primitives for Data-Efficient Imitation Learning
IEEE Robotics and Automation Letters (RA-L), October 2024
Foundation Models in Robotics: Applications, Challenges, and the Future
International Journal of Robotics Research (IJRR), September 2024
PRESTO: Fast Motion Planning Using Diffusion Models Based on Key-Configuration Environment Representation
Technical report arXiv:2409.16012, September 2024
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
Technical report arXiv:2408.10188, August 2024
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
International Conference on Machine Learning (ICML), July 2024
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Robotics: Science and Systems (RSS), July 2024
InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning
Robotics: Science and Systems (RSS), July 2024
DrEureka: Language Model Guided Sim-To-Real Transfer
Robotics: Science and Systems (RSS), July 2024
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
Robotics: Science and Systems (RSS), July 2024
ORION: Vision-based Manipulation from Single Human Video with Open-World Object Graphs
Technical report arXiv:2405.20321, May 2024
Doduo: Dense Visual Correspondence from Unsupervised Semantic-Aware Flow
IEEE International Conference on Robotics and Automation (ICRA), May 2024
Model-Based Runtime Monitoring with Interactive Imitation Learning
IEEE International Conference on Robotics and Automation (ICRA), May 2024
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
IEEE International Conference on Robotics and Automation (ICRA), May 2024
Best Conference Paper Award
LOTUS: Continual Imitation Learning for Robot Manipulation Through Unsupervised Skill Discovery
IEEE International Conference on Robotics and Automation (ICRA), May 2024
AMAGO: Scalable In-Context Reinforcement Learning for Adaptive Agents
International Conference on Learning Representations (ICLR), May 2024
Spotlight Presentation
Eureka: Human-Level Reward Design via Coding Large Language Models
International Conference on Learning Representations (ICLR), May 2024
Few-View Object Reconstruction with Unknown Categories and Camera Poses
International Conference on 3D Vision (3DV), March 2024
Oral Presentation
Granger Causal Interaction Skill Chains
Transactions on Machine Learning Research (TMLR), March 2024
Voyager: An Open-Ended Embodied Agent with Large Language Models
Transactions on Machine Learning Research (TMLR), March 2024
Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning
AAAI Conference on Artificial Intelligence (AAAI), February 2024
Oral Presentation
2023
Deep Imitation Learning for Humanoid Loco-manipulation through Human Teleoperation
International Conference on Humanoid Robots (Humanoids), December 2023
Oral Presentation
LIBERO: Benchmarking Knowledge Transfer in Lifelong Robot Learning
NeurIPS 2023 Datasets and Benchmarks Track, December 2023
Cross-Episodic Curriculum for Transformer Agents
Conference on Neural Information Processing Systems (NeurIPS), December 2023
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), December 2023
Learning Generalizable Manipulation Policies with Object-Centric 3D Representations
Conference on Robot Learning (CoRL), November 2023
MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations
Conference on Robot Learning (CoRL), November 2023
MUTEX: Learning Unified Policies from Multimodal Task Specifications
Conference on Robot Learning (CoRL), November 2023
MimicPlay: Long-Horizon Imitation Learning by Watching Human Play
Conference on Robot Learning (CoRL), November 2023
Best Paper Award Finalist
Interactive Robot Learning from Verbal Correction
CoRL Workshop on Language and Robot Learning (LangRob), November 2023
Symbolic State Space Optimization for Long Horizon Mobile Manipulation Planning
International Conference on Intelligent Robots and Systems (IROS), October 2023
ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation
International Journal of Robotics Research (IJRR), July 2023
VIMA: General Robot Manipulation with Multimodal Prompts
International Conference on Machine Learning (ICML), July 2023
Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment
Robotics: Science and Systems (RSS), July 2023
Best Paper Award Finalist
Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2023
Ditto in the House: Building Articulated Models of Indoor Scenes through Interactive Perception
IEEE International Conference on Robotics and Automation (ICRA), May 2023
Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments
IEEE International Conference on Robotics and Automation (ICRA), May 2023
2022
Learning and Retrieval from Prior Data for Skill-based Imitation Learning
Conference on Robot Learning (CoRL), December 2022
VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors
Conference on Robot Learning (CoRL), December 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
NeurIPS 2022 Datasets and Benchmarks Track, November 2022
Outstanding Paper Award
Pre-Trained Language Models for Interactive Decision-Making
Conference on Neural Information Processing Systems (NeurIPS), November 2022
Oral Presentation
Causal Dynamics Learning for Task-Independent State Abstraction
International Conference on Machine Learning (ICML), July 2022
Long Presentation
ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation
Robotics: Science and Systems (RSS), June 2022
Best Student Paper Award Finalist
COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2022
Ditto: Building Digital Twins of Articulated Objects from Interaction
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2022
Oral Presentation
Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2022
Oral Presentation
Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks
IEEE International Conference on Robotics and Automation (ICRA), May 2022
Outstanding Learning Paper Award
OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation
IEEE International Conference on Robotics and Automation (ICRA), May 2022
Visually Grounded Task and Motion Planning for Mobile Manipulation
IEEE International Conference on Robotics and Automation (ICRA), May 2022
RelViT: Concept-Guided Vision Transformer for Visual Relational Reasoning
International Conference on Learning Representations (ICLR), April 2022
Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation
IEEE Robotics and Automation Letters (RA-L), January 2022
2021
Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization
Conference on Robot Learning (CoRL), November 2021
What Matters in Learning from Offline Human Demonstrations for Robot Manipulation
Conference on Robot Learning (CoRL), November 2021
Oral Presentation
DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision
International Conference on Computer Vision (ICCV), October 2021
Discovering Generalizable Skills via Automated Generation of Diverse Tasks
Robotics: Science and Systems (RSS), July 2021
Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations
Robotics: Science and Systems (RSS), July 2021
SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
International Conference on Machine Learning (ICML), July 2021
Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition
International Conference on Machine Learning (ICML), July 2021
Long Talk
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
International Conference on Machine Learning (ICML), July 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
NeurIPS 2021 Datasets and Benchmarks Track, July 2021
Fast Uncertainty Quantification for Deep Object Pose Estimation
IEEE International Conference on Robotics and Automation (ICRA), May 2021
Hierarchical Planning for Long-Horizon Manipulation with Geometric and Symbolic Scene Graphs
IEEE International Conference on Robotics and Automation (ICRA), May 2021
Deep Affordance Foresight: Planning Through What Can Be Done in the Future
IEEE International Conference on Robotics and Automation (ICRA), May 2021
Detect, Reject, Correct: Crossmodal Compensation of Corrupted Sensors
IEEE International Conference on Robotics and Automation (ICRA), May 2021
Emergent Hand Morphology and Control from Optimizing Robust Grasps of Diverse Objects
IEEE International Conference on Robotics and Automation (ICRA), May 2021
Learning Multi-Arm Manipulation Through Collaborative Teleoperation
IEEE International Conference on Robotics and Automation (ICRA), May 2021
Best Multi-Robotic Systems Paper Award Finalist
Adaptive Procedural Task Generation for Hard-Exploration Problems
International Conference on Learning Representations (ICLR), May 2021
2020
Human-in-the-Loop Imitation Learning using Remote Teleoperation
Technical report arXiv:2012.06733, December 2020
Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning
Conference on Neural Information Processing Systems (NeurIPS), December 2020
Spotlight Presentation
Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion
Conference on Robot Learning (CoRL), November 2020
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
Technical report arXiv:2009.12293, September 2020
RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition
European Conference on Computer Vision (ECCV), August 2020
OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation
Conference on Uncertainty in Artificial Intelligence (UAI), August 2020
DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs
International Joint Conference on Artificial Intelligence (IJCAI), July 2020
KETO: Learning Keypoint Representations for Tool Manipulation
IEEE International Conference on Robotics and Automation (ICRA), May 2020
6-PACK: Category-Level 6D Pose Tracker with Anchor-Based Keypoints
IEEE International Conference on Robotics and Automation (ICRA), May 2020
Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks
IEEE Transactions on Robotics (T-RO), March 2020