Publications
For the up-to-date publication list, please visit the Google Scholar page.
* Equal contribution. † Equal advising.
2024
![](./images/huang-iros24-ardup.jpg)
ARDuP: Active Region Video Diffusion for Universal Policies
International Conference on Intelligent Robots and Systems (IROS), October 2024
![](./images/nasiriany-icml24-pivot.png)
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
International Conference on Machine Learning (ICML), July 2024
![](./images/ma-rss24-dreureka.png)
DrEureka: Language Model Guided Sim-To-Real Transfer
Robotics: Science and Systems (RSS), July 2024
![](./images/han-rss24-InterPreT.png)
InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning
Robotics: Science and Systems (RSS), July 2024
![](./images/droid-rss24-droid.png)
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Robotics: Science and Systems (RSS), July 2024
![](./images/nasiriany-rss24-robocasa.jpg)
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
Robotics: Science and Systems (RSS), July 2024
![](./images/zhu-arxiv24-orion.jpg)
ORION: Vision-based Manipulation from Single Human Video with Open-World Object Graphs
Technical report arXiv:2405.20321, May 2024
![](./images/rtx-arxiv23-rtx.png)
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
IEEE International Conference on Robotics and Automation (ICRA), May 2024
Best Conference Paper Award
![](./images/liu-arxiv23-siriusrm.png)
Model-Based Runtime Monitoring with Interactive Imitation Learning
IEEE International Conference on Robotics and Automation (ICRA), May 2024
![](./images/zhenyu-doduo.png)
Doduo: Dense Visual Correspondence from Unsupervised Semantic-Aware Flow
IEEE International Conference on Robotics and Automation (ICRA), May 2024
![](./images/wan-arxiv23-lotus.png)
LOTUS: Continual Imitation Learning for Robot Manipulation Through Unsupervised Skill Discovery
IEEE International Conference on Robotics and Automation (ICRA), May 2024
![](./images/ma-arxiv23-eureka.png)
Eureka: Human-Level Reward Design via Coding Large Language Models
International Conference on Learning Representations (ICLR), May 2024
![](./images/grigsby-iclr23-amago.jpg)
AMAGO: Scalable In-Context Reinforcement Learning for Adaptive Agents
International Conference on Learning Representations (ICLR), May 2024
Spotlight Presentation
![](./images/jiang-arxiv22-forge.png)
Few-View Object Reconstruction with Unknown Categories and Camera Poses
International Conference on 3D Vision (3DV), March 2024
Oral Presentation
![](./images/chuck-tmlr24-granger.png)
Granger Causal Interaction Skill Chains
Transactions on Machine Learning Research (TMLR), March 2024
![](./images/wang-tmlr24-voyager.png)
Voyager: An Open-Ended Embodied Agent with Large Language Models
Transactions on Machine Learning Research (TMLR), March 2024
![](./images/gao-arxiv24-prime.png)
PRIME: Scaffolding Manipulation Tasks with Behavior Primitives for Data-Efficient Imitation Learning
Technical report arXiv:2403.00929, March 2024
![](./images/wang-aaai24-building.png)
Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning
AAAI Conference on Artificial Intelligence (AAAI), February 2024
Oral Presentation
2023
![](./images/firoozi-arxiv23-fomo.png)
Foundation Models in Robotics: Applications, Challenges, and the Future
Technical report arXiv:2312.07843, December 2023
![](./images/seo-arxiv23-trill.png)
Deep Imitation Learning for Humanoid Loco-manipulation through Human Teleoperation
International Conference on Humanoid Robots (Humanoids), December 2023
Oral Presentation
![](./images/shi-neurips23-cec.png)
Cross-Episodic Curriculum for Transformer Agents
Conference on Neural Information Processing Systems (NeurIPS), December 2023
![](./images/liu-neurips23-libero.png)
LIBERO: Benchmarking Knowledge Transfer in Lifelong Robot Learning
NeurIPS 2023 Datasets and Benchmarks Track, December 2023
![](./images/yang-emnlp23-revilm.png)
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), December 2023
![](./images/zhu-corl-groot.png)
Learning Generalizable Manipulation Policies with Object-Centric 3D Representations
Conference on Robot Learning (CoRL), November 2023
![](./images/wang-corl23-mimicplay.png)
MimicPlay: Long-Horizon Imitation Learning by Watching Human Play
Conference on Robot Learning (CoRL), November 2023
Best Paper Award Finalist
![](./images/shah-corl23-mutex.jpg)
MUTEX: Learning Unified Policies from Multimodal Task Specifications
Conference on Robot Learning (CoRL), November 2023
![](./images/mandlekar-corl23-mimicgen.png)
MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations
Conference on Robot Learning (CoRL), November 2023
![](./images/liu-arxiv23-olaf.png)
Interactive Robot Learning from Verbal Correction
Technical report arXiv:2310.17555, October 2023
![](./images/zhang-iros23-s3o.png)
Symbolic State Space Optimization for Long Horizon Mobile Manipulation Planning
International Conference on Intelligent Robots and Systems (IROS), October 2023
![](./images/shen-rss22-acid.png)
ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation
International Journal of Robotics Research (IJRR), July 2023
![](./images/jiang-icml23-vima.png)
VIMA: General Robot Manipulation with Multimodal Prompts
International Conference on Machine Learning (ICML), July 2023
![](./images/liu-rss23-sirius.png)
Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment
Robotics: Science and Systems (RSS), July 2023
Best Paper Award Finalist
![](./images/dong-cvpr-monosparse.png)
Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2023
![](./images/hsu-icra23-dittohouse.jpg)
Ditto in the House: Building Articulated Models of Indoor Scenes through Interactive Perception
IEEE International Conference on Robotics and Automation (ICRA), May 2023
![](./images/seo-arxiv22-prelude.png)
Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments
IEEE International Conference on Robotics and Automation (ICRA), May 2023
2022
![](./images/nasiriany-corl22-sailor.png)
Learning and Retrieval from Prior Data for Skill-based Imitation Learning
Conference on Robot Learning (CoRL), December 2022
![](./images/zhu-corl22-viola.png)
VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors
Conference on Robot Learning (CoRL), December 2022
![](./images/fan-neurips22-minedojo.jpg)
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
NeurIPS 2022 Datasets and Benchmarks Track, November 2022
Outstanding Paper Award
![](./images/li-neurips22-ptlm.jpg)
Pre-Trained Language Models for Interactive Decision-Making
Conference on Neural Information Processing Systems (NeurIPS), November 2022
Oral Presentation
![](./images/wang-icml22-cdl.png)
Causal Dynamics Learning for Task-Independent State Abstraction
International Conference on Machine Learning (ICML), July 2022
Long Presentation
![](./images/shen-rss22-acid.png)
ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation
Robotics: Science and Systems (RSS), June 2022
Best Student Paper Award Finalist
![](./images/jiang-cvpr22-ditto.jpg)
Ditto: Building Digital Twins of Articulated Objects from Interaction
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2022
Oral Presentation
![](./images/cui-cvpr22-coopernaut.jpg)
COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2022
![](./images/jiang-cvpr22-bongard-hoi.png)
Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2022
Oral Presentation
![](./images/nasiriany-icra22-maple.png)
Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks
IEEE International Conference on Robotics and Automation (ICRA), May 2022
Outstanding Learning Paper Award
![](./images/wong-icra22-oscar.png)
OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation
IEEE International Conference on Robotics and Automation (ICRA), May 2022
![](./images/zhang-icra22-visually.png)
Visually Grounded Task and Motion Planning for Mobile Manipulation
IEEE International Conference on Robotics and Automation (ICRA), May 2022
![](./images/ma-iclr2022-relvit.png)
RelViT: Concept-Guided Vision Transformer for Visual Relational Reasoning
International Conference on Learning Representations (ICLR), April 2022
![](./images/zhu-ral22-buds.png)
Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation
IEEE Robotics and Automation Letters (RA-L), January 2022
2021
![](./images/lee-corl21-chaining.png)
Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization
Conference on Robot Learning (CoRL), November 2021
![](./images/mandlekar-corl21-offline.png)
What Matters in Learning from Offline Human Demonstrations for Robot Manipulation
Conference on Robot Learning (CoRL), November 2021
Oral Presentation
![](./images/lan-iccv21-discobox.png)
DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision
International Conference on Computer Vision (ICCV), October 2021
![](./images/fang-rss21-slide.png)
Learning Generalizable Skills via Automated Generation of Diverse Tasks
Robotics: Science and Systems (RSS), July 2021
![](./images/jiang-rss21-giga.png)
Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations
Robotics: Science and Systems (RSS), July 2021
![](./images/fan-icml21-secant.png)
SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
International Conference on Machine Learning (ICML), July 2021
![](./images/liu-icml21-copa.png)
Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition
International Conference on Machine Learning (ICML), July 2021
Long Talk
![](./images/mahajan-icml21-tesseract.png)
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
International Conference on Machine Learning (ICML), July 2021
![](./images/liang-neurips2021-multibench.png)
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
NeurIPS 2021 Datasets and Benchmarks Track, July 2021
![](./images/shi-icra21-uncertainty.png)
Fast Uncertainty Quantification for Deep Object Pose Estimation
IEEE International Conference on Robotics and Automation (ICRA), May 2021
![](./images/zhu-icra21-scene-graph.png)
Hierarchical Planning for Long-Horizon Manipulation with Geometric and Symbolic Scene Graphs
IEEE International Conference on Robotics and Automation (ICRA), May 2021
![](./images/xu-icra21-affordance.png)
Deep Affordance Foresight: Planning Through What Can Be Done in the Future
IEEE International Conference on Robotics and Automation (ICRA), May 2021
![](./images/lee-icra21-crossmodal.png)
Detect, Reject, Correct: Crossmodal Compensation of Corrupted Sensors
IEEE International Conference on Robotics and Automation (ICRA), May 2021
![](./images/pan-icra21-morphology.png)
Emergent Hand Morphology and Control from Optimizing Robust Grasps of Diverse Objects
IEEE International Conference on Robotics and Automation (ICRA), May 2021
![](./images/tung-icra21-mart.png)
Learning Multi-Arm Manipulation Through Collaborative Teleoperation
IEEE International Conference on Robotics and Automation (ICRA), May 2021
Best Multi-Robotic Systems Paper Award Finalist
![](./images/fang-iclr21-apt-gen.png)
Adaptive Procedural Task Generation for Hard-Exploration Problems
International Conference on Learning Representations (ICLR), May 2021
2020
![](./images/mandlekar-arxiv20-hitl.png)
Human-in-the-Loop Imitation Learning using Remote Teleoperation
Technical report arXiv:2012.06733, December 2020
![](./images/nie-neurips20-bongard-logo.png)
Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning
Conference on Neural Information Processing Systems (NeurIPS), December 2020
Spotlight Presentation
![](./images/da-corl20-locomotion.png)
Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion
Conference on Robot Learning (CoRL), November 2020
![](./images/zhu-arxiv20-robosuite.png)
robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
Technical report arXiv:2009.12293, September 2020
![](./images/fan-eccv20-rubiksnet.png)
RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition
European Conference on Computer Vision (ECCV), August 2020
![](./images/ren-uai20-ocean.png)
OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation
Conference on Uncertainty in Artificial Intelligence (UAI), August 2020
![](./images/wang-ijcai20-dual.png)
DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs
International Joint Conference on Artificial Intelligence (IJCAI), July 2020
![](./images/wang-icra20-6pack.png)
6-PACK: Category-Level 6D Pose Tracker with Anchor-Based Keypoints
IEEE International Conference on Robotics and Automation (ICRA), May 2020
![](./images/qin-icra20.png)
KETO: Learning Keypoint Representations for Tool Manipulation
IEEE International Conference on Robotics and Automation (ICRA), May 2020
![](./images/lee-tro20-making.png)
Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks
IEEE Transactions on Robotics (T-RO), March 2020