CoRL 2024 (Conference on Robot Learning) is an annual international conference focused on the intersection of robotics and machine learning. The CoRL 2024 results span many areas, including in-hand tactile sensing, mobile manipulation, locomotion control and learning, navigation, human-robot interaction, and teleoperation.
- In-hand tactile sensing: AI-enabled dexterous manipulation; teleoperation of anthropomorphic robot hands; gravity-invariant in-hand object rotation combined with sim-to-real touch transfer; learning from humans on an open-source dexterous robot hand; soft vision-based fingertip tactile sensing; a versatile teleoperation system for robot manipulation; object perception through in-hand acoustic vibration; a hybrid soft-rigid robot platform that learns generalizable skills from demonstrations; learning fine manipulation through visuo-tactile sensing.
- Mobile manipulation: long-horizon mobile manipulation in dynamic environments; zero-shot, deploy-anywhere manipulation policies; an open-source omnidirectional mobile manipulator for robot learning; real-time, robust 3D mapping, navigation, and semantic segmentation for autonomous mobile robots on edge devices.
- Locomotion control and learning: real-time legged locomotion control via diffusion over offline datasets; humanoid running and jumping; zero-shot safety for quadrupeds; agile jumping on challenging discontinuous terrain.
- Human-robot interaction: a versatile bionic humanoid robot head for immersive human-robot interaction.
- Teleoperation: reach control through immersive augmented reality; a kinematic retargeting algorithm for expressive whole-arm teleoperation; teleoperation with immersive active visual feedback.
This issue collects a selection of papers accepted at CoRL 2024, organized by research direction. Let's take a look!
Humanoid Robots
- OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation, https://arxiv.org/abs/2410.11792
- Humanoid Parkour Learning, https://arxiv.org/abs/2406.10759
- Adapting Humanoid Locomotion over Challenging Terrain via Two-Phase Training, https://openreview.net/attachment?id=O0oK2bVist&name=pdf
Robot Learning and Planning
- Theia: Distilling Diverse Vision Foundation Models for Robot Learning, https://arxiv.org/pdf/2407.20179
- Body Transformer: Leveraging Robot Embodiment for Policy Learning, https://openreview.net/pdf?id=Oce2215aJE
- Gameplay Filters: Robust Zero-Shot Safety through Adversarial Imagination, https://openreview.net/pdf?id=Ke5xrnBFAR
- Learning to Walk from Three Minutes of Real-World Data with Semi-structured Dynamics Models, https://openreview.net/pdf?id=evCXwlCMIi
- Towards Open-World Grasping with Large Vision-Language Models, https://openreview.net/pdf?id=QUzwHYJ9Hf
- Safe Bayesian Optimization for the Control of High-Dimensional Embodied Systems, https://openreview.net/pdf?id=8PcRynpd1m
- LeLaN: Learning A Language-Conditioned Navigation Policy from In-the-Wild Videos, https://openreview.net/pdf?id=zIWu9Kmlqk
- Trajectory Improvement and Reward Learning from Comparative Language Feedback, https://openreview.net/pdf?id=1tCteNSbFH
- Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation, https://openreview.net/forum?id=qUSa3F79am
- Learning Transparent Reward Models via Unsupervised Feature Selection, https://openreview.net/pdf?id=2sg4PY1W9d
- MaIL: Improving Imitation Learning with Selective State Space Models, https://openreview.net/pdf?id=IssXUYvVTg
- Bootstrapping Reinforcement Learning with Imitation for Vision-Based Agile Flight, https://openreview.net/forum?id=bt0PX0e4rE
- Autonomous Improvement of Instruction Following Skills via Foundation Models, https://openreview.net/attachment?id=8Ar8b00GJC&name=pdf
- Robotic Control via Embodied Chain-of-Thought Reasoning, https://openreview.net/pdf?id=S70MgnIA0v
- Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation, https://openreview.net/attachment?id=AuJnXGq3AL&name=pdf
Robot Arms
- DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands, https://arxiv.org/abs/2310.08809
- General Flow as Foundation Affordance for Scalable Robot Learning, Chengbo Yuan, Chuan Wen, Tong Zhang, Yang Gao†, https://general-flow.github.io/
- Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation, Tong Zhang, Yingdong Hu, Jiacheng You, Yang Gao†, https://sgrv2-robot.github.io/
- HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers, Jianke Zhang*, Yanjiang Guo*, Xiaoyu Chen, Yen-Jen Wang, Yucheng Hu, Chengming Shi, Jianyu Chen†, https://arxiv.org/abs/2410.05273
- Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning, Zhecheng Yuan*, Tianming Wei*, Shuiqi Cheng, Gu Zhang, Yuanpei Chen, Huazhe Xu†, https://gemcollector.github.io/maniwhere/
- RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation, Chongkai Gao, Zhengrong Xue, Shuying Deng, Tianhai Liang, Siqi Yang, Lin Shao, Huazhe Xu†, https://riemann-web.github.io/
- ThinkGrasp: A Vision-Language System for Strategic Part Grasping in Clutter, https://arxiv.org/abs/2407.11298
- ALOHA Unleashed: A Simple Recipe for Robot Dexterity, https://aloha-unleashed.github.io/assets/aloha_unleashed.pdf. Bimanual manipulation.
- Mobile ALOHA: Learning Bimanual Mobile Manipulation using Low-Cost Whole-Body Teleoperation, https://openreview.net/forum?id=FO6tePGRZj. Bimanual manipulation.
- RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands, https://openreview.net/attachment?id=4Of4UWyBXE&name=pdf. Bimanual manipulation.
- DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes, https://openreview.net/attachment?id=5W0iZR9J7h&name=pdf
Navigation
- Uncertainty-Aware Decision Transformer for Stochastic Driving Environments, https://arxiv.org/abs/2309.16397
- InstructNav: Zero-shot System for Generic Instruction Navigation in Unexplored Environment, https://arxiv.org/pdf/2406.04882
- Context-Aware Replanning with Pre-explored Semantic Map for Object Navigation, https://openreview.net/attachment?id=Dftu4r5jHe&name=pdf
- Lifelong Autonomous Fine-Tuning of Navigation Foundation Models in the Wild, https://openreview.net/attachment?id=vBj5oC60Lk&name=pdf
Embodied Perception
- VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding, https://arxiv.org/pdf/2410.13860. 3D scene understanding, 3D visual grounding.
- GraspSplats: Efficient Manipulation with 3D Feature Splatting, https://arxiv.org/html/2409.02084
- Transferable Tactile Transformers for Representation Learning Across Diverse Sensors and Tasks, https://arxiv.org/abs/2406.13640
- D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation, https://openreview.net/attachment?id=7E3JAys1xO&name=pdf
- LiDARGrid: Self-supervised 3D Opacity Grid from LiDAR for Scene Forecasting, https://openreview.net/attachment?id=MfuzopqVOX&name=pdf
Autonomous Driving Motion Planning
- DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models, https://arxiv.org/abs/2402.12289. Used for scene description, scene analysis, and hierarchical planning.
- Uncertainty-Aware Decision Transformer for Stochastic Driving Environments, https://arxiv.org/abs/2309.16397. Proposes UNREST, a planning method for stochastic driving environments.
- Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving, https://arxiv.org/pdf/2409.06702
Robot Manipulation
- OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation, https://arxiv.org/abs/2410.11792
- Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own, https://arxiv.org/abs/2310.02635
- General Flow as Foundation Affordance for Scalable Robot Learning, https://arxiv.org/abs/2401.11439
- A Universal Semantic-Geometric Representation for Robotic Manipulation, https://arxiv.org/abs/2306.10474
- Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning, Zhecheng Yuan*, Tianming Wei*, Shuiqi Cheng, Gu Zhang, Yuanpei Chen, Huazhe Xu†, https://arxiv.org/abs/2407.15815, https://gemcollector.github.io/maniwhere/
- GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs, https://arxiv.org/abs/2410.03645
- RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation, https://arxiv.org/abs/2403.19460
- RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model, https://arxiv.org/abs/2406.10157
- Continuously Improving Mobile Manipulation with Autonomous Real-World RL, https://continual-mobile-manip.github.io/resources/paper.pdf
- Implicit Grasp Diffusion: Bridging the Gap between Dense Prediction and Sampling-based Grasping, https://openreview.net/pdf?id=VUhlMfEekm
- An Open-Source Soft Robotic Platform for Autonomous Aerial Manipulation in the Wild, https://openreview.net/pdf?id=SfaB20rjVo
Source: 具身智能之心