Recent advances in aerial robotics have enabled the use of multirotor vehicles for autonomous payload transportation. Resorting only to classical methods to reliably model a quadrotor carrying a cable-slung load poses significant challenges. On the other hand, purely data-driven learning methods do not comply by design with the problem's physical constraints, especially in states that are not densely represented in training data. In this work, we explore the use of physics-informed neural networks to learn an end-to-end model of the multirotor-slung-load system and, at a given time, estimate a sequence of the future system states. An LSTM encoder-decoder with an attention mechanism is used to capture the dynamics of the system. To guarantee the cohesiveness between the multiple predicted states of the system, we propose the use of a physics-based term in the loss function, which includes a discretized physical model derived from first principles together with slack variables that allow for a small mismatch between expected and predicted values. To train the model, a dataset using a real-world quadrotor carrying a slung load was curated and is made available. Prediction results are presented and corroborate the feasibility of the approach. The proposed method outperforms both the first-principles physical model and a comparable neural network model trained without the proposed physics regularization.
近年来,随着无人机遥控技术的进步,多旋翼车辆已被用于自主承载电缆吊重物的运输。仅仅依靠经典方法来可靠地建模四旋翼携带电缆吊重物存在重大挑战。另一方面,完全基于数据驱动的学习方法在设计上并不符合问题固有约束,尤其是在训练数据中没有很好地表示的状态。在本文中,我们探讨了使用受物理学启发的神经网络来学习多旋翼吊重系统端到端模型的应用,并在给定时间估计未来系统状态。为了捕捉系统的动态,我们使用了LSTM编码器-解码器模型,并引入了注意机制来控制多个预测状态之间的连贯性。为了保证系统中多个预测状态的连贯性,我们在损失函数中引入了一个基于物理学的项,包括从基本原理导出的离散化物理模型和允许预期值和预测值之间的小误差的可缩放变量。为了训练模型,我们挑选了一个使用真实世界四旋翼运输电缆吊重物的数据集,并提供了用于训练的数据集。预测结果被呈现,并证实了该方法的有效性。与无物理学 regularization的第一原理物理模型和没有物理学的神经网络模型相比,所提出的方法优越。
https://arxiv.org/abs/2405.09428
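The physics-based loss term described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the states are scalars, `physics_step` stands in for the discretized first-principles model, and a fixed tolerance `slack` replaces the paper's slack variables. It is not the authors' implementation.

```python
def physics_informed_loss(pred, target, physics_step, slack, weight=1.0):
    """Data loss plus a physics-consistency penalty over a predicted sequence.

    pred, target: lists of scalar states over the prediction horizon.
    physics_step: hypothetical discretized model mapping x_k to x_{k+1}.
    slack: tolerated mismatch between the model and consecutive predictions.
    """
    n = len(pred)
    # Mean squared error against the measured states.
    data_term = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    # Mismatch between each predicted state and the model rollout of the previous one.
    residuals = [abs(pred[k + 1] - physics_step(pred[k])) for k in range(n - 1)]
    # Only the part of the mismatch that exceeds the slack is penalized.
    physics_term = sum(max(0.0, r - slack) ** 2 for r in residuals) / max(1, n - 1)
    return data_term + weight * physics_term


# Usage: a decaying state x_{k+1} = 0.5 * x_k (an assumed toy model).
decay = lambda x: 0.5 * x
consistent = physics_informed_loss([1.0, 0.5, 0.25], [1.0, 0.5, 0.25], decay, slack=0.0)
inconsistent = physics_informed_loss([1.0, 1.0], [1.0, 1.0], decay, slack=0.1)
```

In the second call the predictions match the data exactly, yet the loss is nonzero because consecutive states violate the assumed model by more than the slack.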
This research reports VascularPilot3D, the first 3D fully autonomous endovascular robot navigation system. As an exploration toward autonomous guidewire navigation, VascularPilot3D is developed as a complete navigation system based on intra-operative imaging systems (fluoroscopic X-ray in this study) and typical endovascular robots. VascularPilot3D adopts previously researched fast 3D-2D vessel registration algorithms and guidewire segmentation methods as its perception modules. We additionally propose three modules: a topology-constrained 2D-3D instrument end-point lifting method, a tree-based fast path planning algorithm, and a prior-free endovascular navigation strategy. VascularPilot3D is compatible with most mainstream endovascular robots. Ex-vivo experiments validate that VascularPilot3D achieves a 100% success rate across 25 trials. It reduces the human surgeon's overall control loops by 18.38%. VascularPilot3D is promising for general clinical autonomous endovascular navigation.
这项研究报道了VascularPilot3D,这是第一个3D完全自主式内窥镜导航系统。作为自主引导线导航探索,VascularPilot3D是基于内窥镜成像系统(本研究中的荧光X射线)和典型内窥镜机器人开发的完整导航系统。VascularPilot3D采用之前研究过的快速3D-2D血管配准算法和引导线分割方法作为其感知模块。此外,我们还提出了三个模块:基于树的高速路径规划算法、基于约束的2D-3D器械端点提升方法和无需先验的内窥镜导航策略。VascularPilot3D兼容大多数主流内窥镜机器人。实验验证表明,VascularPilot3D在25个试点研究中实现了100%的成功率。它减少了人类外科医生的总操作循环次数 by 18.38%。VascularPilot3D在一般临床自主内窥镜导航方面具有前景。
https://arxiv.org/abs/2405.09375
Current orthopedic robotic systems largely focus on navigation, aiding surgeons in positioning a guiding tube but still requiring manual drilling and screw placement. The automation of this task not only demands high precision and safety due to the intricate physical interactions between the surgical tool and bone but also poses significant risks when executed without adequate human oversight. As it involves continuous physical interaction, the robot should collaborate with the surgeon, understand the human intent, and always include the surgeon in the loop. To achieve this, this paper proposes a new cognitive human-robot collaboration framework, including the intuitive AR-haptic human-robot interface, the visual-attention-based surgeon model, and the shared interaction control scheme for the robot. User studies on a robotic platform for orthopedic surgery are presented to illustrate the performance of the proposed method. The results demonstrate that the proposed human-robot collaboration framework outperforms full robot and full human control in terms of safety and ergonomics.
目前,机器人骨科系统主要关注导航,帮助医生在定位引导管时进行操作,但仍需要手动进行钻孔和螺栓植入。自动化这一任务不仅要求高精度和安全性,是由于手术工具与骨头的复杂物理相互作用所带来的,而且在缺乏充分人类监督的情况下执行也存在重大风险。由于涉及持续的身体交互,机器人应与医生合作,理解人类的意图,并始终将医生纳入循环。为实现这一目标,本文提出了一种新的人机协作框架,包括直观的AR-人机界面、基于视觉注意的医生模型和机器人共享交互控制方案。用户研究在骨科手术机器人平台上展示了所提出方法的有效性。结果表明,与全机器人控制和全人类控制相比,人机协作框架在安全和人机工程方面具有优势。
https://arxiv.org/abs/2405.09359
One goal of dexterous robotic grasping is to allow robots to handle objects with the same level of flexibility and adaptability as humans. However, it remains a challenging task to generate an optimal grasping strategy for dexterous hands, especially when it comes to delicate manipulation and accurately adjusting the desired grasping poses for objects of varying shapes and sizes. In this paper, we propose a novel dexterous grasp generation scheme called GrainGrasp that provides fine-grained contact guidance for each fingertip. In particular, we employ a generative model to predict separate contact maps for each fingertip on the object point cloud, effectively capturing the specifics of finger-object interactions. In addition, we develop a new dexterous grasping optimization algorithm that relies solely on the point cloud as input, eliminating the need for complete mesh information of the object. By leveraging the contact maps of different fingertips, the proposed optimization algorithm can generate precise and determinable strategies for human-like object grasping. Experimental results confirm the efficiency of the proposed scheme. Our code is available at this https URL
灵巧机器人抓取的一个目标是使机器人能够像人类一样处理具有相同程度的灵活性和适应性的物体。然而,为灵巧的手生成最优抓取策略仍然是一个具有挑战性的任务,尤其是在处理形状和大小不等的物体时,更是如此。在本文中,我们提出了一个名为 \textbf{\textit{GrainGrasp}} 的新颖灵巧抓取生成方案,为每个手指提供细粒度的接触指导。 特别是,我们采用生成模型预测物体点云上每个手指的单独接触图,有效捕捉了手指与物体之间互动的特定细节。此外,我们还开发了一种仅依赖点云的灵巧抓取优化算法,消除了需要物体完整网格信息的必要性。通过利用不同手指的接触图,所提出的优化算法可以生成人类式物体抓取的精确和可确定策略。实验结果证实了所提出方案的有效性。我们的代码可以从该链接获取:
https://arxiv.org/abs/2405.09310
In this paper, we present an innovative technique for the path planning of flying robots in a 3D environment in Rough Mereology terms. The main goal was to construct an algorithm that generates mereological potential fields in three-dimensional space. To avoid falling into local minima, we use a weighted Euclidean distance. Moreover, a search for a path from the start point to the target that avoids the obstacles was applied. The environment was created by connecting two cameras working in real time. The Python library OpenCV [1], which recognizes shapes and colors, was responsible for determining the gate and the elements of the world inside the map. The main purpose of this paper is to apply the given results to drones.
在本文中,我们提出了一种创新的方法,用于在 rough melee 环境下对飞行机器人的路径进行规划。主要目标是为 3D 空间中的飞行机器人生成只论域 potential fields。为了避免陷入局部最小值,我们使用加权欧氏距离来协助算法。此外,我们还应用了从起点到目标点的搜索路径,以避免障碍物。环境是由实时连接的两个相机创建的。确定地图内世界的门和元素的是 Python 库 OpenCV [1],它识别形状和颜色。本文的主要目的是将所得到的结果应用于无人机。
https://arxiv.org/abs/2405.09282
Model Predictive Control (MPC)-based trajectory planning has been widely used in robotics, and incorporating Control Barrier Function (CBF) constraints into MPC can greatly improve its obstacle avoidance efficiency. Unfortunately, traditional optimizers are resource-consuming and slow to solve such non-convex constrained optimization problems (COPs), while learning-based methods struggle to satisfy the non-convex constraints. In this paper, we propose the SOMTP algorithm, a self-supervised learning-based optimizer for CBF-MPC trajectory planning. Specifically, SOMTP first employs problem transcription to satisfy most of the constraints. Then the differentiable SLPG correction is proposed to move the solution closer to the safe set; it is then converted into the guide policy for the subsequent training process. After that, inspired by the Augmented Lagrangian Method (ALM), our training algorithm integrated with guide policy constraints is proposed to enable the optimizer network to converge to a feasible solution. Finally, experiments show that the proposed algorithm has better feasibility than other learning-based methods and can provide solutions much faster than traditional optimizers with similar optimality.
基于模型的预测控制(MPC)路径规划在机器人领域得到了广泛应用,并将控制障碍功能(CBF)约束融入MPC可以大大提高其避障效率。然而,传统的优化器在处理非凸约束优化问题(COPs)时资源消耗大、解决速度慢。基于学习的方法也难以满足非凸约束。在本文中,我们提出了SOMTP算法,一种基于自我监督学习的自适应CBF-MPC路径规划优化器。具体来说,SOMTP首先采用问题变换来满足大多数约束。然后,提出了不同可导的SLPG校正来将解决方案更接近安全集,接着在训练过程中将其转换为引导策略。此外,受到增广拉格朗日方法(ALM)的启发,我们提出了一种与引导策略约束相结合的训练算法,使优化器网络能够收敛到可行解。最后,实验证明,与其它学习方法相比,该算法具有更好的可行性,并提供比传统具有相似最优性的优化器更快的解决方案。
https://arxiv.org/abs/2405.09212
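To make the kind of CBF constraint used in CBF-MPC concrete, the sketch below checks the commonly used discrete-time condition h(x_{k+1}) >= (1 - gamma) * h(x_k) along a planned 2D trajectory. The circular obstacle, the barrier function, and the value of `gamma` are assumptions for illustration; the abstract does not specify the constraint form used by SOMTP.

```python
def barrier(point, center, radius):
    """h(x) = squared distance to the obstacle center minus radius squared.

    h(x) >= 0 means the point is outside (or on) the circular obstacle.
    """
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return dx * dx + dy * dy - radius * radius


def satisfies_discrete_cbf(trajectory, center, radius, gamma=0.5):
    """Check h(x_{k+1}) >= (1 - gamma) * h(x_k) for every consecutive pair."""
    values = [barrier(p, center, radius) for p in trajectory]
    return all(h_next >= (1.0 - gamma) * h_cur
               for h_cur, h_next in zip(values, values[1:]))


# Usage: moving away from the obstacle passes; approaching it too fast fails.
moving_away = satisfies_discrete_cbf([(3.0, 0.0), (4.0, 0.0)], (0.0, 0.0), 1.0)
too_fast = satisfies_discrete_cbf([(3.0, 0.0), (1.5, 0.0)], (0.0, 0.0), 1.0)
```

A smaller `gamma` limits how quickly the barrier value may shrink per step, so the trajectory must slow down earlier as it approaches the obstacle. The constraint is non-convex in the position, which is the difficulty the abstract refers to.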
Recent strides in model predictive control (MPC) underscore a dependence on numerical advancements to efficiently and accurately solve large-scale problems. Given the substantial number of variables characterizing typical whole-body optimal control (OC) problems, often numbering in the thousands, exploiting the sparse structure of the numerical problem becomes crucial to meet computational demands, typically in the range of a few milliseconds. A fundamental building block for computing Newton or Sequential Quadratic Programming (SQP) steps in direct optimal control methods involves addressing the linear-quadratic regulator (LQR) problem. This paper concentrates on equality-constrained problems featuring implicit system dynamics and dual regularization, a characteristic found in advanced interior-point or augmented Lagrangian solvers. Here, we introduce a parallel algorithm designed for solving an LQR problem with dual regularization. Leveraging a rewriting of the LQR recursion through block elimination, we first enhanced the efficiency of the serial algorithm, then subsequently generalized it to handle parametric problems. This extension enables us to split decision variables and solve multiple subproblems concurrently. Our algorithm is implemented in our nonlinear numerical optimal control library ALIGATOR. It showcases improved performance over previous serial formulations and we validate its efficacy by deploying it in the model predictive control of a real quadruped robot. This paper follows up from our prior work on augmented Lagrangian methods for numerical optimal control with implicit dynamics and constraints.
近年来,模型预测控制(MPC)的进步表明,要高效准确地解决大规模问题,需要依赖数值改进。对于典型全身最优控制(OC)问题中大量存在的变量,通常有几千个,利用数值问题的稀疏结构变得至关重要,通常需要花费计算资源的毫秒级。计算新牛顿或序贯四元规划(SQP)步的直接最优控制方法的基本构建模块涉及解决线性二次调节器(LQR)问题。本文重点讨论具有隐含系统动力学和支持的等式约束问题,这是高级内部点或增强型拉格朗日求解器中发现的特征。在这里,我们介绍了一种用于求解具有双重 regularization 的 LQR 问题的并行算法。通过通过块消除重写LQR递归,我们首先增强了序列算法的效率,然后随后扩展到处理参数问题。这个扩展使得我们可以同时划分决策变量并解决多个子问题。 我们的算法实现在我们非线性数值最优控制库 ALIGATOR 中。它展示了前述序列形式的改进性能,并通过将该算法应用于实际四足机器人的模型预测控制来验证其有效性。本文接着我们在之前的关于具有隐含动力学和支持的增广拉格朗日方法的研究工作。
https://arxiv.org/abs/2405.09197
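For readers unfamiliar with the LQR recursion mentioned in this abstract, here is a minimal sketch of the textbook serial backward Riccati recursion for a scalar, unconstrained system. It is a baseline illustration only: it is neither the paper's parallel algorithm nor the ALIGATOR API, and it omits constraints and dual regularization. All names are hypothetical.

```python
def scalar_lqr(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for x_{k+1} = a*x_k + b*u_k.

    Stage cost is q*x^2 + r*u^2, terminal cost is q_final*x^2.
    Returns the feedback gains K_0..K_{N-1} (for u_k = -K_k * x_k)
    and the cost-to-go weight at the initial stage.
    """
    p = q_final
    gains = []
    for _ in range(horizon):
        # Optimal gain at this stage given the cost-to-go weight of the next stage.
        k = (b * p * a) / (r + b * p * b)
        # Riccati update of the cost-to-go weight.
        p = q + a * p * a - a * p * b * k
        gains.append(k)
    gains.reverse()  # the recursion runs backward in time
    return gains, p


# Usage: with a = b = q = r = 1 the cost-to-go weight converges to the golden ratio.
gains, p0 = scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=50)
```

Each step depends on the result of the previous one, which is why the recursion is inherently serial and why a parallel reformulation requires restructuring the problem.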
In this article, we focus on the critical tasks of plant protection in arable farms, addressing a modern challenge in agriculture: integrating ecological considerations into the operational strategy of precision weeding robots like BonnBot-I. This article presents the recent advancements in weed management algorithms and the real-world performance of BonnBot-I at the University of Bonn's Klein-Altendorf campus. We present a novel Rolling-view observation model for BonnBot-I's weed monitoring section, which leads to an average absolute weeding performance enhancement of $3.4\%$. Furthermore, for the first time, we show how precision weeding robots could consider bio-diversity-aware concerns in challenging weeding scenarios. We carried out comprehensive weeding experiments in sugar-beet fields, covering both weed-only and mixed crop-weed situations, and introduced a new dataset compatible with precision weeding. Our real-field experiments revealed that our weeding approach is capable of handling diverse weed distributions, with a minimal loss of only $11.66\%$ attributable to intervention planning and $14.7\%$ to vision system limitations, highlighting required improvements of the vision system.
在本文中,我们重点讨论了农田保护任务中的关键任务,解决了农业领域的一个现代挑战:将生态考虑因素整合到像\bbot这样的精确喷雾机器人操作策略中。本文介绍了喷雾管理算法的最新进展以及\bbot\在 Bonn 大学 Klein-Altendorf 校园的实地表现。我们提出了 BonnBot-Is 杂草监测部分的滚动查看观察模型,使得平均绝对喷雾性能提高了 3.4%。此外,我们还展示了精确喷雾机器人如何考虑挑战性喷雾场景中的生物多样性关注。我们在糖菜田进行了全面的喷雾实验,涵盖了只有杂草和混合种植作物的情况,并引入了一个与精确喷雾兼容的新数据集。我们的实地实验表明,我们的喷雾方法能够处理不同的杂草分布,干预计划的损失只有 11.66%,而视觉系统限制引起的损失为 14.7%。
https://arxiv.org/abs/2405.09118
To safely navigate intricate real-world scenarios, autonomous vehicles must be able to adapt to diverse road conditions and anticipate future events. World model (WM) based reinforcement learning (RL) has emerged as a promising approach by learning and predicting the complex dynamics of various environments. Nevertheless, to the best of our knowledge, there does not exist an accessible platform for training and testing such algorithms in sophisticated driving environments. To fill this void, we introduce CarDreamer, the first open-source learning platform designed specifically for developing WM based autonomous driving algorithms. It comprises three key components: 1) World model backbone: CarDreamer has integrated some state-of-the-art WMs, which simplifies the reproduction of RL algorithms. The backbone is decoupled from the rest and communicates using the standard Gym interface, so that users can easily integrate and test their own algorithms. 2) Built-in tasks: CarDreamer offers a comprehensive set of highly configurable driving tasks which are compatible with Gym interfaces and are equipped with empirically optimized reward functions. 3) Task development suite: This suite streamlines the creation of driving tasks, enabling easy definition of traffic flows and vehicle routes, along with automatic collection of multi-modal observation data. A visualization server allows users to trace real-time agent driving videos and performance metrics through a browser. Furthermore, we conduct extensive experiments using built-in tasks to evaluate the performance and potential of WMs in autonomous driving. Thanks to the richness and flexibility of CarDreamer, we also systematically study the impact of observation modality, observability, and sharing of vehicle intentions on AV safety and efficiency. All code and documents are accessible on this https URL.
为了在复杂的现实场景中安全导航,自动驾驶车辆必须能够适应各种道路条件并预测未来事件。基于强化学习的(RL)世界模型(WM)作为一种有前景的方法,通过学习和预测各种环境中的复杂动态而 emergence。然而,据我们所知,目前没有可用的平台来训练和测试这种算法在复杂驾驶环境中的自动驾驶算法。为填补这一空白,我们介绍了CarDreamer,第一个专为开发基于RL的自驾算法而设计的开源学习平台。它包括三个关键组件:1)世界模型骨架:CarDreamer集成了一些最先进的WMs,简化了RL算法的复制。骨架与其余部分解耦并使用标准的Gym界面通信,以便用户轻松地将自己的算法集成和测试。2)内置任务:CarDreamer提供了一系列高度可配置的驾驶任务,与Gym接口兼容,并配备经过实证优化的奖励函数。3)任务开发套件:该套件简化了驾驶任务的创建,用户可以轻松定义交通流量和车辆路线,并自动收集多模态观察数据。可视化服务器允许用户通过浏览器追踪实时代理驾驶员的视频和性能指标。此外,我们使用内置任务对WMs在自动驾驶中的性能和潜力进行了广泛的实验评估。由于CarDreamer的丰富性和灵活性,我们还系统地研究了观测模式、可观测性和车辆意图共享对AV安全性和效率的影响。所有代码和文档都可以在https://这个URL访问。
https://arxiv.org/abs/2405.09111
Humans use collaborative robots as tools for accomplishing various tasks. The interaction between humans and robots happens in tight shared workspaces. However, these machines must be safe to operate alongside humans to minimize the risk of accidental collisions. Ensuring safety imposes many constraints, such as reduced torque and velocity limits during operation, thus increasing the time to accomplish many tasks. However, for applications such as using collaborative robots as haptic interfaces with intermittent contacts for virtual reality applications, speed limitations result in poor user experiences. This research aims to improve the efficiency of a collaborative robot while improving the safety of the human user. We used Gaussian process models to predict human hand motion and developed strategies for human intention detection based on hand motion and gaze to improve the robot's task time and the human's safety in a virtual environment. We then studied the effect of prediction. Results from comparisons show that the prediction models improved the robot time by 3% and safety by 17%. When used alongside gaze, prediction with Gaussian process models resulted in an improvement of the robot time by 2% and the safety by 13%.
人类使用协作机器人作为完成各种任务的工具。人类和机器人之间的互动发生在紧密共享的工作空间中。然而,为了最小化意外碰撞的风险,这些机器必须安全地与人类一起操作。确保安全性会带来许多限制,例如在操作期间减小扭矩和速度限制,从而增加完成许多任务的所需时间。然而,对于将协作机器人用作虚拟现实应用中的触觉接口的应用,速度限制会导致用户体验差。这项研究旨在提高协作机器人的效率,同时提高人类用户的可靠性。我们使用高斯过程模型预测人类手部运动,并基于手部动作和眼神来开发了人类意图检测策略,以提高机器人和人类在虚拟环境中的安全时间。然后我们研究了预测的影响。比较结果表明,预测模型提高了机器人的时间3%,安全性提高了17%。当与眼神结合使用时,使用高斯过程模型的预测提高了机器人的时间2%,安全性提高了13%。
https://arxiv.org/abs/2405.09109
The discovery of linear embedding is the key to the synthesis of linear control techniques for nonlinear systems. In recent years, while Koopman operator theory has become a prominent approach for learning these linear embeddings through data-driven methods, these algorithms often exhibit limitations in generalizability beyond the distribution captured by training data and are not robust to changes in the nominal system dynamics induced by intrinsic or environmental factors. To overcome these limitations, this study presents an adaptive Koopman architecture capable of responding to the changes in system dynamics online. The proposed framework initially employs an autoencoder-based neural network that utilizes input-output information from the nominal system to learn the corresponding Koopman embedding offline. Subsequently, we augment this nominal Koopman architecture with a feed-forward neural network that learns to modify the nominal dynamics in response to any deviation between the predicted and observed lifted states, leading to improved generalization and robustness to a wide range of uncertainties and disturbances compared to contemporary methods. Extensive tracking control simulations, which are undertaken by integrating the proposed scheme within a Model Predictive Control framework, are used to highlight its robustness against measurement noise, disturbances, and parametric variations in system dynamics.
线性嵌入的发现是 nonlinear系统线性控制技术合成的关键。近年来,虽然Koopman操作子理论通过数据驱动方法学习这些线性嵌入取得了突出地位,但这些算法在泛化能力上常常存在局限性,不仅限于训练数据所捕获的分布,而且对由内生或环境因素引起的拟合系统动态变化不具有鲁棒性。为了克服这些限制,本研究提出了一个自适应Koopman架构,能够在线系统动态变化发生时响应变化。所提出的框架首先采用了一个基于自动编码器的神经网络,利用名义系统的输入-输出信息来学习相应的Koopman嵌入。随后,我们通过一个前馈神经网络来增强这种名义Koopman架构,使其能够根据预测和观察到的抬升状态对名义动态进行修改,从而提高泛化能力和对当代方法的鲁棒性。为了检验这种方法对测量噪声、干扰和系统动态参数变化等的鲁棒性,我们在Model预测控制框架中进行了广泛的跟踪控制仿真。
https://arxiv.org/abs/2405.09101
Sensor placement optimization methods have been studied extensively. They can be applied to a wide range of applications, including surveillance of known environments, optimal locations for 5G towers, and placement of missile defense systems. However, few works explore the robustness and efficiency of the resulting sensor network concerning sensor failure or adversarial attacks. This paper addresses this issue by optimizing for the least number of sensors to achieve multiple coverage of non-simply connected domains by a prescribed number of sensors. We introduce a new objective function for the greedy (next-best-view) algorithm to design efficient and robust sensor networks and derive theoretical bounds on the network's optimality. We further introduce a Deep Learning model to accelerate the algorithm for near real-time computations. The Deep Learning model requires the generation of training examples. Correspondingly, we show that understanding the geometric properties of the training data set provides important insights into the performance and training process of deep learning techniques. Finally, we demonstrate that a simple parallel version of the greedy approach using a simpler objective can be highly competitive.
传感器部署优化方法已经得到了广泛研究。它们可以应用于广泛的领域,包括监测已知环境、5G塔的最佳位置和导弹防御系统的部署。然而,很少有工作探讨传感器网络关于传感器故障或对抗攻击的鲁棒性和效率。本文通过优化传感器数量,实现对非简单连通领域指定传感器数量的多重覆盖,解决了这个问题。我们引入了一个新的目标函数,用于贪婪(下一个最佳视角)算法,以设计高效和鲁棒的传感器网络,并得出关于网络最优性的理论界线。我们进一步引入了一个深度学习模型,以加速算法的近实时计算。深度学习模型需要生成训练示例。因此,我们证明了理解训练数据集的几何特征提供了对深度学习技术性能和训练过程的重要见解。最后,我们证明了使用更简单的目标函数的简单并行版本可以具有很高的竞争力。
https://arxiv.org/abs/2405.09096
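The greedy (next-best-view) idea with multiple coverage can be sketched as follows. This sketch uses a plain "remaining coverage demand" objective on a discretized set of points, which is an assumption for illustration and not the new objective function introduced in the paper. Candidate names and visibility sets are hypothetical.

```python
def greedy_k_coverage(candidates, points, k):
    """Greedily pick sensors until every point is seen by at least k sensors.

    candidates: dict mapping a sensor name to the set of points it sees.
    points: iterable of points that must be covered.
    Returns the chosen sensor names in selection order.
    """
    need = {p: k for p in points}       # remaining coverage demand per point
    remaining = dict(candidates)
    chosen = []

    def gain(name):
        # Number of points whose outstanding demand this sensor would reduce.
        return sum(1 for p in remaining[name] if need.get(p, 0) > 0)

    while remaining and any(v > 0 for v in need.values()):
        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break  # no candidate helps; full k-coverage is infeasible
        for p in remaining.pop(best):
            if need.get(p, 0) > 0:
                need[p] -= 1
        chosen.append(best)
    return chosen


# Usage: three points, each of which must be seen by two sensors.
sensors = {"a": {1, 2, 3}, "b": {1, 2}, "c": {3}, "d": {3}}
selection = greedy_k_coverage(sensors, [1, 2, 3], k=2)
```

Requiring k > 1 is what gives robustness: if one selected sensor fails, every point is still observed by at least k - 1 others.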
This study developed an explainable AI for ship collision avoidance. Initially, a critic network composed of sub-task critic networks was proposed to individually evaluate each sub-task in collision avoidance to clarify the AI decision-making processes involved. Additionally, an attempt was made to discern behavioral intentions through a Q-value analysis and an Attention mechanism. The former focused on interpreting intentions by examining the increment of the Q-value resulting from AI actions, while the latter incorporated the significance of other ships in the decision-making process for collision avoidance into the learning objective. AI's behavioral intentions in collision avoidance were visualized by combining the perceived collision danger with the degree of attention to other ships. The proposed method was evaluated through a numerical experiment. The developed AI was confirmed to be able to safely avoid collisions under various congestion levels, and AI's decision-making process was rendered comprehensible to humans. The proposed method not only facilitates the understanding of DRL-based controllers/systems in the ship collision avoidance task but also extends to any task comprising sub-tasks.
这项研究开发了一个可解释性AI用于避碰。最初,提出了一种由子任务批评网络组成的批评网络,以单独评估避碰中的每个子任务,以阐明涉及避碰AI决策过程。此外,通过Q值分析和关注机制试图通过分析AI行动产生的Q值增量来辨别行为意图。前一个方法专注于通过检查AI行动产生的Q值的增加来解释意图,而另一个方法将其他船舶在避碰决策过程中的重要性纳入学习目标。通过将感知避碰危险与关注其他船舶的程度相结合,可视化了避碰AI的行为意图。所提出的方法通过数值实验进行了评估。经证实,该AI在各种拥塞级别下能够安全避碰,并且AI的决策过程对人类是可理解的。所提出的方法不仅有助于在避碰任务中理解基于强化学习的控制器/系统,而且还能扩展到包括子任务在内的任何任务。
https://arxiv.org/abs/2405.09081
We introduce BEVRender, a novel learning-based approach for the localization of ground vehicles in Global Navigation Satellite System (GNSS)-denied off-road scenarios. These environments are typically challenging for conventional vision-based state estimation due to the lack of distinct visual landmarks and the instability of vehicle poses. To address this, BEVRender generates high-quality local bird's eye view (BEV) images of the local terrain. Subsequently, these images are aligned with a geo-referenced aerial map via template-matching to achieve accurate cross-view registration. Our approach overcomes the inherent limitations of visual inertial odometry systems and the substantial storage requirements of image-retrieval localization strategies, which are susceptible to drift and scalability issues, respectively. Extensive experimentation validates BEVRender's advancement over existing GNSS-denied visual localization methods, demonstrating notable enhancements in both localization accuracy and update frequency. The code for BEVRender will be made available soon.
我们提出了BEVRender,一种新的基于学习的在GNSS拒绝的离线场景中定位地面车辆的新方法。这些环境通常对传统视觉状态估计方法具有挑战性,因为缺乏明显的视觉地标和车辆姿态的不稳定性。为了应对这个问题, BEVRender生成高质量的局部鸟瞰(BEV)图像,并通过模板匹配与地理参考的无人机地图对它们进行对齐,以实现准确的跨视图配准。我们的方法克服了视觉惯性导航系统的固有局限性和图像检索定位策略需要大量存储空间的问题,这些问题容易受到漂移和可扩展性的影响。大量的实验证实,BEVRender在现有GNSS拒绝的视觉本地化方法中取得了显著的进步,证明了在定位精度和更新频率方面的显著改进。BEVRender的代码即将发布。
https://arxiv.org/abs/2405.09001
Autonomous systems often encounter environments and scenarios beyond the scope of their training data, which underscores a critical challenge: the need to generalize and adapt to unseen scenarios in real time. This challenge necessitates new mathematical and algorithmic tools that enable adaptation and zero-shot transfer. To this end, we leverage the theory of function encoders, which enables zero-shot transfer by combining the flexibility of neural networks with the mathematical principles of Hilbert spaces. Using this theory, we first present a method for learning a space of dynamics spanned by a set of neural ODE basis functions. After training, the proposed approach can rapidly identify dynamics in the learned space using an efficient inner product calculation. Critically, this calculation requires no gradient calculations or retraining during the online phase. This method enables zero-shot transfer for autonomous systems at runtime and opens the door for a new class of adaptable control algorithms. We demonstrate state-of-the-art system modeling accuracy for two MuJoCo robot environments and show that the learned models can be used for more efficient MPC control of a quadrotor.
自主系统通常会面临其训练数据范围之外的环境和场景,这凸显了一个关键挑战:需要在实时情况下对未见过的场景进行泛化和适应。这个挑战需要新的数学和算法工具来实现适应和零样本转移。为此,我们利用函数编码器的理论,该理论通过结合神经网络的灵活性和Hilbert空间数学原理来实现零样本转移。使用这个理论,我们首先提出了一种学习由一组神经ODE基础函数组成的动态空间的方法。训练后,所提出的方法可以迅速地在学习到的空间中识别出动态。关键的是,这个计算在在线阶段不需要梯度计算或重新训练。这种方法使得自主系统在实时情况下实现零样本转移,并为新的自适应控制算法打开了大门。我们用两个MuJoCo机器人环境证明了最先进的系统建模精度,并表明所学习到的模型可以用于更有效的MPC控制四旋翼。
https://arxiv.org/abs/2405.08954
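The "efficient inner product calculation" in this abstract can be illustrated with a toy example: given fixed basis functions, the coefficients of a new function are estimated as empirical inner products over a few samples, with no gradient steps. The two hand-written basis functions below stand in for the trained neural ODE basis and are assumed orthonormal under the empirical inner product on the chosen samples. This is a sketch of the idea, not the authors' code.

```python
import math


def inner_product(f_values, g_values):
    """Empirical inner product: the mean of f(x) * g(x) over the samples."""
    return sum(f * g for f, g in zip(f_values, g_values)) / len(f_values)


def encode(samples, observations, basis):
    """Coefficients of the observed function in the (orthonormal) basis."""
    return [inner_product(observations, [g(x) for x in samples]) for g in basis]


def decode(coefficients, basis, x):
    """Evaluate the reconstructed function at x."""
    return sum(c * g(x) for c, g in zip(coefficients, basis))


# Hypothetical basis, orthonormal on the sample set {-1, 0, 1}.
basis = [lambda x: 1.0, lambda x: math.sqrt(1.5) * x]
samples = [-1.0, 0.0, 1.0]

# A previously unseen function, observed only at the samples.
unseen = lambda x: 2.0 + 3.0 * x
coeffs = encode(samples, [unseen(x) for x in samples], basis)
estimate = decode(coeffs, basis, 0.5)
```

The online cost is one pass over the data per basis function, which is what allows adaptation at runtime without retraining.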
For the shape control of deformable free-form surfaces, simulation plays a crucial role in establishing the mapping between the actuation parameters and the deformed shapes. The differentiation of this forward kinematic mapping is usually employed to solve the inverse kinematic problem for determining the actuation parameters that can realize a target shape. However, the free-form surfaces obtained from simulators are always different from the physically deformed shapes due to the errors introduced by hardware and the simplification adopted in physical simulation. To fill the gap, we propose a novel deformation function based sim-to-real learning method that can map the geometric shape of a simulated model into its corresponding shape of the physical model. Unlike the existing sim-to-real learning methods that rely on completely acquired dense markers, our method accommodates sparsely distributed markers and can resiliently use all captured frames -- even for those in the presence of missing markers. To demonstrate its effectiveness, our sim-to-real method has been integrated into a neural network-based computational pipeline designed to tackle the inverse kinematic problem on a pneumatically actuated deformable mannequin.
对于可变形自由曲面形状的控制,仿真在确定驱动参数与变形形状之间的映射方面起着关键作用。通常采用向前运动学映射的微分来求解确定驱动参数以实现目标形状的反向运动学问题。然而,由于硬件误差和物理仿真中简化的采用,从仿真器获得的自由曲面总是与物理变形形状不同。为了填补这一空白,我们提出了一个基于仿真的学习方法的新颖变形函数,可以将模拟模型的几何形状映射到物理模型的相应形状。与现有的仿真学习方法完全基于获得的密集标记不同,我们的方法可以适应稀疏分布的标记,并且可以弹性地使用所有捕获的帧——即使是在缺失标记的情况下。为了证明其有效性,我们的仿真方法已经集成到了一个用于解决气动驱动可变形人体模型的反向运动学问题的神经网络计算管道中。
https://arxiv.org/abs/2405.08935
Non-prehensile manipulation enables fast interactions with objects by circumventing the need to grasp and ungrasp as well as handling objects that cannot be grasped through force closure. Current approaches to non-prehensile manipulation focus on static contacts, avoiding the underactuation that comes with sliding. However, the ability to control sliding contact, essentially removing the no-slip constraint, opens up new possibilities in dynamic manipulation. In this paper, we explore a challenging dynamic non-prehensile manipulation task that requires the consideration of the full spectrum of hybrid contact modes. We leverage recent methods in contact-implicit MPC to handle the multi-modal planning aspect of the task. We demonstrate, with careful consideration of integration between the simple model used for MPC and the low-level tracking controller, how contact-implicit MPC can be adapted to dynamic tasks. Surprisingly, despite the known inaccuracies of frictional rigid contact models, our method is able to react to these inaccuracies while still quickly performing the task. Moreover, we do not use common aids such as reference trajectories or motion primitives, highlighting the generality of our approach. To the best of our knowledge, this is the first application of contact-implicit MPC to a dynamic manipulation task in three dimensions.
非抓取操作使通过绕开抓取和解抓取的需要,以及处理无法通过力闭合来抓取的对象,实现了与物体的高速互动。目前,非抓取操作方法主要关注静态接触,避免与滑动相关的松动。然而,控制滑动接触的能力,本质上消除松动约束,为动态操作带来了新的可能性。在本文中,我们探讨了一个具有挑战性的动态非抓取操作任务,需要考虑混合接触模式的完整范围。我们利用最近在接触隐式MPC中的方法来处理任务的 Multi-modal 规划方面。我们证明了,在仔细考虑简单模型用于MPC和低级跟踪控制器之间的集成的情况下,接触隐式MPC可以适应动态任务。令人惊讶的是,尽管已知摩擦刚性接触模型的不准确度,但我们的方法仍然能够应对这些不准确度,同时仍然快速地完成任务。此外,我们没有使用常见的辅助工具,如参考轨迹或运动原型,这突出了我们方法的普遍性。据我们所知,这是在三维空间中第一次将接触隐式MPC应用于动态操作任务。
https://arxiv.org/abs/2405.08731
Ensuring safety and adapting to the user's behavior are of paramount importance in physical human-robot interaction. Thus, incorporating elastic actuators in the robot's mechanical design has become popular, since it offers intrinsic compliance and additionally provides a coarse estimate of the interaction force by measuring the deformation of the elastic components. While observer-based methods have been shown to improve these estimates, they rely on accurate models of the system, which are challenging to obtain in complex operating environments. In this work, we overcome this issue by learning the unknown dynamics components using Gaussian process (GP) regression. By employing the learned model in a Bayesian filtering framework, we improve the estimation accuracy and additionally obtain an observer that explicitly considers local model uncertainty in the confidence measure of the state estimate. Furthermore, we derive guaranteed estimation error bounds, thus facilitating the use in safety-critical applications. We demonstrate the effectiveness of the proposed approach experimentally in a human-exoskeleton interaction scenario.
在物理人机交互中确保安全和适应用户行为至关重要。因此,将弹性执行器纳入机器设计中已成为一种流行的方法,因为它提供了固有的顺应性,并且通过测量弹性部件的变形程度,还提供了一个粗略的估计值来计算交互力。虽然基于观察者的方法已经证明了这些估计的改善,但是它们依赖于准确系统模型的准确性,而在复杂操作环境中获得这种准确性是非常困难的。在这项工作中,我们通过使用高斯过程(GP)回归学习未知动态组件。通过将学习到的模型应用于贝叶斯滤波框架,我们提高了估计精度和附加观测器,它明确考虑了状态估计中局部模型不确定性。此外,我们导出了保证估计误差上限,从而促进在关键应用中使用。我们在人机协同操作场景中验证了所提出的方法的实效性。
https://arxiv.org/abs/2405.08711
We present an analytic solution to the 3D Dubins path problem for paths composed of an initial circular arc, a straight component, and a final circular arc. These are commonly called CSC paths. By modeling the start and goal configurations of the path as the base frame and final frame of an RRPRR manipulator, we treat this as an inverse kinematics problem. The kinematic features of the 3D Dubins path are built into the constraints of our manipulator model. Furthermore, we show that the number of solutions is not constant, with up to seven valid CSC path solutions even in non-singular regions. An implementation of the solution is available at this https URL.
我们提出了一个解析解来解决由初始圆弧、直线段和最终圆弧组成的路径问题,这些路径通常称为CSC路径。通过将路径的起点和终点配置作为RRPRR操作器的基帧和目标帧,我们将此问题视为反向运动学问题。3D Dubins路径的刚性特征已融入了我们的操作器模型约束中。此外,我们还证明了解决方案的数量不是常数,即使在非奇异区域内,也有多达七种有效的CSC路径解决方案。解决方案的实现可通过此链接https://www.xxx处获得。
https://arxiv.org/abs/2405.08710
This study investigates the computational speed and accuracy of two numerical integration methods, cubature and sampling-based, for integrating an integrand over a 2D polygon. Using a group of rovers searching the Martian surface with a limited sensor footprint as a test bed, the relative error and computational time are compared as the area was subdivided to improve accuracy in the sampling-based approach. The results show that the sampling-based approach exhibits a $14.75\%$ deviation in relative error compared to cubature when it matches the computational performance at $100\%$. Furthermore, achieving a relative error below $1\%$ necessitates a $10000\%$ increase in relative time to calculate due to the $\mathcal{O}(N^2)$ complexity of the sampling-based method. It is concluded that for enhancing reinforcement learning capabilities and other high iteration algorithms, the cubature method is preferred over the sampling-based method.
这项研究探讨了两种数值积分方法:立方和基于采样的方法,在整合一个二维多边形中的积分多项式的计算速度和精度。使用一组轮式机器人,其有限传感器足迹作为火星表面测试台,将采样的精度与分段面积的提高精度进行比较,当采样方法在计算性能达到100%时。结果显示,与立方相比,基于采样的方法在相对误差方面存在14.75%的偏差。此外,为了实现相对误差低于1%,需要将相对时间增加10000%以计算由于采样的方法 $\mathcal{O}(N^2)$ 复杂性。因此,结论是,为了增强强化学习能力和其他高迭代算法,立方方法比基于采样的方法更受欢迎。
https://arxiv.org/abs/2405.08691
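The contrast between the two integration approaches can be sketched on a convex polygon. The cubature side uses a fan triangulation with the standard edge-midpoint rule, which is exact for polynomials up to degree 2 on each triangle. The sampling side evaluates the integrand at the centers of an n-by-n grid over the bounding box, so its cost grows as O(n^2). Both choices are assumptions for illustration; the abstract does not state which cubature rule or sampling scheme the authors used.

```python
def triangle_area(a, b, c):
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def cubature_convex(f, poly):
    """Edge-midpoint rule on a fan triangulation of a convex polygon."""
    total = 0.0
    a = poly[0]
    for b, c in zip(poly[1:-1], poly[2:]):
        mids = [((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
                for p, q in ((a, b), (b, c), (c, a))]
        total += triangle_area(a, b, c) * sum(f(x, y) for x, y in mids) / 3.0
    return total


def inside_convex(pt, poly):
    """True if pt lies inside a convex polygon given in counter-clockwise order."""
    n = len(poly)
    for i in range(n):
        p, q = poly[i], poly[(i + 1) % n]
        if (q[0] - p[0]) * (pt[1] - p[1]) - (q[1] - p[1]) * (pt[0] - p[0]) < 0:
            return False
    return True


def grid_sampling(f, poly, n):
    """Sum f at the centers of an n-by-n grid over the bounding box."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    dx = (max(xs) - min(xs)) / n
    dy = (max(ys) - min(ys)) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            pt = (min(xs) + (i + 0.5) * dx, min(ys) + (j + 0.5) * dy)
            if inside_convex(pt, poly):
                total += f(*pt) * dx * dy
    return total


# Usage: integrate x^2 + y^2 over the unit square (exact value 2/3).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
f = lambda x, y: x * x + y * y
by_cubature = cubature_convex(f, square)   # 6 function evaluations
by_sampling = grid_sampling(f, square, 10)  # 100 function evaluations
```

On this quadratic integrand the cubature result is exact up to rounding, while the grid estimate still carries a small error after many more evaluations, which mirrors the trade-off reported in the abstract.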