Ding X., Zhang X. D., Fan Z. G., Sun R. 6D pose estimation method based on light field decoupling [J]. Journal of Electronic Measurement and Instrumentation, 2024, 38(7): 46-54.
6D pose estimation method based on light field decoupling
  
DOI:
Keywords: pose estimation; light field decoupling; spatial and angular feature extraction; feature fusion
Funding: Science and Technology Major Project of Anhui Province (202103a06020010); Natural Science Foundation of Anhui Province (2208085MF158)
Authors and affiliations:
Ding Xiao, Zhang Xudong, Fan Zhiguo, Sun Rui
School of Computer and Information, Hefei University of Technology, Hefei 230601, China
Abstract:
      Light field imaging technology can simultaneously capture the spatial and angular information of light in a scene and is widely used in computer vision tasks. To overcome the limitations of RGB-based pose estimation methods in complex scenes with severe occlusion and truncation, illumination changes, and similarity between objects and backgrounds, a two-stage 6D pose estimation method based on light field decoupled feature fusion is proposed. The method employs multiple feature extractors to decouple the light field macro-pixel image and map it into the feature space, and introduces an attention mechanism to fuse the spatial, angular and EPI information, providing effective and reliable key features for the downstream pose estimation network. In addition, back-projection is applied in the keypoint prediction network to reduce information loss during feature transfer. Experiments on the LF-6Dpose light field pose estimation dataset show that the method achieves 91.37% and 70.12% on the average closest point 3D distance (ADD-S) and 2D projection metrics, respectively, a 12.5% improvement on the 3D distance metric over existing state-of-the-art methods, and better solves the problem of 6D object pose estimation in complex scenes.
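The abstract describes fusing spatial, angular and EPI features through an attention mechanism before pose estimation. As a rough, self-contained illustration of attention-weighted feature fusion (this is not the paper's actual network; the query vector, branch descriptors and dimensions are all invented for the sketch), a minimal scaled dot-product fusion over three branch descriptors might look like:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_fuse(query, branch_features):
    """Fuse per-branch descriptors (e.g. spatial, angular, EPI) using
    scaled dot-product attention weights derived from a shared query."""
    d = len(query)
    scores = [dot(query, f) / math.sqrt(d) for f in branch_features]
    weights = softmax(scores)
    # Weighted sum of the branch descriptors yields the fused descriptor.
    fused = [sum(w * f[i] for w, f in zip(weights, branch_features))
             for i in range(d)]
    return fused, weights

# Toy 4-D descriptors standing in for the spatial, angular and EPI branches.
spatial = [1.0, 0.0, 0.0, 0.0]
angular = [0.0, 1.0, 0.0, 0.0]
epi     = [0.0, 0.0, 1.0, 0.0]
query   = [1.0, 0.2, 0.1, 0.0]  # hypothetical query; here closest to the spatial branch

fused, weights = attention_fuse(query, [spatial, angular, epi])
```

In a real network the query and branch features would be learned tensors and the attention applied per spatial location; the scalar version above only shows how the weighting lets the fusion emphasize whichever branch is most informative.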