A novel robotic visual perception framework for underwater operation

Author(s):  Yue LU, Xingyu CHEN, Zhengxing WU, Junzhi YU, Li WEN

Affiliation(s):  State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Corresponding email(s):   junzhi.yu@ia.ac.cn

Key Words:  Underwater operation, Robotic perceptions, Visual restoration, Video object detection

Underwater robotic operation usually requires visual perception (e.g., object detection and tracking), but underwater scenes have poor visual quality and constitute a special domain, both of which can degrade perception accuracy. In addition, detection continuity and stability are important for robotic perception, yet the commonly used static, accuracy-based evaluation (i.e., average precision (AP)) is insufficient to reflect detector performance over time. In response to these two problems, we present a novel robotic visual perception framework. First, we investigate how a quality-diverse data domain and visual restoration each relate to detection performance. We find that, although domain quality has a negligible effect on within-domain detection accuracy, visual restoration benefits detection in real sea scenarios by reducing the domain shift. Moreover, non-reference assessments based on object tracklets are proposed for detection continuity and stability. Further, an online tracklet refinement (OTR) is developed to improve the temporal performance of detectors. Finally, combined with visual restoration, an accurate and stable underwater robotic visual perception framework is established. Small-overlap suppression (SOS) is proposed to extend video object detection (VID) methods to a single-object tracking task, providing the flexibility to switch between detection and tracking. Extensive experiments were conducted on the ImageNet VID dataset and real-world robotic tasks to verify the correctness of our analysis and the superiority of the proposed approaches. The code is available at https://github.com/yrqs/VisPerception.
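
The abstract names small-overlap suppression (SOS) without defining it. As a rough illustration only, one plausible reading is that per-frame detections with little overlap against the previously tracked box are discarded, so that a detector's output can be followed as a single-object track. The sketch below works under that assumption; the function names, the `min_iou` threshold, and the "keep the highest-scoring survivor" rule are all hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def small_overlap_suppression(
    prev_box: Box,
    detections: List[Detection],
    min_iou: float = 0.3,  # assumed threshold, not from the paper
) -> Optional[Detection]:
    """Drop detections that overlap the previous target box too little,
    then return the highest-scoring survivor (or None if none survive)."""
    kept = [d for d in detections if iou(prev_box, d[0]) >= min_iou]
    if not kept:
        return None
    return max(kept, key=lambda d: d[1])


if __name__ == "__main__":
    prev = (0.0, 0.0, 10.0, 10.0)
    frame_detections = [
        ((1.0, 1.0, 11.0, 11.0), 0.6),    # overlaps the previous target
        ((50.0, 50.0, 60.0, 60.0), 0.9),  # confident, but far from the target
    ]
    print(small_overlap_suppression(prev, frame_detections))
```

In this sketch the distant detection is suppressed despite its higher score, so the track stays on the nearby box. Returning `None` when nothing survives leaves the caller free to fall back to plain detection, which is one way the switch between detection and tracking mentioned above could be realized.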

