Frontiers of Information Technology & Electronic Engineering  2024 Vol.25 No.5 P.755-762


SEVAR: a stereo event camera dataset for virtual and augmented reality

Author(s):  Yuda DONG, Zetao CHEN, Xin HE, Lijun LI, Zichao SHU, Yinong CAO, Junchi FENG, Shijie LIU, Chunlai LI, Jianyu WANG

Affiliation(s):  Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China

Corresponding email(s):   dongyuda21@mails.ucas.ac.cn, zetao-chen@ylab.ac.cn, xinhe@ucas.ac.cn

Event cameras, characterized by low latency, large dynamic range, and extremely high temporal resolution, have recently received increasing attention. These features make them particularly well suited to virtual/augmented reality (VR/AR) applications. To facilitate the development of three-dimensional (3D) perception and navigation algorithms for VR/AR applications using event cameras, we introduce the Stereo Event camera dataset for Virtual and Augmented Reality (SEVAR), which comprises a wide variety of head-mounted indoor sequences, including scenarios with rapid motion and a large dynamic range. We present the first comprehensive set of VR/AR datasets captured with an event-based stereo camera, a regular stereo camera at 30 Hz, and an inertial measurement unit at 1000 Hz. The camera placement, field of view (FoV), and resolution match those of head-mounted devices such as the Meta Quest Pro. All sensors are time-synchronized in hardware. Ground truth poses captured by a motion capture system are also provided for trajectory evaluation. The sequences include several common scenarios and cover the specific challenges that event cameras are designed to address. The dataset can be found at https://github.com/sevar-dataset/sevar.


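Abstract (translated from Chinese): In recent years, event cameras have attracted growing attention for their low latency, high dynamic range, and high temporal resolution. These characteristics make them particularly well suited to virtual and augmented reality (VR/AR). To advance 3D perception and localization algorithms for event cameras in VR/AR applications, we introduce the Stereo Event camera dataset for Virtual and Augmented Reality (SEVAR). The dataset is centered on a head-mounted device and covers sequences from several common indoor scenes, including challenging scenarios with rapid motion and high dynamic range that target event cameras. We release the first perception and localization dataset for VR/AR scenes, captured with a stereo event camera, a 30 Hz standard stereo camera, and a 1000 Hz inertial measurement unit. The camera placement, field of view, and resolution are similar to those of commercial head-mounted devices such as the Meta Quest Pro. All sensors are time-synchronized in hardware. To better support the evaluation of localization accuracy and trajectories, ground truth poses captured by a motion capture system are provided. The dataset is available at https://github.com/sevar-dataset/sevar.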
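
The abstract notes that motion-capture ground truth poses are provided for trajectory evaluation. As a rough illustration of what such an evaluation can look like, the sketch below pairs estimated poses with ground truth poses by timestamp and computes a root-mean-square translational error. It is a minimal example under stated assumptions, not the authors' evaluation protocol: the `(t, x, y, z)` tuple layout, the function names `associate` and `ate_rmse`, and the 5 ms matching tolerance are all hypothetical, and both trajectories are assumed to be already expressed in the same coordinate frame (a full evaluation would normally align them first).

```python
import bisect
import math


def associate(est, gt, max_dt=0.005):
    """Pair each estimated pose with the nearest-in-time ground truth pose.

    est, gt: lists of (t, x, y, z) tuples sorted by timestamp t in seconds
    (a hypothetical layout, not the dataset's actual file format).
    max_dt: largest timestamp gap, in seconds, accepted for a match (assumed).
    """
    gt_times = [p[0] for p in gt]
    pairs = []
    for e in est:
        i = bisect.bisect_left(gt_times, e[0])
        # Candidates are the ground truth samples on either side of e's time.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(gt)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(gt_times[k] - e[0]))
        if abs(gt_times[j] - e[0]) <= max_dt:
            pairs.append((e, gt[j]))
    return pairs


def ate_rmse(pairs):
    """Root-mean-square translational error over matched pose pairs."""
    if not pairs:
        raise ValueError("no matched poses")
    squared = [
        sum((a - b) ** 2 for a, b in zip(e[1:4], g[1:4]))
        for e, g in pairs
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Estimated poses with no ground truth sample within `max_dt` are skipped rather than interpolated, which keeps the sketch short at the cost of discarding some poses when the two streams run at different rates.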



