
CLC number: 

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2022-08-26


 ORCID:

Xin TONG

https://orcid.org/0000-0001-8788-2453


Frontiers of Information Technology & Electronic Engineering  2022 Vol.23 No.9 P.1290-1297

http://doi.org/10.1631/FITEE.2200318


Three-dimensional shape space learning for visual concept construction: challenges and research progress


Author(s):  Xin TONG

Affiliation(s):  Microsoft Research Asia, Beijing 100080, China

Corresponding email(s): 

Key Words:  Visual concept; visual knowledge; 3D geometry learning; 3D shape space; 3D structure


Xin TONG. Three-dimensional shape space learning for visual concept construction: challenges and research progress[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(9): 1290-1297.

@article{title="Three-dimensional shape space learning for visual concept construction: challenges and research progress",
author="Xin TONG",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="23",
number="9",
pages="1290-1297",
year="2022",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2200318"
}



Abstract: 
Human beings can easily group three-dimensional (3D) objects with similar shapes and functions into a set of “visual concepts” and learn “visual knowledge” of the surrounding 3D real world (Pan, 2019). Developing efficient methods to learn computational representations of visual concepts and visual knowledge is a critical task in artificial intelligence (Pan, 2021a). A crucial step toward this goal is to learn the shape space spanned by all 3D objects that belong to one visual concept. In this paper, we present the key technical challenges and recent research progress in 3D shape space learning, and discuss the open problems and research opportunities in this area.
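To make "shape space learning" concrete, below is a minimal, illustrative PyTorch sketch in the spirit of the auto-decoder approach of DeepSDF (Park et al., 2019), one of the works cited in the reference list: every shape in a category receives a learnable latent code, and a shared coordinate-based network maps a latent code and a 3D point to a signed distance, so that the latent codes span a learned shape space. All class names, dimensions, and the placeholder training data are assumptions made for illustration only, not the paper's implementation.

```python
# Minimal sketch of latent 3D shape space learning (auto-decoder style).
# Illustrative only: network sizes and the random "SDF labels" are placeholders.
import torch
import torch.nn as nn

class LatentSDFDecoder(nn.Module):
    """Shared decoder: concat(latent code, xyz point) -> signed distance."""
    def __init__(self, latent_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # predicted signed distance
        )

    def forward(self, codes: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        # codes: (B, latent_dim), points: (B, N, 3)
        expanded = codes.unsqueeze(1).expand(-1, points.shape[1], -1)
        return self.net(torch.cat([expanded, points], dim=-1)).squeeze(-1)

num_shapes, latent_dim = 16, 64
decoder = LatentSDFDecoder(latent_dim)
# One latent vector per training shape; optimizing these jointly with the
# decoder ("auto-decoder" training) makes the codes span a learned shape space.
latents = nn.Embedding(num_shapes, latent_dim)
nn.init.normal_(latents.weight, std=0.01)
optim = torch.optim.Adam(
    list(decoder.parameters()) + list(latents.parameters()), lr=1e-3)

for step in range(100):  # toy loop with random supervision as a placeholder
    shape_ids = torch.randint(0, num_shapes, (8,))
    points = torch.rand(8, 256, 3) * 2 - 1    # sample points in [-1, 1]^3
    gt_sdf = torch.rand(8, 256) * 0.2 - 0.1   # placeholder SDF labels
    pred = decoder(latents(shape_ids), points)
    loss = nn.functional.l1_loss(pred, gt_sdf)
    optim.zero_grad(); loss.backward(); optim.step()

# New shapes of the same concept can then be generated or reconstructed by
# interpolating or optimizing latent codes while the shared decoder stays fixed.
```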

Three-dimensional shape space learning for visual concept construction: challenges and research progress

Xin TONG
Microsoft Research Asia, Beijing 100080, China
Abstract: Humans can readily categorize objects in the real world by shape or function, and form in their minds a visual concept for each category of objects as well as visual knowledge of the surrounding real world (Pan, 2019). Pan (2021a) pointed out that building computational representations of these visual concepts and this visual knowledge is a key step toward next-generation artificial intelligence. Learning the three-dimensional shape space of all objects under the same visual concept is a key step toward a computational representation of visual concepts. This paper presents the key technical challenges in three-dimensional shape space learning, reviews the research progress in this field around these challenges, and finally discusses research trends and future directions of three-dimensional shape space learning.

Key words: Visual concept; visual knowledge; 3D geometry learning; 3D shape space; 3D structure


Reference

[1]Bai S, Bai X, Zhou ZC, et al., 2016. GIFT: a real-time and scalable 3D shape search engine. IEEE Conf on Computer Vision and Pattern Recognition, p.5023-5032.

[2]Cao C, Weng YL, Zhou S, et al., 2014. FaceWareHouse: a 3D facial expression database for visual computing. IEEE Trans Visual Comput Graph, 20(3):413-425.

[3]Chan ER, Monteiro M, Kellnhofer P, et al., 2021. pi-GAN: periodic implicit generative adversarial networks for 3D-aware image synthesis. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.5799-5809.

[4]Chen ZQ, Zhang H, 2019. Learning implicit fields for generative shape modeling. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.5932-5941.

[5]Deng Y, Yang JL, Tong X, 2021. Deformed implicit field: modeling 3D shapes with learned dense correspondence. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.10286-10296.

[6]Deng Y, Yang J, Xiang J, et al., 2022. GRAM: generative radiance manifolds for 3D-aware image generation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.10673-10683.

[7]Egger B, Smith WAP, Tewari A, et al., 2020. 3D morphable face models: past, present, and future. ACM Trans Graph, 39(5):157.

[8]Gadelha M, Maji S, Wang R, 2017. 3D shape induction from 2D views of multiple objects. Int Conf on 3D Vision, p.402-411.

[9]Groueix T, Fisher M, Kim VG, et al., 2018. A Papier-Mâché approach to learning 3D surface generation. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.216-224.

[10]Hughes JF, van Dam A, McGuire M, et al., 2013. Computer Graphics: Principles and Practice (3rd Ed.). Addison-Wesley, Upper Saddle River, USA.

[11]Jiang C, Huang J, Tagliasacchi A, et al., 2020. ShapeFlow: learnable deformation flows among 3D shapes. Advances in Neural Information Processing Systems 33, p.9745-9757.

[12]Jin YW, Jiang DQ, Cai M, 2020. 3D reconstruction using deep learning: a survey. Commun Inform Syst, 20(4):389-413.

[13]Li X, Dong Y, Peers P, et al., 2019. Synthesizing 3D shapes from silhouette image collections using multi-projection generative adversarial networks. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.5530-5539.

[14]Liu F, Liu XM, 2020. Learning implicit functions for topology-varying dense 3D shape correspondence. Proc 34th Int Conf on Neural Information Processing Systems, p.4823-4834.

[15]Loper M, Mahmood N, Romero J, et al., 2015. SMPL: a skinned multi-person linear model. ACM Trans Graph, 34(6):248.

[16]Lun ZL, Gadelha M, Kalogerakis E, et al., 2017. 3D shape reconstruction from sketches via multi-view convolutional networks. Proc Int Conf on 3D Vision, p.67-77.

[17]Masci J, Boscaini D, Bronstein MM, et al., 2015. Geodesic convolutional neural networks on Riemannian manifolds. Proc IEEE Int Conf on Computer Vision Workshop, p.832-840.

[18]Měch R, Prusinkiewicz P, 1996. Visual models of plants interacting with their environment. Proc 23rd Annual Conf on Computer Graphics and Interactive Techniques, p.397-410.

[19]Mescheder L, Oechsle M, Niemeyer M, et al., 2019. Occupancy networks: learning 3D reconstruction in function space. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.4455-4465.

[20]Mo KC, Zhu SL, Chang AX, et al., 2019. PartNet: a large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.909-918.

[21]Müller P, Wonka P, Haegler S, et al., 2006. Procedural modeling of buildings. ACM SIGGRAPH Papers, p.614-623.

[22]Niu CJ, Li J, Xu K, 2018. Im2Struct: recovering 3D shape structure from a single RGB image. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.4521-4529.

[23]Pan YH, 2019. On visual knowledge. Front Inform Technol Electron Eng, 20(8):1021-1025.

[24]Pan YH, 2021a. Miniaturized five fundamental issues about visual knowledge. Front Inform Technol Electron Eng, 22(5):615-618.

[25]Pan YH, 2021b. On visual understanding. Front Inform Technol Electron Eng, early access.

[26]Park JJ, Florence P, Straub J, et al., 2019. DeepSDF: learning continuous signed distance functions for shape representation. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.165-174.

[27]Paschalidou D, Katharopoulos A, Geiger A, et al., 2021. Neural parts: learning expressive 3D shape abstractions with invertible neural networks. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.3204-3215.

[28]Qi CR, Su H, Mo KC, et al., 2017. PointNet: deep learning on point sets for 3D classification and segmentation. IEEE Conf on Computer Vision and Pattern Recognition, p.77-85.

[29]Riegler G, Ulusoy AO, Geiger A, 2017. OctNet: learning deep 3D representations at high resolutions. IEEE Conf on Computer Vision and Pattern Recognition, p.6620-6629.

[30]Sinha A, Bai J, Ramani K, 2016. Deep learning 3D shape surfaces using geometry images. Proc 14th European Conf on Computer Vision, p.223-240.

[31]Su H, Maji S, Kalogerakis E, et al., 2015. Multi-view convolutional neural networks for 3D shape recognition. IEEE Int Conf on Computer Vision, p.945-953.

[32]Sun CY, Zou QF, Tong X, et al., 2019. Learning adaptive hierarchical cuboid abstractions of 3D shape collections. ACM Trans Graph, 38(6):241.

[33]Tulsiani S, Su H, Guibas LJ, et al., 2017. Learning shape abstractions by assembling volumetric primitives. IEEE Conf on Computer Vision and Pattern Recognition, p.1466-1474.

[34]Wang NY, Zhang YD, Li ZW, et al., 2018. Pixel2Mesh: generating 3D mesh models from single RGB images. Proc 15th European Conf on Computer Vision, p.55-71.

[35]Wang PS, Liu Y, Guo YX, et al., 2017. O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Trans Graph, 36(4):72.

[36]Wang PS, Liu Y, Tong X, 2022. Dual octree graph networks for learning adaptive volumetric shape representations. ACM Trans Graph, 41(4):103.

[37]Wen C, Zhang YD, Li ZW, et al., 2019. Pixel2Mesh++: multi-view 3D mesh generation via deformation. IEEE/CVF Int Conf on Computer Vision, p.1042-1051.

[38]Wu ZR, Song SR, Khosla A, et al., 2015. 3D ShapeNets: a deep representation for volumetric shapes. Proc IEEE Conf on Computer Vision and Pattern Recognition, p.1912-1920.

[39]Xiao YP, Lai YK, Zhang FL, et al., 2020. A survey on deep geometry learning: from a representation perspective. Comput Visual Med, 6(2):113-133.

[40]Yang J, Mo KC, Lai YK, et al., 2023. DSG-Net: learning disentangled structure and geometry for 3D shape generation. ACM Trans Graph, 42(1):1.

[41]Yang KZ, Chen XJ, 2021. Unsupervised learning for cuboid shape abstraction via joint segmentation from point clouds. ACM Trans Graph, 40(4):152.

[42]Yu FG, Liu K, Zhang Y, et al., 2019. PartNet: a recursive part decomposition network for fine-grained and hierarchical shape segmentation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.9483-9492.

[43]Yu LQ, Li XZ, Fu CW, et al., 2018. PU-Net: point cloud upsampling network. IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.2790-2799.

[44]Zheng XY, Liu Y, Wang PS, et al., 2022. SDF-StyleGAN: implicit SDF-based StyleGAN for 3D shape generation. https://arxiv.org/abs/2206.12055

[45]Zheng ZR, Yu T, Dai QH, et al., 2021. Deep implicit templates for 3D shape representation. Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition, p.1429-1439.

[46]Zuffi S, Kanazawa A, Jacobs DW, et al., 2017. 3D Menagerie: modeling the 3D shape and pose of animals. IEEE Conf on Computer Vision and Pattern Recognition, p.5524-5532.
