CLC number: TP31
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2023-11-26
https://orcid.org/0000-0003-0177-0157
Tianrun CHEN, Runlong CAO, Zejian LI, Ying ZANG, Lingyun SUN. Deep3DSketch-im: rapid high-fidelity AI 3D model generation by single freehand sketches[J]. Frontiers of Information Technology & Electronic Engineering, 2024, 25(1): 149-159.
@article{FITEE.2300314,
title="Deep3DSketch-im: rapid high-fidelity AI 3D model generation by single freehand sketches",
author="Tianrun CHEN and Runlong CAO and Zejian LI and Ying ZANG and Lingyun SUN",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="25",
number="1",
pages="149-159",
year="2024",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2300314"
}
%0 Journal Article
%T Deep3DSketch-im: rapid high-fidelity AI 3D model generation by single freehand sketches
%A Tianrun CHEN
%A Runlong CAO
%A Zejian LI
%A Ying ZANG
%A Lingyun SUN
%J Frontiers of Information Technology & Electronic Engineering
%V 25
%N 1
%P 149-159
%@ 2095-9184
%D 2024
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2300314
TY  - JOUR
T1  - Deep3DSketch-im: rapid high-fidelity AI 3D model generation by single freehand sketches
A1  - Tianrun CHEN
A1  - Runlong CAO
A1  - Zejian LI
A1  - Ying ZANG
A1  - Lingyun SUN
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 25
IS  - 1
SP  - 149
EP  - 159
SN  - 2095-9184
Y1  - 2024
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2300314
ER  -
Abstract: The rise of artificial intelligence generated content (AIGC) has been remarkable in the language and image fields, but artificial intelligence (AI) generated three-dimensional (3D) models remain under-explored because of their complex nature and the lack of training data. The conventional approach of creating 3D content through computer-aided design (CAD) is labor-intensive and requires expertise, which makes it challenging for novice users. To address this issue, we propose a sketch-based 3D modeling approach, Deep3DSketch-im, which uses a single freehand sketch for modeling. This is a challenging task because of the sparsity and ambiguity of sketches. Deep3DSketch-im uses a novel data representation called the signed distance field (SDF) to improve the sketch-to-3D model process by incorporating an implicit continuous field instead of voxels or points, together with a specially designed neural network that can capture point and local features. Extensive experiments demonstrate the effectiveness of the approach, which achieves state-of-the-art (SOTA) performance on both synthetic and real datasets. Additionally, a user study shows that users are more satisfied with the results generated by Deep3DSketch-im. We believe that Deep3DSketch-im has the potential to revolutionize the process of 3D modeling by providing an intuitive and easy-to-use solution for novice users.
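To make the contrast between an implicit continuous field and a voxel grid concrete, the following is a minimal illustrative sketch and not the Deep3DSketch-im network. It assumes an analytic unit sphere as the shape and the common sign convention that the distance is negative inside the surface and positive outside; the function names `sphere_sdf` and `occupancy_grid` are hypothetical and chosen only for this example.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to a sphere.

    Assumed convention: negative inside, zero on the surface, positive outside.
    """
    return math.dist(point, center) - radius


def occupancy_grid(sdf, resolution, lo=-1.5, hi=1.5):
    """Discretize an SDF into a cubic voxel grid of booleans.

    A voxel is marked occupied when the SDF at its center is <= 0.
    The grid fixes the resolution once; the SDF itself does not.
    """
    step = (hi - lo) / resolution
    centers = [lo + (i + 0.5) * step for i in range(resolution)]
    return [
        [[sdf((x, y, z)) <= 0.0 for z in centers] for y in centers]
        for x in centers
    ]


# The continuous field can be queried at any 3D location.
inside = sphere_sdf((0.0, 0.0, 0.0))   # -1.0
surface = sphere_sdf((1.0, 0.0, 0.0))  # 0.0
outside = sphere_sdf((2.0, 0.0, 0.0))  # 1.0

# A coarse 4x4x4 voxelization of the same shape keeps only 8 occupied cells.
grid = occupancy_grid(sphere_sdf, resolution=4)
occupied = sum(cell for plane in grid for row in plane for cell in row)
```

The surface is the zero level set of the field, so it can be sampled as finely as needed, whereas the voxel grid loses everything below its cell size. In a learned setting, the analytic function would be replaced by a network that predicts the signed distance for a query point.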