CLC number: TP391.41
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2023-10-13
Shanshan HUANG, Yuanhao WANG, Zhili GONG, Jun LIAO, Shu WANG, Li LIU. Controllable image generation based on causal representation learning[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2300303
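Controllable image generation based on causal representation learning
1 School of Big Data and Software Engineering, Chongqing University, Chongqing 401331, China
2 School of Materials and Energy, Southwest University, Chongqing 400715, China
Abstract: Artificial intelligence generated content (AIGC) has become an indispensable tool for producing large-scale content in many forms, and it plays a particularly important role in image generation and editing. However, the interpretability and controllability of image generation and editing remain a challenge. Because existing AI methods ignore the causal relationships within an image, they often struggle to generate images that are both flexible and controllable. To address this problem, this paper develops a novel causally controllable image generation method that combines causal representation learning with bidirectional generative adversarial networks. The key to the method is a causal structure learning module that learns the causal relationships among image attributes and is jointly optimized with the encoder, generator, and joint discriminator of the image generation module. With this approach, causal representations can be learned in the latent space of images, which enables causally controllable image editing, and counterfactual images can be generated through causal intervention operations. Finally, extensive experiments are conducted on the real-world dataset CelebA. The experimental results demonstrate the soundness and effectiveness of the proposed method.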
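The causal intervention step mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a linear structural causal model over three attribute variables, and the graph, edge weights, and function names are all hypothetical. It shows the usual three steps of a counterfactual query (recover the noise, apply the intervention, recompute the downstream attributes). In a full model, the resulting latent would presumably be passed to a generator to render the counterfactual image, which is omitted here.

```python
from typing import Dict, List, Optional

# Hypothetical causal graph over three face attributes (illustration only):
#   smile -> mouth_open, smile -> narrow_eyes
ORDER: List[str] = ["smile", "mouth_open", "narrow_eyes"]  # topological order
WEIGHTS: Dict[str, Dict[str, float]] = {
    "smile": {},                      # root node, no parents
    "mouth_open": {"smile": 0.8},     # assumed edge weight
    "narrow_eyes": {"smile": 0.5},    # assumed edge weight
}


def forward(noise: Dict[str, float],
            do: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Compute attributes from exogenous noise under the linear SCM.

    Any variable listed in `do` is clamped to the given value, which cuts
    its incoming edges (the do-operation).
    """
    do = do or {}
    z: Dict[str, float] = {}
    for name in ORDER:
        if name in do:
            z[name] = do[name]
        else:
            parents = WEIGHTS[name]
            z[name] = noise[name] + sum(w * z[p] for p, w in parents.items())
    return z


def abduct(z: Dict[str, float]) -> Dict[str, float]:
    """Recover the exogenous noise that explains the observed attributes."""
    return {
        name: z[name] - sum(w * z[p] for p, w in WEIGHTS[name].items())
        for name in ORDER
    }


def counterfactual(z_obs: Dict[str, float],
                   do: Dict[str, float]) -> Dict[str, float]:
    """Abduction, action, prediction: keep the noise, change the cause."""
    return forward(abduct(z_obs), do)


if __name__ == "__main__":
    observed = forward({"smile": 1.0, "mouth_open": 0.2, "narrow_eyes": -0.1})
    # Intervening on the cause changes its effects.
    print(counterfactual(observed, {"smile": 0.0}))
    # Intervening on an effect leaves the cause and sibling effects unchanged.
    print(counterfactual(observed, {"mouth_open": 0.0}))
```

The asymmetry between the two interventions is the point of the example: editing a cause propagates to its effects, whereas editing an effect does not flow back to the cause.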
Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou
310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2025 Journal of Zhejiang University-SCIENCE