JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

2024 Vol.25 No.1 P.135-148

Controllable image generation based on causal representation learning

Shanshan HUANG, Yuanhao WANG, Zhili GONG, Jun LIAO, Shu WANG, Li LIU

School of Big Data and Software Engineering, Chongqing University, Chongqing 401331, China; School of Materials and Energy, Southwest University, Chongqing 400715, China

shanshanhuang@cqu.edu.cn, dcsliuli@cqu.edu.cn

Abstract: Artificial intelligence generated content (AIGC) has emerged as an indispensable tool for producing large-scale content in various forms, such as images, thanks to the significant role that AI plays in imitation and production. However, interpretability and controllability remain challenges. Existing AI methods often struggle to produce images that are both flexible and controllable while respecting the causal relationships within the images. To address this issue, we have developed a novel method for causal controllable image generation (CCIG) that combines causal representation learning with bi-directional generative adversarial networks (GANs). This approach lets humans control image attributes while preserving the rationality and interpretability of the generated images, and it also allows the generation of counterfactual images. The key to CCIG is a causal structure learning module that learns the causal relationships between image attributes and is optimized jointly with the encoder, generator, and joint discriminator of the image generation module. In this way, we learn causal representations in the latent space of images and use causal intervention operations to control image generation. We conduct extensive experiments on a real-world dataset, CelebA. The experimental results illustrate the effectiveness of CCIG.

Key words: Image generation; Controllable image editing; Causal structure learning; Causal representation learning
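
The two ideas named in the abstract, learning a causal graph over image attributes and steering generation through causal intervention, can be illustrated with a small toy sketch. This is not the CCIG implementation: it assumes a linear structural causal model over three latent attribute factors, the attribute names and edge weights are hypothetical, and the acyclicity score is the polynomial form used in DAG-GNN-style structure learning rather than anything confirmed by this page. The encoder, generator, and joint discriminator are omitted entirely.

```python
import numpy as np


def acyclicity(A):
    """Polynomial acyclicity score tr[(I + A*A/d)^d] - d.

    The score is zero when the weighted adjacency matrix A describes a DAG
    and positive when A contains a directed cycle.
    """
    d = A.shape[0]
    M = np.eye(d) + (A * A) / d
    return np.trace(np.linalg.matrix_power(M, d)) - d


def sample_latents(A, eps, do=None):
    """Evaluate a linear SCM z_j = sum_i A[i, j] * z_i + eps_j.

    A[i, j] != 0 means attribute i is a cause of attribute j.
    `do` maps an attribute index to a clamped value (a do-intervention).
    Assumes A is acyclic, so d sweeps are enough to reach the fixed point.
    """
    do = do or {}
    d = A.shape[0]
    z = np.zeros(d)
    for _ in range(d):
        z = A.T @ z + eps
        for idx, value in do.items():
            z[idx] = value  # clamping cuts the influence of the node's parents
    return z


# Hypothetical attributes: 0 = smiling, 1 = mouth open, 2 = narrow eyes.
A = np.zeros((3, 3))
A[0, 1] = 0.8  # smiling -> mouth open (assumed weight)
A[0, 2] = 0.5  # smiling -> narrow eyes (assumed weight)

eps = np.array([1.0, 0.2, -0.1])  # exogenous noise for one image

factual = sample_latents(A, eps)
# Intervening on a cause changes its effects.
no_smile = sample_latents(A, eps, do={0: -1.0})
# Intervening on an effect leaves its cause and siblings untouched.
mouth_open = sample_latents(A, eps, do={1: 3.0})

print("acyclicity score:", acyclicity(A))
print("factual latents:", factual)
print("do(smiling = -1):", no_smile)
print("do(mouth open = 3):", mouth_open)
```

In a full model, the intervened latent vector would be passed to a generator to render the counterfactual image, and the acyclicity score would act as a constraint or penalty while the graph is learned jointly with the generative networks.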

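Chinese Summary: Study on the association between heme oxygenase 1 and the inactivation of subchondral osteoclasts in osteoarthritis

Miao CHU¹,², Guangdong CHEN¹, Kai CHEN¹,³, Pengfei ZHU¹, Zhen WANG⁴, Zhonglai QIAN¹, Huaqiang TAO¹, Yaozeng XU¹, Dechun GENG¹
¹Department of Orthopedics, the First Affiliated Hospital of Soochow University, Suzhou 215006, China
²Department of Orthopedics, Yixing People's Hospital, Yixing 214299, China
³Department of Orthopedics, Hai'an People's Hospital, Hai'an 226600, China
⁴Department of Orthopedics, Suzhou Kowloon Hospital, Shanghai Jiao Tong University School of Medicine, Suzhou 215028, China
Abstract: Osteoarthritis (OA) is a chronic, progressive bone and joint disease of the elderly. Osteoclast activation plays a crucial role in the subchondral bone loss of early OA. However, the specific mechanism of osteoclast differentiation in OA remains unclear. In this study, gene expression profiles related to OA progression and osteoclast activation were screened from the Gene Expression Omnibus (GEO). The GEO2R and FunRich analysis tools were used to find differentially expressed genes (DEGs). Enrichment analysis showed that chemical carcinogenesis, reactive oxygen species, and the response to oxidative stress are mainly involved in osteoclast differentiation in OA subchondral bone. In addition, 14 DEGs related to oxidative stress were identified. The top-ranked differential gene, heme oxygenase 1 (HMOX1), was selected for further validation. The results showed that osteoclast activation in OA subchondral bone is accompanied by downregulation of HMOX1. In vitro, carnosol was found to inhibit osteoclast formation by targeting HMOX1 and upregulating the expression of antioxidant proteins. In vivo, carnosol alleviated the severity of OA by inhibiting the activation of subchondral osteoclasts. In summary, osteoclast activation caused by redox imbalance in subchondral bone is an important pathway in OA progression. Targeting HMOX1 in subchondral osteoclasts may provide new insights for the treatment of early OA.

Key words: Osteoclast; Oxidative stress; Osteoarthritis (OA); Heme oxygenase 1 (HMOX1); Carnosol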




DOI: 10.1631/FITEE.2300303

CLC number: TP391.41


On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2023-10-13

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE
