Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

Multistage guidance on the diffusion model inspired by human artists’ creative thinking

Abstract: Current research on text-conditional image generation achieves performance comparable to that of ordinary painters, but still has much room for improvement when compared with artist-level paintings, which usually represent multilevel semantics by gathering the features of multiple objects into one object. In a preliminary experiment, we confirm this and then seek the opinions of three groups of individuals with varying levels of art appreciation ability to determine the distinctions between painters and artists. We then use these opinions to improve an artificial intelligence (AI) painting system from painter-level image generation toward artist-level image generation. Specifically, we propose a multistage text-conditioned approach that requires no further pretraining to help the diffusion model (DM) move toward multilevel semantic representation in a generated image. Both machine and manual evaluations in the main experiment verify the effectiveness of our approach. In addition, unlike previous one-stage guidance, our method can control the extent to which the features of an object are represented in a painting by controlling the number of guiding steps between the different stages.

Key words: Text-to-image generation; Diffusion model; Multilevel semantics; Multistage guidance
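The abstract does not say how the stages are implemented, so the following is only a minimal sketch of the bookkeeping that stage-wise text guidance implies: each denoising step is assigned the prompt that should condition it, and the share of steps given to a prompt is the knob that would control how strongly that object's features appear. The function names, the prompts, and the 30/20 step split are all hypothetical and are not taken from the paper; a real sampler would pass `schedule[t]` as the text condition at step `t`.

```python
from typing import List, Sequence, Tuple


def build_prompt_schedule(stages: Sequence[Tuple[str, int]]) -> List[str]:
    """Expand (prompt, n_steps) stages into one prompt per denoising step.

    Hypothetical helper: the stage layout is an assumption for illustration.
    """
    schedule: List[str] = []
    for prompt, n_steps in stages:
        if n_steps < 0:
            raise ValueError("n_steps must be non-negative")
        schedule.extend([prompt] * n_steps)
    return schedule


def stage_share(schedule: Sequence[str], prompt: str) -> float:
    """Fraction of denoising steps that are guided by `prompt`."""
    if not schedule:
        return 0.0
    return schedule.count(prompt) / len(schedule)


# Hypothetical two-stage run: 50 denoising steps in total, the first 30
# guided by one object description and the remaining 20 by another.
schedule = build_prompt_schedule(
    [("a painting of a tree", 30), ("a painting of a dancer", 20)]
)
```

Moving the boundary between the two stages (for example 40/10 instead of 30/20) changes only the schedule, not the model, which is consistent with the abstract's claim that no further pretraining is needed.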

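Chinese Summary: Multistage guidance on the diffusion model inspired by artists' creative thinking

齐旺1, 邓晃煌2, 李太豪1
1 Research Center for Cross-Media Intelligence, Zhejiang Lab, Hangzhou 311500, China
2 College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Abstract: Current research on text-to-image generation has reached a level similar to that of ordinary painters, but there is still much room for improvement compared with artist-level painting. Artist-level paintings usually fuse the features of multiple objects into a single object to express multilevel semantic information. In a preliminary experiment, we confirm this and consult three groups with different levels of art appreciation ability to determine how the painting levels of painters and artists differ. We then use these opinions to help an artificial intelligence painting system improve from ordinary-painter-level image generation to artist-level image generation. Specifically, we propose a text-based multistage guidance method that requires no further pretraining and helps the diffusion model move toward multilevel semantic representation in generated images. Both machine and manual evaluations in the experiments verify the effectiveness of the proposed method. In addition, unlike previous single-stage guidance methods, the method can control the degree to which the features of each object are expressed in the painting by controlling the number of guiding steps between stages.

Key words: Text-to-image generation; Diffusion model; Multilevel semantics; Multistage guidance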


DOI: 10.1631/FITEE.2300313


On-line Access: 2024-02-19
Received: 2023-04-30
Revision Accepted: 2024-02-19
Crosschecked: 2023-10-13

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE