Frontiers of Information Technology & Electronic Engineering
ISSN 2095-9184 (print), ISSN 2095-9230 (online)
2023 Vol.24 No.7 P.1007-1027
Explainable data transformation recommendation for automatic visualization
Abstract: Automatic visualization generates meaningful visualizations to support data analysis and pattern finding for novice or casual users who are not familiar with visualization design. Current automatic visualization approaches rely mainly on aggregation and filtering to extract patterns from the original data. However, these limited data transformations fail to capture complex patterns such as clusters and correlations. Although recent advances in feature engineering open up the potential for more kinds of automatic data transformations, the auto-generated transformations lack explainability: it is unclear how the resulting patterns are connected with the original features. To tackle these challenges, we propose a novel explainable recommendation approach for an extended range of data transformations in automatic visualization. We summarize the space of feasible data transformations through a literature review and derive measures of the explainability of transformation operations from a pilot study. A recommendation algorithm is designed to compute optimal transformations, which reveal specified types of patterns while maintaining explainability. We demonstrate the effectiveness of our approach through two cases and a user study.
Key words: Data transformation; Data transformation recommendation; Automatic visualization; Explainability
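1 State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, China
2 Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
3 School of Computer Science, Central South University, Changsha 410083, China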
DOI: 10.1631/FITEE.2200409
CLC number: TP391
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2022-12-12