CLC number: TP309.5

On-line Access: 2019-07-08

Received: 2018-08-31

Revision Accepted: 2019-01-05

Crosschecked: 2019-06-11

ORCID: Bing-lin Zhao, http://orcid.org/0000-0001-5948-9195

Frontiers of Information Technology & Electronic Engineering  2019 Vol.20 No.6 P.801-815

http://doi.org/10.1631/FITEE.1800523


Malware homology identification based on a gene perspective


Author(s):  Bing-lin Zhao, Zheng Shan, Fu-dong Liu, Bo Zhao, Yi-hang Chen, Wen-jie Sun

Affiliation(s):  State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China

Corresponding email(s):   gzu_zhaobl@126.com, zzzhengming@163.com, lwfydy@126.com

Key Words:  Malware classification, Gene perspective, Dependency graph, Homology analysis


Bing-lin Zhao, Zheng Shan, Fu-dong Liu, Bo Zhao, Yi-hang Chen, Wen-jie Sun. Malware homology identification based on a gene perspective[J]. Frontiers of Information Technology & Electronic Engineering, 2019, 20(6): 801-815.

@article{title="Malware homology identification based on a gene perspective",
author="Bing-lin Zhao, Zheng Shan, Fu-dong Liu, Bo Zhao, Yi-hang Chen, Wen-jie Sun",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="20",
number="6",
pages="801-815",
year="2019",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1800523"
}

%0 Journal Article
%T Malware homology identification based on a gene perspective
%A Bing-lin Zhao
%A Zheng Shan
%A Fu-dong Liu
%A Bo Zhao
%A Yi-hang Chen
%A Wen-jie Sun
%J Frontiers of Information Technology & Electronic Engineering
%V 20
%N 6
%P 801-815
%@ 2095-9184
%D 2019
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1800523

TY  - JOUR
T1  - Malware homology identification based on a gene perspective
A1  - Bing-lin Zhao
A1  - Zheng Shan
A1  - Fu-dong Liu
A1  - Bo Zhao
A1  - Yi-hang Chen
A1  - Wen-jie Sun
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 20
IS  - 6
SP  - 801
EP  - 815
SN  - 2095-9184
Y1  - 2019
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.1800523
ER  -


Abstract: 
Malware homology identification is important for tracing attack events, generating emergency response schemes, and predicting event trends. Current malware homology identification methods still rely on manual analysis, which is inefficient and cannot respond quickly to outbreaks of attacks. To address these problems, we propose a new malware homology identification method from a gene perspective. A malware gene is represented by a subgraph that describes the homology of malware families. We extract key subgraphs from the function dependency graph as malware genes by selecting key application programming interfaces (APIs) and applying a community partition algorithm. We then encode the genes and design a frequent subgraph mining algorithm to find the common genes of malware families. Finally, we use these family genes to guide homology-based malware identification. We evaluate our method on a public dataset, and the experimental results show that malware classification accuracy reaches 97% with high efficiency.

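Title (translated from the Chinese): Malware homology determination based on a gene perspective

Abstract (translated from the Chinese): Malware homology determination plays an important role in tracing the source of attack events, handling emergency response plans, and predicting event trends. At present, malware homology determination relies mainly on manual analysis, which is inefficient and cannot respond quickly to outbreaks of security incidents. We therefore propose a new malware homology determination method that takes a gene perspective. A malware gene consists of subgraphs that represent family homology. Key subgraphs are extracted from the function dependency graph as malware genes by filtering key application programming interfaces and applying a community partition algorithm. A frequent subgraph mining algorithm is then designed to discover the genes shared by a malware family, and the genes are encoded. Finally, the shared family genes guide malware homology determination. Classification experiments on a public dataset show a classification accuracy of 97% with high efficiency.

Key words (translated from the Chinese): Malware classification; Gene perspective; Function dependency graph; Homology analysis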
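
The pipeline summarized in the abstract (select key APIs, partition the dependency graph into subgraphs, encode each subgraph as a gene, mine the genes that recur within a family, then classify by shared genes) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a dependency graph given as a dictionary from a node to the set of nodes that depend on it, uses a hypothetical KEY_APIS list, substitutes weakly connected components for the community partition algorithm, and substitutes simple support counting over encoded genes for frequent subgraph mining. The names extract_genes, family_genes, and classify are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical key-API list; the abstract does not say how key APIs are chosen.
KEY_APIS = {"CreateFileA", "WriteFile", "connect", "send", "RegSetValueExA"}


def extract_genes(dep_graph, key_apis=KEY_APIS):
    """Return the encoded genes of one sample.

    dep_graph maps a node to the set of nodes that depend on it (assumed
    representation). A gene is encoded as a frozenset of directed edges.
    """
    # Step 1: keep only dependencies between key APIs.
    edges = {(u, v) for u, succs in dep_graph.items() for v in succs
             if u in key_apis and v in key_apis}
    # Step 2: stand-in for community partition: weakly connected components.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    genes, seen = set(), set()
    for start in sorted(adj):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        # Step 3: encode the component as a hashable, order-independent edge set.
        genes.add(frozenset(e for e in edges if e[0] in comp))
    return genes


def family_genes(samples, min_support=0.6):
    """Genes present in at least min_support of a family's samples
    (support counting as a stand-in for frequent subgraph mining)."""
    counts = Counter(g for genes in samples for g in genes)
    return {g for g, c in counts.items() if c / len(samples) >= min_support}


def classify(sample_genes, families):
    """Return the family whose genes the sample covers best, or None."""
    best, best_score = None, 0.0
    for name in sorted(families):  # sorted for deterministic tie-breaking
        fam = families[name]
        score = len(sample_genes & fam) / len(fam) if fam else 0.0
        if score > best_score:
            best, best_score = name, score
    return best


# Toy data, invented for illustration only.
family_a = [extract_genes(g) for g in (
    {"CreateFileA": {"WriteFile"}, "GetTickCount": {"Sleep"}},
    {"CreateFileA": {"WriteFile"}, "connect": {"send"}},
)]
family_b = [extract_genes(g) for g in (
    {"connect": {"send"}},
    {"connect": {"send"}, "GetTickCount": {"Sleep"}},
)]
families = {"A": family_genes(family_a), "B": family_genes(family_b)}
unknown = extract_genes({"CreateFileA": {"WriteFile"}, "RegSetValueExA": set()})
predicted = classify(unknown, families)
print(predicted)
```

Encoding a gene as a frozenset of edges makes it hashable, so support counting and set intersection work directly. A real system would need a canonical graph encoding that tolerates renamed or reordered functions, which this sketch does not attempt.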
