CLC number: TP309.5
On-line Access: 2019-07-08
Received: 2018-08-31
Revision Accepted: 2019-01-05
Crosschecked: 2019-06-11
Bing-lin Zhao, Zheng Shan, Fu-dong Liu, Bo Zhao, Yi-hang Chen, Wen-jie Sun. Malware homology identification based on a gene perspective[J]. Frontiers of Information Technology & Electronic Engineering, 2019, 20(6): 801-815.
@article{title="Malware homology identification based on a gene perspective",
author="Bing-lin Zhao, Zheng Shan, Fu-dong Liu, Bo Zhao, Yi-hang Chen, Wen-jie Sun",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="20",
number="6",
pages="801-815",
year="2019",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1800523"
}
%0 Journal Article
%T Malware homology identification based on a gene perspective
%A Bing-lin Zhao
%A Zheng Shan
%A Fu-dong Liu
%A Bo Zhao
%A Yi-hang Chen
%A Wen-jie Sun
%J Frontiers of Information Technology & Electronic Engineering
%V 20
%N 6
%P 801-815
%@ 2095-9184
%D 2019
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1800523
TY - JOUR
T1 - Malware homology identification based on a gene perspective
A1 - Bing-lin Zhao
A1 - Zheng Shan
A1 - Fu-dong Liu
A1 - Bo Zhao
A1 - Yi-hang Chen
A1 - Wen-jie Sun
JO - Frontiers of Information Technology & Electronic Engineering
VL - 20
IS - 6
SP - 801
EP - 815
SN - 2095-9184
Y1 - 2019
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1800523
ER -
Abstract: Malware homology identification is important for tracing attack events, generating emergency response schemes, and predicting event trends. Current malware homology identification methods still rely on manual analysis, which is inefficient and cannot respond quickly to outbreaks of attack events. To address these problems, we propose a new malware homology identification method from a gene perspective. A malware gene is represented by a subgraph that describes the homology of malware families. We extract key subgraphs from the function dependency graph as malware genes by selecting key application programming interfaces (APIs) and applying a community partition algorithm. We then encode the genes and design a frequent subgraph mining algorithm to find the common genes of malware families. Finally, we use the family genes to guide homology-based malware identification. We evaluate our method on a public dataset, and the experimental results show that malware classification accuracy reaches 97% with high efficiency.
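The pipeline summarized in the abstract (extract genes around key APIs, encode them, mine the genes shared within a family, then identify by family genes) can be illustrated with a toy sketch. This is not the authors' implementation: the key API list, the function names, and the one-hop neighbourhood used in place of the community partition algorithm are all assumptions made for illustration, and simple support counting stands in for frequent subgraph mining.

```python
from collections import Counter

# Hypothetical "key" APIs; the abstract does not give the selection criterion.
KEY_APIS = {"CreateRemoteThread", "WriteProcessMemory", "RegSetValueExA"}


def extract_genes(graph):
    """Return one encoded gene per function that calls a key API.

    `graph` maps a function name to the list of names it calls.
    A one-hop star around the caller stands in for community partition.
    """
    genes = set()
    for callees in graph.values():
        if KEY_APIS.intersection(callees):
            # Local functions (keys of `graph`) are relabelled "sub" so that
            # sample-specific names do not affect the encoding.
            labels = {"sub" if c in graph else c for c in callees}
            genes.add(tuple(sorted(labels)))
    return genes


def family_genes(samples, min_support=1.0):
    """Keep genes present in at least `min_support` of the family's samples."""
    counts = Counter(g for s in samples for g in extract_genes(s))
    return {g for g, n in counts.items() if n / len(samples) >= min_support}


def identify(sample, families):
    """Return the family whose gene set overlaps most with the sample's genes."""
    genes = extract_genes(sample)
    return max(families, key=lambda name: len(genes & families[name]))


# Toy function dependency graphs (invented data).
sample_a = {
    "sub_1": ["OpenProcess", "WriteProcessMemory", "CreateRemoteThread"],
    "sub_2": ["sub_1", "RegSetValueExA"],
}
sample_b = {
    "sub_9": ["OpenProcess", "WriteProcessMemory", "CreateRemoteThread"],
    "sub_7": ["MessageBoxA"],
}
injection = ("CreateRemoteThread", "OpenProcess", "WriteProcessMemory")
families = {
    "injector": family_genes([sample_a, sample_b]),
    "persist": {("RegSetValueExA", "sub")},
}
unknown = {"sub_3": ["WriteProcessMemory", "OpenProcess", "CreateRemoteThread"]}
print(identify(unknown, families))
```

Encoding a gene as a sorted tuple of labels keeps the comparison independent of sample-specific function names, which is the property any real gene encoding would also need; a production system would preserve edge structure rather than flatten it as done here.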