Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

Active inference of protocol state machines from incomplete message domains

Abstract: Inferring protocol state machines from observable information is a significant challenge in protocol reverse engineering (PRE), especially when passively collected traffic suffers from message loss, which leaves the protocol state space incomplete. This paper introduces a method for actively inferring protocol state machines within the minimally adequate teacher (MAT) framework. By incorporating session completion and deterministic mutation techniques, the method broadens the range of protocol messages and thereby constructs a more comprehensive input space for the protocol state machine from an incomplete message domain. In addition, the efficiency of active inference is improved through several optimizations of the LM+ algorithm: traffic deduplication, construction of an expanded prefix tree acceptor (EPTA), response-based query optimization, and random counterexample generation. Experiments on the real-time streaming protocol (RTSP) and the simple mail transfer protocol (SMTP), using multiple versions of the Live555 and Exim implementations, demonstrate that the method yields more comprehensive protocol state machines with improved execution efficiency. Compared to the LM+ algorithm implemented by AALpy, the proposed method, Act_Infer, reduces execution time by approximately 40.7% on average and reduces the number of connections and interactions by approximately 28.6% and 46.6%, respectively.
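As a rough illustration of two of the optimizations named in the abstract, traffic deduplication and prefix tree construction, the following minimal Python sketch may help. It is not the paper's EPTA and not AALpy code: the `Node` class, the helper names, and the sample RTSP sessions (including their response codes) are all hypothetical and chosen only for illustration. It assumes each session has already been abstracted into a sequence of (input, output) symbol pairs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Maps an input symbol to (observed output symbol, child Node).
    children: dict = field(default_factory=dict)


def deduplicate(sessions):
    """Drop repeated sessions, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for session in sessions:
        key = tuple(session)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def build_prefix_tree(sessions):
    """Merge sessions that share a common prefix into a single tree."""
    root = Node()
    for session in sessions:
        node = root
        for inp, out in session:
            if inp in node.children:
                known_out, child = node.children[inp]
                if known_out != out:
                    # Same input prefix, different response: the traces are
                    # not consistent with a deterministic Mealy machine.
                    raise ValueError(f"conflicting outputs for input {inp!r}")
            else:
                child = Node()
                node.children[inp] = (out, child)
            node = child
    return root


def count_nodes(node):
    """Count the nodes of the tree, including the root."""
    return 1 + sum(count_nodes(child) for _, child in node.children.values())


# Hypothetical abstracted RTSP sessions (illustrative, not captured traffic).
sessions = [
    [("OPTIONS", "200"), ("DESCRIBE", "200"), ("SETUP", "200")],
    [("OPTIONS", "200"), ("DESCRIBE", "200"), ("SETUP", "200")],  # duplicate
    [("OPTIONS", "200"), ("SETUP", "200")],
]

unique_sessions = deduplicate(sessions)
tree = build_prefix_tree(unique_sessions)
```

In a sketch like this, deduplication shrinks the set of traces before any tree is built, and the shared `OPTIONS` prefix is stored only once, which is the kind of saving that could plausibly reduce the number of queries a learner needs to send to the target implementation.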

Key words: Protocol reverse engineering (PRE); Protocol state machine; Active inference; Incomplete message domains; Input space

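Chinese Summary (translated): Active inference of protocol state machines based on incomplete message domains

Maohua GUO, Yuefei ZHU, Jinlong FEI
Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou 450001, China
Abstract: Inferring protocol state machines from observable information is a major challenge in protocol reverse engineering (PRE), particularly when missing messages in passively collected traffic leave the protocol state space incomplete. Based on the minimally adequate teacher (MAT) framework, this paper proposes a new method for the active inference of protocol state machines. By combining session completion with deterministic mutation, the method extends the set of protocol message types and thus builds a more comprehensive input space for the protocol state machine from an incomplete message domain. In addition, the efficiency of active inference is improved by optimizing the algorithm, including traffic deduplication, construction of an expanded prefix tree acceptor (EPTA), response-based query optimization, and random counterexample generation based on state transitions. Experiments on the real-time streaming protocol (RTSP) and the simple mail transfer protocol (SMTP), using multiple versions of the Live555 and Exim implementations, show that the method infers more complete protocol state machines with higher execution efficiency. Compared with the algorithm implemented by AALpy, Act_Infer reduces execution time by approximately 40.7% on average and reduces the number of connections and interactions by approximately 28.6% and 46.6%, respectively.

Key words: Protocol reverse engineering (PRE); Protocol state machine; Active inference; Incomplete message domains; Input space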


DOI: 10.1631/FITEE.2400487
CLC number: TP393


On-line Access: 2026-01-09
Received: 2024-06-06
Revision Accepted: 2024-09-06
Crosschecked: 2026-01-11

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE