Abstract: Inferring protocol state machines from observable information is a significant challenge in protocol reverse engineering (PRE), especially when passively collected traffic suffers from message loss and therefore covers only part of the protocol state space. This paper introduces Act_Infer, a method for actively inferring protocol state machines within the minimally adequate teacher (MAT) framework. Using session completion and deterministic mutation, the method broadens the set of protocol messages and thereby constructs a more comprehensive input space for the protocol state machine from an incomplete message domain. The efficiency of active inference is further improved through several optimizations: traffic deduplication, construction of an Expanded Prefix Tree Acceptor (EPTA), response-based query optimization, and random counterexample generation. Experiments on the RTSP and SMTP protocols, using multiple versions of the Live555 and EXIM implementations, show that this approach yields more comprehensive protocol state machines with improved execution efficiency. Compared to the algorithm implemented by AALpy, Act_Infer reduces execution time by 40% on average and reduces connection and interaction times by 25% and 50%, respectively.
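To make the traffic deduplication and prefix tree steps concrete, the following is a minimal sketch of how captured sessions might be deduplicated and folded into a plain prefix tree. It is not the paper's EPTA construction, whose expansion rules are not given in the abstract. The function names `deduplicate` and `build_prefix_tree`, the session representation as (request, response) pairs, and the RTSP-like traces are all illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a session is a sequence of
# (abstract request, abstract response) pairs.
Exchange = Tuple[str, str]
Session = Tuple[Exchange, ...]


def deduplicate(sessions: List[List[Exchange]]) -> List[Session]:
    """Drop sessions whose message sequence was already seen, keeping first-seen order."""
    seen = set()
    unique: List[Session] = []
    for session in sessions:
        key = tuple(session)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def build_prefix_tree(sessions: List[Session]) -> Dict[int, Dict[str, Tuple[str, int]]]:
    """Build a prefix tree rooted at state 0.

    transitions[state][request] = (response, next_state).
    If the same request is seen again from the same state, the first
    recorded response is kept (this sketch assumes deterministic traces).
    """
    transitions: Dict[int, Dict[str, Tuple[str, int]]] = {0: {}}
    for session in sessions:
        state = 0
        for request, response in session:
            if request not in transitions[state]:
                new_state = len(transitions)
                transitions[new_state] = {}
                transitions[state][request] = (response, new_state)
            state = transitions[state][request][1]
    return transitions


# Made-up RTSP-like traces for illustration only.
traces = [
    [("OPTIONS", "200"), ("DESCRIBE", "200"), ("SETUP", "200")],
    [("OPTIONS", "200"), ("DESCRIBE", "200"), ("SETUP", "200")],  # exact duplicate
    [("OPTIONS", "200"), ("SETUP", "200"), ("PLAY", "200")],
]
unique = deduplicate(traces)
tree = build_prefix_tree(unique)
```

Sessions that share a prefix share states in the tree, so a learner seeded with it would not need to replay those prefixes against the target server to recover responses it has already observed.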