Monday, December 29, 2008

Automated Protocol Format Recovery and Automated Protocol Control Flow Recovery

Two recent (2008) research results, "Automatic Protocol Format Reverse Engineering through Context-Aware Monitored Execution" and "Automatic Network Protocol Analysis", both look like promising extensions of the work done for Microsoft Discoverer and the (apparently defunct) Protocol Informatics approach to protocol format recognition. A circa 2007 paper titled "Polyglot: Automatic Extraction of Protocol Message Format Using Dynamic Binary Analysis" also describes an approach to protocol format recovery. This is interesting to me because I did some work on automated recovery of control flows (see).

It might be interesting to combine the two (automated format recovery and automated control flow recovery). Unfortunately, it appears that the approach used in Microsoft Discoverer is going to be patented.

An overarching approach is what I call Dynamic Protocol Reverse Engineering (DPRE). This idea is derived from protocol conformance checking. I'd like to end up with an automated/semi-automated processing pipeline. An overview of a DPRE Framework is depicted here:
Conceptual DPRE Framework


Above, we have some of the groundwork built for automated model recovery. First, we collect a "pile-of-packets" for the protocol implementation under inspection, then process the packets to recover an implementation model. Next, the recovered models could be processed by automated verification tools to check their performance characteristics. A list of several formal verification methods and tools is located here. After model generation and model verification, we could perform vulnerability assessment and generation of targeted effects (i.e., exploits). Finally, the targeted effects could be used to test a protocol implementation in a hostile environment.
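
To make the model-recovery and verification stages a bit more concrete, here is a minimal Python sketch. It is not the DPRE implementation, just an illustration under some loud assumptions: the packets have already been grouped into sessions and labeled with message types (that is the format-recovery problem, which is skipped here), a "state" is simply the last message type seen, and the "verification" is only a reachability check for states where a session stops without a proper goodbye. The message names (HELLO, AUTH, and so on) and the function names are made up for the example.

```python
from collections import defaultdict

START = "START"  # hypothetical synthetic initial state


def recover_model(sessions, start=START):
    """Build a transition map (state -> set of next states) from traces.

    Assumption: each session is a list of already-labeled message types,
    and the state is approximated by the last message type observed.
    """
    transitions = defaultdict(set)
    for session in sessions:
        prev = start
        for msg_type in session:
            transitions[prev].add(msg_type)
            prev = msg_type
    return dict(transitions)


def reachable(transitions, start=START):
    """Return every state reachable from the start state (depth-first)."""
    seen = {start}
    stack = [start]
    while stack:
        state = stack.pop()
        for nxt in transitions.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def dead_ends(transitions, accepting, start=START):
    """Reachable states with no outgoing transition that are not accepting."""
    return {
        state
        for state in reachable(transitions, start)
        if state not in transitions and state not in accepting
    }


# A toy "pile-of-packets", already reduced to message-type sequences.
sessions = [
    ["HELLO", "AUTH", "DATA", "BYE"],
    ["HELLO", "AUTH", "DATA", "DATA", "BYE"],
    ["HELLO", "AUTH_FAIL"],
]

model = recover_model(sessions)
suspicious = dead_ends(model, accepting={"BYE"})
print(model)
print(suspicious)
```

With these toy traces, the only state flagged is AUTH_FAIL, since the session ends there without reaching BYE. In a real pipeline the recovered model would more likely be exported to a proper model checker than inspected with a hand-rolled search, and flagged states like this one would be candidates for the vulnerability assessment stage.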