From 0 To 0day Quick Fuzzing Lesson
Fuzzing network servers is a technical challenge, since the behavior of the target server depends on its state over a sequence of multiple messages. Existing solutions are costly and difficult to use, as they rely on manually-customized artifacts such as protocol models, protocol parsers, and learning frameworks. The aim of this work is to develop a greybox fuzzer (StateAFL) for network servers that only relies on lightweight analysis of the target program, with no manual customization, in a similar way to what the AFL fuzzer achieved for stateless programs. The proposed fuzzer instruments the target server at compile-time, to insert probes on memory allocations and network I/O operations. At run-time, it infers the current protocol state of the target server by taking snapshots of long-lived memory areas, and by applying a fuzzy hashing algorithm (Locality-Sensitive Hashing) to map memory contents to a unique state identifier. The fuzzer incrementally builds a protocol state machine for guiding fuzzing. We implemented and released StateAFL as open-source software. As a basis for reproducible experimentation, we integrated StateAFL with a large set of network servers for popular protocols, with no manual customization to accommodate the protocol. The experimental results show that the fuzzer can be applied with no manual customization on a large set of network servers for popular protocols, and that it can achieve comparable, or even better, code coverage and bug detection than customized fuzzing. Moreover, our qualitative analysis shows that states inferred from memory better reflect the server behavior than only using response codes from messages.
In this work, we propose a new solution for stateful coverage-driven fuzzing (StateAFL). Similarly to coverage-driven fuzzing, we inject code in the target binary using compile-time instrumentation techniques. The injected code infers protocol state information by: tracking memory allocations and network I/O operations; at each request-reply exchange, taking snapshots of long-lived memory areas; and applying fuzzy hashing (Locality-Sensitive Hashing, LSH) to map each in-memory state to a unique protocol state identifier. This approach does not rely on state information from network messages, and does not require developers to implement custom message parsers for extracting such state information. The aim of this approach is to contribute towards a completely-automated solution for stateful protocol fuzzing, similarly to what AFL was able to achieve for stateless programs, in order to promote a wider application of fuzzing in real-world systems. We note that fuzzing research achieved significant progress from the point of view of fuzzing algorithms, but we are still witnessing critical vulnerabilities (e.g., the well-known case of Heartbleed (Wheeler 2020)) that in hindsight could have been easily prevented with fuzzing. Moreover, empirical research also showed that fuzzing a new system for the first time is likely to find security bugs (Böhme and Falk 2020). For these reasons, it is now a priority to make fuzzing more broadly applicable, as it is still too difficult to set up fuzzing to target new systems. In the case of stateful network fuzzing, StateAFL overcomes the issues of writing custom parsers to extract individual requests from seed inputs, and to extract status codes from response messages from the target server. These issues make fuzzing less accessible for developers who are new to this technique, since they are not inclined to write more code to use a fuzzing tool unfamiliar to them.
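To make the idea of mapping memory snapshots to state identifiers concrete, the following Python sketch uses a SimHash-style fingerprint as a stand-in for the Locality-Sensitive Hashing step. The text above does not specify which LSH algorithm StateAFL uses, so the fingerprint function, the 64-bit size, the 4-byte n-grams, the Hamming threshold, and the `StateMap` class are all illustrative assumptions rather than the actual implementation.

```python
import hashlib


def simhash(data: bytes, bits: int = 64, ngram: int = 4) -> int:
    """SimHash-style fingerprint: similar inputs tend to share most bits.

    Assumption: this is a generic LSH stand-in, not StateAFL's algorithm.
    """
    counts = [0] * bits
    for i in range(max(1, len(data) - ngram + 1)):
        chunk = data[i:i + ngram]
        digest = hashlib.blake2b(chunk, digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    # A bit is set when the majority of n-gram hashes voted for it.
    return sum(1 << b for b in range(bits) if counts[b] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


class StateMap:
    """Assigns small integer state IDs to snapshots of long-lived memory."""

    def __init__(self, threshold: int = 8):
        self.threshold = threshold  # hypothetical similarity cut-off
        self.fingerprints = []      # index in this list == state ID

    def state_id(self, snapshot: bytes) -> int:
        fp = simhash(snapshot)
        for idx, known in enumerate(self.fingerprints):
            if hamming(fp, known) <= self.threshold:
                return idx          # close enough: reuse the known state
        self.fingerprints.append(fp)
        return len(self.fingerprints) - 1
```

In this sketch, a snapshot that is byte-for-byte identical to an earlier one always maps to the same identifier, while snapshots whose fingerprints differ in more than `threshold` bits are registered as new states. The threshold trades off between merging distinct states and splitting one state into several.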
Moreover, StateAFL is applicable even to protocols that do not provide any explicit status code in the messages, such as the TLS protocol in our experiments, or where the status code only represents the status of the last request executed by the server instead of the protocol state, as in FTP and HTTP.
To assess the feasibility of the approach, we implemented and publicly released StateAFL as open-source software. Moreover, to support reproducible experimentation, we integrated StateAFL with a publicly-available benchmark of 13 open-source network servers, the largest experimental setup among stateful network fuzzing studies to the best of our knowledge. Our proposed approach allowed us to integrate StateAFL with no manual customization of the fuzzer to accommodate the protocols under test. The experimental evaluation shows that StateAFL is a robust approach that can be applied to diverse network servers without requiring any protocol customization. Moreover, StateAFL can achieve comparable, or even better, code coverage and bug detection than previous solutions based on stateless coverage-driven fuzzing and on stateful, protocol-customized fuzzing. We also qualitatively analyze state information both from parsing response codes returned by the target server, and from inference based on long-lived data. We found that using response codes provides a misleading representation of the protocol state, leading to redundant states in the inferred protocol state machine and wasted fuzz inputs.
Coverage-driven fuzzing techniques have been adopted by AFL, libFuzzer, and other derivative tools (Manès et al. 2019) as a more practical and automated solution. This form of fuzzing only relies on lightweight metrics collected from the target system at run-time (e.g., about code blocks and branches covered by the fuzz inputs), and iteratively mutates the fuzz inputs to maximize these metrics. Therefore, the fuzzer can start from an initial set of fuzz inputs (i.e., a seed corpus) to automatically evolve them, without any a-priori knowledge about the protocol.
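The feedback loop described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how AFL or libFuzzer are implemented: `toy_target` is a hypothetical stand-in for an instrumented program that reports which branches an input covered, and `mutate` is a single byte-replacement operator.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns the set of covered branch IDs."""
    covered = {0}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            covered.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add(3)
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value."""
    buf = bytearray(data) or bytearray(b"\x00")
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seeds, iterations: int, rng: random.Random):
    """Keep a mutated input only if it covers a branch not seen before."""
    corpus = list(seeds)
    global_cov = set()
    for seed in corpus:
        global_cov |= toy_target(seed)
    for _ in range(iterations):
        child = mutate(rng.choice(corpus), rng)
        cov = toy_target(child)
        if not cov <= global_cov:   # new coverage: the input is "interesting"
            global_cov |= cov
            corpus.append(child)
    return corpus, global_cov


if __name__ == "__main__":
    corpus, cov = fuzz([b"AAAA"], 20000, random.Random(0))
    print(len(corpus), sorted(cov))
```

The loop needs no knowledge of the input format: inputs that reach the first nested check are retained and become the starting point for reaching the next one.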
Only recently, coverage-driven fuzzing has been investigated for stateful protocols. AFLnet (Pham et al. 2020) extended AFL for fuzzing network protocols, by: structuring fuzz inputs into messages and applying mutation operators at message-level (e.g., by corrupting, dropping or injecting individual messages in a session); by learning a protocol state machine, where states are represented by response codes from the system-under-test; and by using the protocol state machine to prioritize mutations. Snipuzz (Feng et al. 2021) tailored coverage-driven fuzzing to IoT protocols, where the system-under-test could not be instrumented to collect coverage information, because of lack of access to the firmware. Thus, Snipuzz also analyzes response codes, using them as indicators to identify sensitive bytes of the inputs (snippets) that trigger different paths in the target.
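The message-level mutation operators mentioned above (corrupting, dropping, or injecting individual messages) can be illustrated with the short Python sketch below. It is not AFLnet code: the function names are hypothetical, and splitting on a `\r\n` delimiter is an assumption that only suits line-oriented text protocols such as FTP. In practice, delimiting messages is exactly the protocol-specific step that requires a custom parser.

```python
import random


def split_messages(session: bytes, delimiter: bytes = b"\r\n") -> list:
    """Split a recorded session into messages (assumes a line-based protocol)."""
    return [part + delimiter for part in session.split(delimiter) if part]


def corrupt_message(messages, index, rng):
    """Flip bits in one byte of the selected message."""
    out = list(messages)
    buf = bytearray(out[index])
    pos = rng.randrange(len(buf))
    buf[pos] ^= rng.randrange(1, 256)   # non-zero XOR always changes the byte
    out[index] = bytes(buf)
    return out


def drop_message(messages, index):
    """Remove the selected message from the session."""
    return messages[:index] + messages[index + 1:]


def inject_message(messages, index, extra):
    """Insert an extra message before the given position."""
    return messages[:index] + [extra] + messages[index:]


def mutate_session(messages, rng, pool):
    """Apply one randomly chosen message-level operator."""
    op = rng.choice(("corrupt", "drop", "inject"))
    if op == "inject" or not messages:
        position = rng.randrange(len(messages) + 1)
        return inject_message(messages, position, rng.choice(pool))
    index = rng.randrange(len(messages))
    if op == "corrupt":
        return corrupt_message(messages, index, rng)
    return drop_message(messages, index)
```

Operating on whole messages keeps the rest of the session intact, so a mutated session can still drive the server through the earlier protocol steps before the mutated message is delivered.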
This paper proposes a new approach for stateful protocol fuzzing. Our approach infers a protocol state machine on the basis of richer feedback than traditional coverage-driven fuzzing. The approach is not limited to analyzing response codes, since response codes may provide a poor indication of the current state of the server. For example, in an HTTP-based protocol, successful GET and POST requests may both receive the same response code (200), but POST requests may have side-effects on the state of the server, which are not reflected in the response code. Moreover, the protocol may lack response codes, such as in the case of TLS, thus leaving the fuzzer without any guidance about the current protocol state. Finally, even when response codes are available, the fuzzer must be tailored for the target protocol, in order to extract and parse response codes from the response messages. For these reasons, our approach does not rely on response codes, but adopts compile-time instrumentation to get more information from the system-under-test and to infer the current protocol state. Moreover, the proposed approach relieves the user from providing custom message parsers.
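The GET/POST example can be reproduced with a toy model. In the Python sketch below, `ToyServer` is a hypothetical in-memory server, not any real HTTP implementation, and `memory_state` uses an exact SHA-256 digest of the stored data as a simplification of the fuzzy hashing described earlier.

```python
import hashlib


class ToyServer:
    """Hypothetical HTTP-like server whose long-lived state is a key-value store."""

    def __init__(self):
        self.store = {}

    def handle(self, method: str, key: str, value: str = "") -> int:
        if method == "POST":
            self.store[key] = value   # side effect on the server state
            return 200
        if method == "GET":
            return 200                # read-only, same response code
        return 400


def memory_state(server: ToyServer) -> str:
    """State identifier derived from the server's long-lived data."""
    blob = repr(sorted(server.store.items())).encode()
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    server = ToyServer()
    before = memory_state(server)
    print(server.handle("GET", "page"), memory_state(server) == before)
    print(server.handle("POST", "page", "data"), memory_state(server) == before)
```

Both requests return 200, so a fuzzer that identifies states by response code sees a single state, whereas the memory-derived identifier changes only after the POST request.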
We designed StateAFL to drive fuzzing based on protocol states covered during executions. In general terms, a protocol state guides the behavior of a process, by defining which actions the process is allowed to take, which events it expects to happen, and how it will respond to those events (Holzmann and Lieberman 1991). For example, most Internet protocols standardize the protocol states and their transitions in Request for Comments (RFC) documents, by describing them using prose in natural language or, in a few cases, using finite state machines. Covering protocol states is a prerequisite for deeper code coverage of a protocol implementation, as some of its parts are only executed when the protocol reaches specific states. Moreover, exploring the protocol state space can uncover unintended or spurious behaviors of the protocol implementation that deviate from the protocol specification (Poll et al. 2015).
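As a rough illustration of state coverage as feedback, the Python sketch below builds a state machine incrementally from sequences of observed state identifiers and reports how many new transitions each sequence contributes. The `StateMachine` class and its methods are hypothetical names chosen for this example. The text does not describe the data structure the fuzzer actually uses.

```python
from collections import defaultdict


class StateMachine:
    """Protocol state machine grown from observed sequences of state IDs."""

    def __init__(self):
        self.states = set()
        self.transitions = defaultdict(set)   # state -> set of successor states

    def add_sequence(self, sequence) -> int:
        """Record one execution; return the number of transitions not seen before."""
        new_transitions = 0
        self.states.update(sequence)
        for src, dst in zip(sequence, sequence[1:]):
            if dst not in self.transitions[src]:
                self.transitions[src].add(dst)
                new_transitions += 1
        return new_transitions


if __name__ == "__main__":
    machine = StateMachine()
    print(machine.add_sequence([0, 1, 2]))   # two new transitions
    print(machine.add_sequence([0, 1, 3]))   # only 1 -> 3 is new
    print(sorted(machine.states))
```

A fuzzer could treat a non-zero return value the same way it treats new code coverage: as a signal that the input reached protocol behavior not exercised before.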
As an optimization for speeding up fuzzing, StateAFL can be configured to perform heavyweight post-execution analysis of long-lived memory only when strictly needed. Fuzz inputs are normally processed without performing the post-execution analysis, to have a high fuzzing throughput; when a fuzz input covers a new program path (i.e., increases the code coverage), it is processed again in order to run the post-execution analysis. The analysis returns a sequence of states reached by the fuzz input, which is stored by the fuzzer. This information is used by the fuzzer to generate more fuzz inputs starting from each state in the sequence.
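The control flow of this optimization can be sketched as follows. The Python code is an illustrative model only: `toy_coverage` and `toy_state_analysis` are hypothetical stand-ins for the instrumented fast run and for the expensive memory analysis, and the `LazyStateFuzzer` class is not taken from StateAFL.

```python
def toy_coverage(session: bytes) -> set:
    """Cheap run: here, 'coverage' is the set of distinct commands in the session."""
    return {line.split(b" ")[0] for line in session.split(b"\n") if line}


def toy_state_analysis(session: bytes) -> list:
    """Expensive run: the state after each message (0 = initial, 1 = logged in)."""
    states = [0]
    for line in session.split(b"\n"):
        if line:
            states.append(1 if line.startswith(b"LOGIN") else states[-1])
    return states


class LazyStateFuzzer:
    """Runs the costly state analysis only for inputs that add code coverage."""

    def __init__(self, run, analyze):
        self.run = run
        self.analyze = analyze
        self.seen_coverage = set()
        self.state_sequences = {}   # input -> sequence of states it reached
        self.analysis_runs = 0

    def process(self, data: bytes) -> bool:
        coverage = self.run(data)            # fast path, executed for every input
        if coverage <= self.seen_coverage:
            return False                     # nothing new: skip the analysis
        self.seen_coverage |= coverage
        self.analysis_runs += 1
        self.state_sequences[data] = self.analyze(data)   # second, slower run
        return True
```

With this structure, two sessions that exercise the same commands trigger the analysis only once, and the stored state sequences identify the points in a session from which further fuzz inputs can be derived.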
To assess the feasibility of the approach and to support reproducible experimentation, we implemented StateAFL and integrated it with ProFuzzBench, a public benchmark for network fuzzers (Natella and Pham 2021). The benchmark includes 13 open-source network servers (Table 2). These targets are quite diverse with respect to several aspects: they cover 10 network protocols that have been typical targets of previous fuzzing studies; they are implemented both in C and in C++; they include both TCP and UDP, and both binary and text protocols; they adopt a variety of APIs (e.g., send/recv vs. fwrite/fread for networking, pthreads vs. fork for multiprocessing). ProFuzzBench automates the setup and the execution of the target servers using Docker containers, in a reproducible way. Moreover, ProFuzzBench configures the servers according to the best practices for coverage-driven fuzzing. In particular, the targets are patched to disable sources of randomness (e.g., pseudo-random number generators) in order to have reproducible behavior (i.e., if the program is executed again with the same input, then the same execution path is covered), which is an implicit assumption for coverage-driven fuzzing techniques. The experiments adopt the seed inputs from the ProFuzzBench project, where both practitioners and researchers contributed with both benchmark targets and with seeds for these targets. These seeds reflect typical basic usage of the servers according to their experience. The seeds include correct authentication and passwords (otherwise, the fuzzer would waste significant time before getting access to the server), and other frequent commands for the protocol (e.g., for FTP, the seeds get the list of files on the server, create directories and move across them, etc.). Table 2 provides the number of unique commands in the seeds for each target server. 
We remark that the need to provide initial seeds for the target server is a problem for any greybox fuzzing approach, in terms of automation and ability to work out-of-the-box for new software to test. We leave this aspect out of the scope of this work.