Purdue University Graduate School
Browse

FUZZING HARD-TO-COVER CODE

Download (2.14 MB)
thesis
posted on 2021-05-06, 13:26 authored by Hui PengHui Peng
Fuzzing is a simple yet effect approach to discover bugs by repeatedly testing the target system using randomly generated inputs. In this thesis, we identify several limitations in state-of-the-art fuzzing techniques: (1) the coverage wall issue , fuzzer-generated inputs cannot bypass complex sanity checks in the target programs and are unable to cover code paths protected by such checks; (2) inability to adapt to interfaces to inject fuzzer-generated inputs, one important example of such interface is the software/hardware interface between drivers and their devices; (3) dependency on code coverage feedback, this dependency makes it hard to apply fuzzing to targets where code coverage collection is challenging (due to proprietary components or special software design).

To address the coverage wall issue, we propose T-Fuzz, a novel approach to overcome the issue from a different angle: by removing sanity checks in the target program. T-Fuzz leverages a coverage-guided fuzzer to generate inputs. Whenever the coverage wall is reached, a light-weight, dynamic tracing based technique detects the input checks that the fuzzer-generated inputs fail. These checks are then removed from the target program. Fuzzing then continues on the transformed program, allowing the code protected by the removed checks to be triggered and potential bugs discovered. Fuzzing transformed programs to find bugs poses two challenges: (1) removal of checks leads to over-approximation and false positives, and (2) even for true bugs, the crashing input on the transformed program may not trigger the bug in the original program. As an auxiliary post-processing step, T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program.

By transforming the program as well as mutating the input, T-Fuzz covers more code and finds more true bugs than any existing technique. We have evaluated T-Fuzz on the DARPA Cyber Grand Challenge dataset, LAVA-M dataset and 4 real-world programs (pngfix, tiffinfo, magick and pdftohtml). For the CGC dataset, T-Fuzz finds bugs in 166 binaries, Driller in 121, and AFL in 105. In addition, we found 4 new bugs in previously-fuzzed programs and libraries.

To address the inability to adapt to inferfaces, we propose USBFuzz. We target the USB interface, fuzzing the software/hardware barrier. USBFuzz uses device emulation
to inject fuzzer-generated input to drivers under test, and applies coverage-guided fuzzing to device drivers if code coverage collection is supported from the kernel. In its core, USBFuzz emulates an special USB device that provides data to the device driver (when it performs IO operations). This allows us to fuzz the input space of drivers from the device’s perspective, an angle that is difficult to achieve with real hardware. USBFuzz discovered 53 bugs in Linux (out of which 37 are new, and 36 are memory bugs of high security impact, potentially allowing arbitrary read or write in the kernel address space), one bug in FreeBSD, four bugs (resulting in Blue Screens of Death) in Windows and three bugs (two causing an unplanned restart, one freezing the system) in MacOS.

To break the dependency on code coverage feedback, we propose WebGLFuzzer. To fuzz the WebGL interface (a set of JavaScript APIs in browsers allowing high performance graphics rendering taking advantage of GPU acceleration on the device), where code coverage collection is challenging, we introduce WebGLFuzzer, which internally uses a log guided fuzzing technique. WebGLFuzzer is not dependent on code coverage feedback, but instead, makes use of the log messages emitted by browsers to guide its input mutation. Compared with coverage guided fuzzing, our log guided fuzzing technique is able to perform more meaningful mutation under the guidance of the log message. To this end, WebGLFuzzer uses static analysis to identify which argument to mutate or which API call to insert to the current program to fix the internal WebGL program state given a log message emitted by the browser. WebGLFuzzer is under evaluation and so far, it has found 6 bugs, one of which is able to freeze the X-Server.

History

Degree Type

  • Doctor of Philosophy

Department

  • Computer Science

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Mathias Payer

Advisor/Supervisor/Committee co-chair

Dave (Jing) Tian

Additional Committee Member 2

Ninghui Li

Additional Committee Member 3

Dongyan Xu

Additional Committee Member 4

Pedro Fonseca