Qiuchen Yan's Homepage

Loop Summarization

Since Sep, 2015

Path explosion is one of the most challenging problems in symbolic execution, and loops are a frequent cause of it. To mitigate this problem, previous work has introduced various algorithms that generate a summarization of a loop instead of executing it every time. Among those algorithms, we chose SAGE’s trace-based loop summarization algorithm and implemented an execution-based version of it on FuzzBALL. This project is supported by a grant under the DARPA CGC program.

Motivation

Although the loop summarization code was turned off in the actual competition, this project was meant to be part of FuzzBOMB, an automatic vulnerability detection and repair system based on AI and symbolic execution. Adversaries often use loops to trigger certain types of bugs, e.g. by repeatedly writing to an array to trigger a buffer overflow. Consequently, a bug-finding tool cannot detect those vulnerabilities unless it can analyze loops. However, FuzzBOMB relies on symbolic execution to detect vulnerabilities, and loops are challenging for a symbolic execution tool because analyzing them can cause path explosion. To detect bugs hidden in loops, we would like to mitigate the path explosion problem in FuzzBALL, the symbolic execution engine of FuzzBOMB.

What’s New

We mostly follow the approach described in Automatic Partial Loop Summarization in Dynamic Test Generation, and adjust the algorithm to account for the differences between SAGE and FuzzBALL. SAGE is a trace-based symbolic execution engine, while FuzzBALL generates symbolic expressions as it executes the binary. Therefore, the algorithm for building dynamic CFGs and detecting loops still works, but we need to update the CFGs as we execute new code.
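
As a rough illustration of updating a dynamic CFG during execution, the Python sketch below adds an edge each time control moves from one instruction address to another, and reports a loop head when the new edge is a back edge (its target dominates its source). This is not FuzzBALL's code: the class, its methods, and the addresses are hypothetical, and dominators are recomputed naively on every new edge.

```python
from collections import defaultdict

class DynamicCFG:
    """CFG that grows as new code is executed (hypothetical helper)."""

    def __init__(self, entry):
        self.entry = entry
        self.nodes = {entry}
        self.succs = defaultdict(set)

    def dominators(self):
        """Naive iterative dominator computation over the nodes seen so far."""
        preds = defaultdict(set)
        for src, dsts in self.succs.items():
            for dst in dsts:
                preds[dst].add(src)
        dom = {n: set(self.nodes) for n in self.nodes}
        dom[self.entry] = {self.entry}
        changed = True
        while changed:
            changed = False
            for n in self.nodes - {self.entry}:
                incoming = [dom[p] for p in preds[n]]
                new = {n} | (set.intersection(*incoming) if incoming else set())
                if new != dom[n]:
                    dom[n] = new
                    changed = True
        return dom

    def add_edge(self, src, dst):
        """Record an executed transfer; return dst if it closes a loop, else None."""
        if dst in self.succs[src]:
            return None  # edge already known, nothing to update
        self.succs[src].add(dst)
        self.nodes.update((src, dst))
        # A back edge is an edge whose target dominates its source.
        return dst if dst in self.dominators()[src] else None

# Feed the CFG one executed edge at a time, as an online engine would.
cfg = DynamicCFG(entry=0x100)
trace = [0x100, 0x104, 0x108, 0x104, 0x108, 0x10C]
heads = []
for src, dst in zip(trace, trace[1:]):
    head = cfg.add_edge(src, dst)
    if head is not None:
        heads.append(head)
```

Here the only loop head found is 0x104, discovered the first time the edge from 0x108 back to 0x104 is executed.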

In addition, we generate multiple summarizations for branching loops. A branching loop is a loop that contains an if statement or another type of branch. When we enter a loop for the first time, we create the first summarization regardless of whether it is a branching loop. Then we decide whether to repeat the loop using a heuristic from Statically-Directed Dynamic Automated Test Generation, and attempt to find more branches if any exist. At the beginning of each iteration, we check whether an existing summarization applies by evaluating its pre-condition. If no existing summarization applies, we generate another one.
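
The bookkeeping described above can be sketched as follows. This is a simplified model under stated assumptions, not the FuzzBALL implementation: states are concrete dictionaries rather than symbolic expressions, and `Summary`, `LoopSummaries`, and `make_summary` are hypothetical names. The example loop is `while i < n: i += step`, where `step` is 1 or 2 depending on a flag.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, int]

@dataclass
class Summary:
    """Valid when `pre` holds; `post` then gives the effect of the whole loop."""
    name: str
    pre: Callable[[State], bool]
    post: Callable[[State], State]

class LoopSummaries:
    """All summarizations collected so far for one loop head."""

    def __init__(self) -> None:
        self.summaries: List[Summary] = []

    def lookup(self, state: State) -> Optional[Summary]:
        """Return the first summary whose pre-condition holds for this state."""
        for s in self.summaries:
            if s.pre(state):
                return s
        return None

    def enter_iteration(self, state: State,
                        make_summary: Callable[[State], Summary]) -> Summary:
        """Reuse a matching summary, or generate and store a new one."""
        found = self.lookup(state)
        if found is None:
            found = make_summary(state)
            self.summaries.append(found)
        return found

def make_summary(state: State) -> Summary:
    # Summarize the branch taken by this state: i advances by `step` until i >= n.
    flag = state["flag"]
    step = 1 if flag else 2

    def post(s: State) -> State:
        iters = max(0, -(-(s["n"] - s["i"]) // step))  # ceil((n - i) / step)
        return {**s, "i": s["i"] + iters * step}

    return Summary(f"flag={flag}", lambda s: s["flag"] == flag, post)

loop = LoopSummaries()
a = loop.enter_iteration({"i": 0, "n": 5, "flag": 1}, make_summary)  # first summary
b = loop.enter_iteration({"i": 0, "n": 5, "flag": 0}, make_summary)  # other branch
c = loop.enter_iteration({"i": 3, "n": 9, "flag": 1}, make_summary)  # reuses `a`
```

The third call finds that the pre-condition of the first summary holds, so no new summary is generated.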

We have performed a preliminary evaluation of the loop summarization algorithm using CGC competition binaries (CBs). To begin with, we run each CB with its POVs (Proofs of Vulnerability), so that execution takes a path that is guaranteed to trigger a vulnerability. Under this condition, if our tool raises an alarm for the vulnerability, we can conclude that it can detect this vulnerability as long as we run it for long enough. There is another challenge in this evaluation: our tool can raise a variety of alarms, but it is not clear whether the indicated vulnerability is the one triggered by the POV, or whether loop summarization is helpful for this CB. Considering the number of CBs, we try to study some of those results automatically. For example, if an unsafe memory access occurs while executing a loop only when loop summarization is turned on, then loop summarization is helpful for the analysis of this CB.
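
The automatic check in the last example could look roughly like the sketch below. The `Alarm` record, its fields, and the alarm kind labels are assumptions made for illustration; they do not reflect the actual output format of our tool.

```python
from typing import NamedTuple, Set

class Alarm(NamedTuple):
    kind: str       # e.g. "unsafe_memory_access" (hypothetical label)
    address: int    # instruction address that raised the alarm
    in_loop: bool   # whether the alarm was raised while executing a loop

def loopsum_helpful(with_loopsum: Set[Alarm], without_loopsum: Set[Alarm]) -> bool:
    """True if an unsafe in-loop memory access shows up only with loop summarization on."""
    only_with = with_loopsum - without_loopsum
    return any(a.kind == "unsafe_memory_access" and a.in_loop for a in only_with)

# Alarms from two runs of the same CB with the same POV (made-up values).
on = {Alarm("unsafe_memory_access", 0x8048123, True),
      Alarm("null_dereference", 0x8048200, False)}
off = {Alarm("null_dereference", 0x8048200, False)}
helpful = loopsum_helpful(on, off)
```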

Ongoing work

Since the DARPA CGC has ended, we would like to do more evaluation with the multi-OS CBs, ported versions of the original CBs that can run on Linux. Currently we are porting the loop summarization code to the latest version of FuzzBALL and rerunning the experiment with POVs. We are planning further evaluation and a more detailed analysis of the current experimental results.

In the long run, we would also like to combine Veritesting with loop summarization once we have a reliable implementation of Veritesting on FuzzBALL.

Past Projects

Fast PokeEMU

Since Sep, 2016

PokeEMU is an automatic emulator testing tool with high coverage, but it is less practical than it could be because one full test takes hundreds of CPU hours. To improve PokeEMU, we explore techniques for combining many tests into one program to amortize overheads such as booting an emulator (aggregating), and for reusing each test repeatedly with random inputs (looping). To ensure the results of each test are reflected in a final result, we use the outputs of one instruction test as an input to the next, and adopt the Feistel network construction from cryptography so that each step is invertible. A paper on this work has been accepted by VEE’18.

Motivation

CPU emulators are widely used in various fields. Developing an emulator is challenging because processors are complicated, and the emulator must follow the entire architecture specification for software compatibility. Since developers may change the emulator quite often (231 commits to the X86 translator per year on average), effective automatic testing is desirable. High-coverage testing requires running a large number of test cases. Therefore, it is important to execute each test efficiently if we want to finish the full test in a reasonable amount of time.

PokeEMU is an emulator testing framework that detects bugs by comparing the behavior of the tested emulator with KVM, which runs most instructions on the host hardware. It currently supports only Bochs and QEMU with a 32-bit X86 target, but the same approach can be applied to other emulators with additional engineering effort. It generates tests by exploring Bochs with symbolic execution, then runs those tests on both QEMU and KVM, dumps the final machine state of each into a memory dump, and compares the dumps. The full test involves 76510 test cases and takes approximately 150 CPU hours. Most of that time is spent booting QEMU and making the memory dump (529 of 583 ms per test according to our measurement). To improve the performance of PokeEMU, we would like to minimize this overhead.

Approach

Aggregation

The general idea is that, instead of starting QEMU, running only one test, and dumping the machine state, we run a large number of tests each time QEMU starts. With this change, we start QEMU only 1078 times, far fewer than the previous 76510 times. The simplest implementation of this idea would be to run a group of test cases one after another and compare the final machine state. One problem with this simple approach is that the outputs of one test case can be overwritten by the following test cases. As a workaround, we copy the output to unused memory before the next test case runs. In practice, we group tests by the instruction they test, and aggregate each group.
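
A minimal sketch of the grouping and of the copy-to-unused-memory workaround, with the emulator replaced by a plain Python callable. The function names, the fixed `output_size`, and the sample test list are assumptions for illustration only.

```python
from collections import defaultdict

def aggregate(tests):
    """Group (instruction, test_id) pairs so each instruction yields one aggregated run."""
    groups = defaultdict(list)
    for instruction, test_id in tests:
        groups[instruction].append(test_id)
    return dict(groups)

def run_group(test_ids, run_test, output_size):
    """Run tests back to back, copying each output into its own slot of a results area."""
    results = bytearray(output_size * len(test_ids))
    for slot, test_id in enumerate(test_ids):
        out = run_test(test_id)  # would be overwritten by the next test if left in place
        assert len(out) == output_size
        results[slot * output_size:(slot + 1) * output_size] = out
    return bytes(results)

tests = [("BSF_GdEdR", 0), ("BSF_GdEdR", 1), ("MOV_CdRd", 2)]
groups = aggregate(tests)
# Stand-in for executing a test on the emulator: returns two output bytes.
dump = run_group(groups["BSF_GdEdR"], lambda t: bytes([t, t + 10]), output_size=2)
```

With the figures quoted above, aggregation reduces the number of emulator starts from 76510 to 1078, roughly 71 times fewer.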

Looping

Another way to increase the efficiency of PokeEMU is to reuse test case code. If we re-run each test multiple times with different inputs, the total time for a full test changes very little, while the coverage of PokeEMU may increase further. When generating test cases using symbolic execution, we only make an essential subset of the whole machine state symbolic (otherwise the symbolic execution phase would take forever). Therefore we can probably increase the coverage further by running each test case multiple times with different random inputs, i.e. a form of fuzz testing.
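
To illustrate why looping can reveal bugs that a single input misses, the toy example below compares a reference implementation of a 32-bit logical right shift with a deliberately broken one. Both functions and the bug are invented for this sketch; they are not taken from QEMU or from our test suite.

```python
import random

MASK32 = 0xFFFFFFFF

def reference_shr(x):
    """Reference behavior: logical right shift by one on a 32-bit value."""
    return (x & MASK32) >> 1

def buggy_shr(x):
    """Hypothetical emulator bug: shifts arithmetically, so the top bit is kept."""
    x &= MASK32
    return (x >> 1) | (x & 0x80000000)

def loop_test(iterations, seed):
    """Re-run the same test with fresh random inputs and collect mismatching inputs."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    mismatches = []
    for _ in range(iterations):
        x = rng.getrandbits(32)
        if reference_shr(x) != buggy_shr(x):
            mismatches.append(x)
    return mismatches

single_input_agrees = reference_shr(1) == buggy_shr(1)  # one fixed input hides the bug
found = loop_test(iterations=1000, seed=0)
```

The single fixed input agrees on both implementations, while the looped run reports every input whose top bit is set.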

Reusing memory space with the Feistel Construction

In PokeEMU, QEMU runs with only 4 MB of memory to save time, while tests can still explore the whole 4 GB address space of a 32-bit CPU. This design became a problem when we combined aggregation and looping. A test case is usually 100+ bytes without Feistel and 200+ bytes with Feistel. In addition, each time we rerun a test, it occupies another 12 - 30 bytes of memory. The latter is the real problem, since memory use keeps growing as we increase the number of times we repeat each test.
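
A back-of-the-envelope estimate shows the scale of the problem. It uses the figures quoted on this page and assumes, for simplicity, that tests are spread evenly across aggregated runs and that each test is repeated 10000 times (the loop count in the performance table below); neither assumption is exact.

```python
MEMORY_BYTES = 4 * 1024 * 1024   # 4 MB given to QEMU
TESTS, RUNS = 76510, 1078        # test cases and aggregated runs
ITERATIONS = 10000               # repetitions per test (assumed, see lead-in)

tests_per_run = TESTS / RUNS     # assumes an even spread across runs
low = tests_per_run * ITERATIONS * 12    # bytes of stored outputs, lower bound
high = tests_per_run * ITERATIONS * 30   # bytes of stored outputs, upper bound

print(f"{tests_per_run:.0f} tests per run, "
      f"{low / 1e6:.1f} to {high / 1e6:.1f} MB of outputs vs "
      f"{MEMORY_BYTES / 1e6:.1f} MB of memory")
```

Even the lower bound is about twice the available memory, before counting the test code itself.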

To save the space used for storing outputs, we would like to compress those outputs without losing data. This is similar to the requirement on block ciphers. Therefore, we integrate the Feistel construction with the execution of tests.

Since the Feistel construction is a bijection regardless of whether its round functions are invertible, it guarantees that two ciphertexts computed from the same input will differ if exactly one round function differs and everything else is the same. The probability that two ciphertexts are equal increases as more round functions differ, but it is still low when there are only a small number of behavior differences.
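
The following sketch shows a generic Feistel network over two 32-bit halves. It is a textbook construction, not the Fast PokeEMU code: in our setting the round functions would be instruction tests, whereas here they are arbitrary, non-invertible Python lambdas chosen for illustration.

```python
MASK = 0xFFFFFFFF  # 32-bit halves

def feistel_forward(left, right, round_functions):
    """Apply one Feistel round per function: (L, R) -> (R, L xor F(R))."""
    for f in round_functions:
        left, right = right, left ^ (f(right) & MASK)
    return left, right

def feistel_backward(left, right, round_functions):
    """Invert feistel_forward; it needs F itself, never an inverse of F."""
    for f in reversed(round_functions):
        left, right = right ^ (f(left) & MASK), left
    return left, right

# None of these round functions is invertible on its own.
good = [lambda x: (x * x) & MASK, lambda x: x >> 3, lambda x: 0]
# Same network, except that the second round function behaves differently.
bad = [good[0], lambda x: (x >> 3) ^ 1, good[2]]

plain = (0x12345678, 0x9ABCDEF0)
cipher_good = feistel_forward(*plain, good)
cipher_bad = feistel_forward(*plain, bad)
recovered = feistel_backward(*cipher_good, good)
```

Decryption recovers the input even though no round function is invertible, and the single differing round function yields a different final output. The guarantee applies when the two round functions actually produce different values on the data that reaches that round, as they do here.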

Evaluation

Performance

Mode	Total time (s)	Time per test (ms)
Separate	84871.8	583.528
Simple	334.7	2.313
Feistel	345.0	2.448
Loop (1)	345.17	2.672
Loop (10000)	1635.45	0.002

Effectiveness

In this experiment we compared the testing results of vanilla PokeEMU (column 1), Fast PokeEMU without aggregation (column 2), and Fast PokeEMU (column 3). Most of the time, the results of all three cases should be the same if there is no bug in Fast PokeEMU, but Fast PokeEMU may detect additional bugs.

Separated result	Separated result with extra code	Aggregated result	# of instructions
Match	Match	Match	577
Mismatch	Mismatch	Mismatch	273
Match	Match	Mismatch	8
Match	Mismatch	Match	10
Match	Mismatch	Mismatch	28
Mismatch	Match	Match	25
Mismatch	Match	Mismatch	28
Mismatch	Mismatch	Match	9

Historical bug experiment

The previous experiment doesn’t evaluate the effect of looping. In this experiment, we try to determine whether Fast PokeEMU can find more real-world bugs with looping turned on. Since it requires additional effort to add (Fast) PokeEMU support to QEMU, we currently run this experiment on QEMU versions 1.0 through 2.4. Below is a list of historical bugs that can be detected by vanilla and Fast PokeEMU, and that were fixed before version 2.4.

Fix	Instruction	PokeEMU	Fast PokeEMU
321c535	BSF_GdEdR	*	*
	BSR_GdEdR	*	*
dc1823c	BTR_EdGdM	*	*
	BTR_EdGdR		*
	BTR_EdIbR		*
	BTC_EdGdR		*
	BTC_EdIbR		*
	BT_EdGdR		*
	BT_EdIbR		*
	BTS_EdGdR		*
	BTS_EdIbR		*
5c73b75	MOV_CdRd	*	*
	MOV_DdRd	*	*
	MOV_RdCd	*	*
	MOV_RdDd	*	*

Future work

We are implementing task switching for exception handling, which can significantly increase the number of valid tests that work on Fast PokeEMU. One major source of errors in the effectiveness results is bugs in Fast PokeEMU itself, and we would like to improve our tool by fixing them. In addition, we would like to perform the historical bug experiment on a larger range of QEMU versions.

Type Inference

Since Jun, 2014

Recovering variable types or other structural information from binaries is useful for reverse engineering in security, and facilitates other kinds of analysis on binaries. In this project, we statically infer the signedness of variables using a graph-based algorithm and heuristics about variable types. A technical report for this project is available.

Approaches

Minimum Cut

The core of our approach is to infer signedness using a minimum cut algorithm. Imagine a graph in which each node is a variable, with an edge between node A and node B if variable A may have the same signedness as variable B (e.g. A and B are two operands of the same instruction). If we split such a graph into two parts, one containing all signed variables and the other all unsigned ones, the edges we cut are where signedness casts happen. Since developers prefer source code with a minimum number of casts, we compute a minimum cut between signed and unsigned variables, which corresponds to a minimal set of casts required for a legal typing.
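
A minimal sketch of this idea, assuming unit-weight edges and disjoint sets of variables already known to be signed or unsigned. It uses a standard Edmonds-Karp max-flow computation, which is one common way to obtain a minimum cut and not necessarily the one used in our tool. The function name and the node names "S" and "T" (reserved for the two terminals) are hypothetical.

```python
from collections import defaultdict, deque

def min_cut_signedness(edges, signed, unsigned):
    """Label variables so that the fewest edges join the signed and unsigned groups."""
    INF = float("inf")
    cap = defaultdict(lambda: defaultdict(int))
    for a, b in edges:          # undirected edge: capacity 1 in both directions
        cap[a][b] += 1
        cap[b][a] += 1
    for v in signed:            # known variables are tied to a terminal by an
        cap["S"][v] = INF       # edge that can never be part of the cut
    for v in unsigned:
        cap[v]["T"] = INF

    def bfs():
        """Breadth-first search over edges with remaining capacity."""
        parent = {"S": None}
        queue = deque(["S"])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:                 # Edmonds-Karp: augment along shortest paths
        parent = bfs()
        if "T" not in parent:
            break
        path, v = [], "T"
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= flow
            cap[w][u] += flow

    signed_side = set(parent) - {"S"}  # still reachable from S: inferred signed
    casts = [(a, b) for a, b in edges if (a in signed_side) != (b in signed_side)]
    return signed_side, casts

edges = [("idx", "tmp"), ("tmp", "a"), ("a", "len"), ("tmp", "len")]
signed_side, casts = min_cut_signedness(edges, signed={"idx"}, unsigned={"len"})
```

In this example the cheapest split cuts the single edge between `idx` and `tmp`, so `tmp` and `a` are inferred to be unsigned along with `len`.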

Signedness Instructions

A graph can have multiple minimum cuts if there are no other constraints. To find the most accurate one, we would like to infer the signedness of as many variables as possible before we cut the graph. We perform a first round of signedness inference based on heuristics about signedness instructions/operations.

A signedness instruction (operation) is an instruction that can reveal the signedness of its operands. Below is a list of the signedness instructions we collected.

When performing a conditional jump, signed variables use jg and jl, while unsigned variables use ja and jb.
A signed variable is right shifted using an arithmetic right shift.
Variables use different modulo and divide instructions according to their signedness.
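
The list above can be turned into a simple lookup, sketched below. The mnemonics `sar`, `idiv`, and `div` are the X86 instructions for arithmetic right shift, signed division, and unsigned division; mapping the list onto exactly these mnemonics is our reading for this sketch. The input format, a list of (mnemonic, operands) pairs in which a jump carries the operands of the comparison it depends on, is an assumption made for illustration.

```python
SIGNED_MNEMONICS = {"jg", "jl", "sar", "idiv"}
UNSIGNED_MNEMONICS = {"ja", "jb", "div"}

def seed_signedness(uses):
    """Map each operand to 'signed', 'unsigned', or None when the hints conflict."""
    seeds = {}
    for mnemonic, operands in uses:
        if mnemonic in SIGNED_MNEMONICS:
            hint = "signed"
        elif mnemonic in UNSIGNED_MNEMONICS:
            hint = "unsigned"
        else:
            continue  # this instruction reveals nothing about signedness
        for op in operands:
            # A conflicting hint downgrades the operand to None (unknown).
            seeds[op] = hint if seeds.get(op, hint) == hint else None
    return seeds

uses = [("jl", ["i", "n"]), ("ja", ["len", "cap"]), ("sar", ["i"]),
        ("mov", ["tmp"]), ("div", ["i"])]
seeds = seed_signedness(uses)
```

Here `n` is seeded as signed, `len` and `cap` as unsigned, `tmp` gets no seed, and `i` is left unknown because it receives both signed and unsigned hints.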

Using those signedness instructions, we can identify variables that are obviously signed or unsigned, and only compute minimum cuts between the known signed group and the known unsigned group.

Implementation Details

As mentioned above, we perform static analysis to infer signedness. To begin with, we disassemble binaries and translate X86 assembly instructions to Vine IR. We then build a graph for each function of the binary, and run the minimum cut algorithm on it.

Since our goal is only to infer the types of variables, we simplify data structure inference by using the knowledge in debugging information directly. For this purpose, all the binaries we analyze are compiled with the -g option, and we parse the debugging information using libdwarf. With the debugging information, we can associate each variable described in C with a location described in X86 assembly.

In practice, not only variables but also registers and memory locations are added to the graph as nodes. Furthermore, since the same location can be either signed or unsigned at different times, we also apply static single assignment (SSA) in our analysis, which requires building a CFG for the analyzed binary.
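
The effect of SSA on graph nodes can be illustrated on straight-line code: every write to a location creates a new version, so a register reused for a signed and then an unsigned value becomes two separate nodes. The sketch below handles only straight-line code and uses a hypothetical (destination, sources) instruction format; full SSA also needs phi nodes where CFG paths join, which is why a CFG is required.

```python
from collections import defaultdict

def ssa_rename(instructions):
    """Rename locations in straight-line code so every write creates a new version."""
    version = defaultdict(int)
    renamed = []
    for dest, sources in instructions:
        srcs = [f"{s}_{version[s]}" for s in sources]  # reads use the current version
        version[dest] += 1                             # each write starts a new version
        renamed.append((f"{dest}_{version[dest]}", srcs))
    return renamed

# eax is written twice, so it becomes two distinct graph nodes.
code = [("eax", ["x"]), ("eax", ["eax", "y"]), ("z", ["eax"])]
renamed = ssa_rename(code)
```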

Evaluation

We evaluate this algorithm by erasing signedness information from debugging symbols, and testing how well our tool can recover it. Applying an intra-procedural version of the algorithm to the GNU Coreutils, we observe that many variables are unconstrained as to signedness, but that in almost all cases our tool recovers either the type from the original source, or a type that yields the same program behavior. The latter is possible because different signedness can compile to the same binary, or to different binaries with the same behavior.

Qiuchen Yan
PhD Student
Department of Computer Science and Engineering
4-225A 200 Union St. SE.
University of Minnesota (Twin Cities)
Minneapolis, MN, 55455
yanxx297@umn.edu