DeepState introduction
DeepState Introduction
DeepState is a framework which provides developers the possibility to perform symbolic unit testing on some particular C++ functions by writing test harnesses. For writing a test harness, Deepstate follows a similar approch to the one of Google tests 1. The common problem between most of the symbolic unit testing libraries is that you as developer have to know how to use other binary analysis tools. With Deepstate this problem is solved by provinding a common interface the test harness creation process.
Installation
The installation process is quite simple, you just need to follow the istruction provided in the Readme file in the Deepstate repository on GitHub 2. In my case after the installation of all the necessary dependencies, I downloaded the repository snapshot using the standard git clone
approach
git clone https://github.com/trailofbits/deepstate.git
and then started the compilation process with the auxiliary of Make tools
mkdir deepstate/build && cd deepstate/build
cmake ../
make
If the compilation process succeed the next step is the installation of the compiled files in the proper location in the host system:
sudo make install
Test Harness creation
Testing C++ code is not a straightforward task and it need to be accomplished in the correct way, otherwise it will lead to a failure both in the test and also in the source code. A Test Harness is a collection of software that allows to test a particular program by executing under some conditions.
The first step in test harness creation is to include the library and optionally set the namespace, in order to use all the functionalities provided by DeepState :
#include <deepstate/DeepState.hpp>
using namespace deepstate;
A Test Harness definition in DeepState starts with the TEST
macro. This macro takes two arguments as input:
- a unit name
- a test name
The body of the function that starts with TEST
will contain the following parts:
- symbols definition
- pre-conditions
- post-conditions
Let’s dive into this three main parts of a Test Harness creation procedure.
Symbols definition
The first part consists in the definition of the symbols that will be used as input for the function we need to test. The main goal for this task is to have a set of variables that will be filled by the DeepState with some non-deterministic data.
Symbolic execution tools and fuzzers needs to know which variable is symbolic in order to control them. Symbols definitions for basic data types (int
, char
, ..) are defined by concatenating the data type name with the symbolic_
prefix (symbolic_int
, symbolic_char
, ..).
DeepState provides us some useful functions to initialize our symbols with random data:
int DeepState_IntInRange(int low, int high)
: initializes the symbol with an integer in the range low-highfloat DeepState_FloatInRange(float low, float high)
: initializes the symbol with a float number in the range low-highdouble DeepState_DoubleInRange(double low, double high)
: initializes the symbol with a double number in the range low-highchar* DeepState_CStr_C(size_t len, const char* allowed)
: initializes a character array (a string) of lengthlen
with a set of allowed character taken from the character array `allowedchar* DeepState_CStrUpToLen(size_t maxLen, const char* allowed)
: initializes a character array (a string) of length up tolen
with a set of allowed character taken from the character arrayallowed
Pre conditions
Sometimes there is the necessity to constraint a little bit more the symbolic variables defined previously. This means that you have some particular constraint that your symbolic variable need to comply with, before running into the tests. All the tests that do not pass the pre conditions will be considered abandoned
.
Pre conditions are defined using the ASSUME
macro and also by the following more specialized ones:
ASSUME_EQ
: checks for the equality of the 2 parameters (operator ==)ASSUME_NE
: checks if the 2 parameters are not equal (operator !=)ASSUME_LT
: checks if the first parameter is less than the second (operator <)ASSUME_LE
: checks if the first parameter is less or equal to the second (operator <=)ASSUME_GT
: checks if the first parameter is greater than the second (operator >)ASSUME_GE
: checks if the first parameter is greater or equal to the second (operator >=)
Example
Suppose you have to pass to your test only strings with a minimum length of 5 characters and up to 10 characters (both inclusive). This types of strings can not be created by only using DeepState_CStrUpToLen
. We have to constraint the fact that the length of the string should also be greater then 5. This may be accomplished by using a precondition that checks if the created string meets the requirement that its length be larger than or equal to 5. With the preceding list in mind, this can be implemented with the ASSUME_GE
macro in the following way:
TEST(UnitName, TestName) {
char* str = DeepState_CStrUpToLen(50, "abcdefABCDEF");
ASSUME_GE(strlen(str), 5);
...
}
Post conditions
The last part is the check of the execution result of a test. This can be done with post conditions that checks if the results satisfy some constraints. Similar to the pre conditions, the post condition defines a list of macros that help us to perform this checks. This time a post condition is defined using the ASSERT
macro and all it’s specialized versions (ASSERT_EQ
, ASSERT_NE
, ASSERT_FALSE
, …).
Example
Assume that we want to test a function that compress a string as input into a fixed length string : an hash function like MD5. A possible post condition will check if the hashed string is of a certain length (in the case of md5, exactly 32 characters).
TEST(UnitName, TestName) {
char* str = DeepState_CStrUpToLen(50, "abcdefABCDEF");
// Pre conditions
ASSUME_GE(strlen(str), 5);
// execution
char* hashedString = md5(str);
// post conditions
ASSERT_EQ(strlen(hashedString), 32);
}
Test harness compilation
You can compile the test harness using a standard C++ compiler like g++ by specifying the linker to include the deepstate header with the -ldeepstate
parameter
g++ -ldeepstate -o harness harness.cpp
The output of this procedure is a compiled harness file that can be executed.
Fuzz test execution
The compiled test harness can be executed using some analysis tools like manticore
, angr
or valgrind
to discover memory leaks in the code. You can also run it in a standalone mode by executing it directly in the console with the --fuzz
option. This last option enables DeepState to perform brute force fuzzing. Another useful option is --timeout
that allows to specify a timeout(in seconds) after which the fuzzing procedure will be stopped. In addition with the --input_which_test
you can specify the test to execute in the format UnitName_TestName
.
./harness --fuzz --timeout=5 --input_which_test UnitName_TestName
This execution will give us the total number of failed, passed and abandoned tests
Advanced fuzz testing
I’ll go through how to use DeepState in conjunction with more complex tools like Valgrind in the next blog posts.