As a defense, information hiding is more efficient than integrity-based defenses. In particular, randomization is almost ‘free’, as even a sophisticated defense against code reuse attacks such as Code Pointer Integrity adds a modest 2.9% performance overhead.
Unfortunately, recent research demonstrates that attackers can bypass even the most advanced information-hiding defenses. These studies show that, by repeatedly probing the address space (either directly or by means of side channels), it is possible to break the underlying randomization and reveal the sensitive data.
In this paper, we show that we can transition from fast information hiding to strong software integrity if (and only when) attackers start probing to break the randomization.
Derandomization primitives. To break randomization, attackers make use of a number of derandomization primitives. One-shot leaks are rare in modern defenses, because these defenses move all sensitive information (e.g., code pointers) out of the attacker’s reach. State-of-the-art derandomization primitives must therefore probe, repeatedly executing an operation to exhaust the entropy.
Selective hardening. The key idea is that, in software protected by a fast baseline defense (information hiding), we continuously monitor the running program for probing attempts. When we encounter such an attempt, we automatically locate its origin and patch only the offending piece of code at runtime with stronger and more expensive integrity-based defenses.
The first stage of ProbeGuard is a form of anomaly detection. We detect the probing attempts that characterize derandomization primitives. For most varieties of probing attacks, the anomaly detection itself is simple and non-intrusive (e.g., detecting repeated exceptions).
The second stage, probe analysis, uncovers the particular code site the attacker abused for probing. Doing so is complicated in the general case. However, by leveraging a fast control-flow tracing feature (e.g., Intel Processor Trace), ProbeGuard conservatively pinpoints the offending code fragment in a secure way.
Finally, ProbeGuard hotpatches the program by selectively replacing the offending code fragment with a hardened variant. In principle, ProbeGuard is agnostic to the hotpatching technique itself. A simple and elegant way is to create a binary that already contains multiple versions of all code fragments, where each version offers a different level of protection.
We consider a determined remote attacker who aims to mount a code reuse attack over the network on a server application hardened by an ideal state-of-the-art information-hiding-based defense. ProbeGuard’s goal is to address the fundamental weakness of practical (information-hiding-based) code reuse defenses, making them resistant to attacks that bypass the defense by derandomizing hidden memory regions.
We trust the underlying operating system, and we assume a modern processor that provides efficient control flow tracing, such as Intel Processor Trace.
We assume a determined attacker who has access to derandomization primitives to probe the victim’s address space. Further, we assume that the attacker has unlimited probing attempts as the application recovers automatically upon any crash.
The application employs a state-of-the-art information-hiding-based defense. ProbeGuard must ensure that what is hidden remains hidden. We embed in the application anomaly detectors that sense probing attacks, together with a secure code cache consisting of a collection of code fragments hardened by applying LLVM-based integrity-checking instrumentation. A separate reactive defense server decodes execution traces obtained by Intel PT and performs fast probe analysis. ProbeGuard then reactively activates hardened code fragments by hotpatching when under attack.
Arbitrary reads and writes. An attacker may exploit an arbitrary memory read or write vulnerability in the application with the goal of derandomizing the hidden region. Typically, only a very small fraction of the application’s virtual address space is actually mapped. When the attacker uses such a vulnerability to access a random address, the access is highly likely to hit an unmapped virtual memory address, leading to a segmentation fault (or a crash). We detect such probing attacks by simply handling and proxying the signal using a custom signal handler.
Kernel reads and writes. Attackers prefer to probe silently and avoid detection. Hence, to avoid crashes, they could also attempt to derandomize the victim application’s address space by probing memory via the kernel. Certain system calls (e.g., read) accept memory addresses in their argument list and return specific error codes (e.g., EFAULT) if an argument points to an inaccessible or unallocated memory location. By applying arbitrary read/write primitives to such arguments, attackers could mount CROP attacks, which probe without crashing the application (and thus without generating SIGSEGV signals). We can detect such probing attacks by intercepting system calls, either in glibc or directly in the kernel, and inspecting their results.
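The result-inspection step can be sketched as follows. This is a minimal Python model under stated assumptions, not ProbeGuard's actual glibc hook: the class SyscallProbeDetector and its inspect method are hypothetical names, and the return value and errno of each intercepted call are assumed to be handed over by whatever interception layer is in place.

```python
import errno


class SyscallProbeDetector:
    """Hypothetical detector: flags system calls that fail with EFAULT."""

    def __init__(self):
        self.suspicious = []  # (syscall name, pointer argument) pairs

    def inspect(self, name, ptr_arg, ret, err):
        # A failed call whose errno is EFAULT means the kernel rejected a
        # pointer argument, which is what a crash-free probe looks like.
        if ret == -1 and err == errno.EFAULT:
            self.suspicious.append((name, ptr_arg))
            return True
        return False


detector = SyscallProbeDetector()
# Illustrative values only: a failing read on a bad pointer, then a normal read.
hit = detector.inspect("read", 0x7F0000000000, -1, errno.EFAULT)
miss = detector.inspect("read", 0x601000, 128, 0)
```

A positive result would then feed the same reactive path as a crash-based detection.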
Arbitrary jumps. An attacker can use this primitive to scan the address space, look for valid code pointers, and then locate gadgets. An attempt to execute unmapped or non-executable memory results in either a segmentation fault (raising a SIGSEGV signal) or an illegal instruction exception (raising a SIGILL signal). Thus, we extend our custom signal handler to handle both signals.
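To illustrate a single handler that covers both signals, the sketch below registers one Python-level handler for SIGSEGV and SIGILL and simulates the faults with signal.raise_signal (Python 3.8 or later, main thread only). It is a toy model: the names probe_events, probe_handler, and install_probe_handlers are hypothetical, and a genuine hardware fault cannot be handled safely this way in Python; the real handler would live in native code.

```python
import signal

probe_events = []  # signal numbers observed by the handler


def probe_handler(signum, frame):
    # Record the event; a real detector would trigger probe analysis here.
    probe_events.append(signum)


def install_probe_handlers():
    # One handler serves both fault types described in the text.
    for sig in (signal.SIGSEGV, signal.SIGILL):
        signal.signal(sig, probe_handler)


install_probe_handlers()
signal.raise_signal(signal.SIGSEGV)  # simulated segmentation fault
signal.raise_signal(signal.SIGILL)   # simulated illegal instruction
```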
Allocation oracles. Such probes exploit memory allocation functions in the target application by attempting to allocate large memory areas. The success or failure of an allocation leaks information about the size of the holes in the address space, which in turn helps locate the hidden region. We hook into glibc to intercept the system calls used to allocate memory (e.g., brk). We choose a configurable threshold on the allocation size (half of the address space by default), above which our detector triggers reactive hardening.
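The threshold check itself is simple arithmetic. The sketch below assumes a 47-bit user address space (typical for x86-64 Linux, but an assumption here, since the text only says "half of the address space"); the names USER_ADDRESS_BITS, DEFAULT_THRESHOLD, and is_allocation_probe are hypothetical.

```python
USER_ADDRESS_BITS = 47  # assumption: x86-64 Linux user address space
DEFAULT_THRESHOLD = (1 << USER_ADDRESS_BITS) // 2  # half of the address space


def is_allocation_probe(size, threshold=DEFAULT_THRESHOLD):
    """Return True if a requested allocation size exceeds the threshold."""
    return size > threshold


small = is_allocation_probe(4096)            # ordinary allocation
huge = is_allocation_probe((1 << 46) + 1)    # just above the default threshold
```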
Once an anomaly detector flags a potential attack, ProbeGuard must locate the offending code fragment. To locate the probing primitive, we employ hardware-assisted branch tracing to fetch the control flow that preceded the detected anomaly. We build a reverse mapping to fetch source-level information from the trace.
We obtain the recently executed control flow using Intel PT, which offers low-overhead and secure branch tracing. Control bits in the CPU’s model-specific registers (MSRs) allow an operating system kernel to turn this hardware feature on or off. The size of the trace buffer is configurable; typical values range from 2 MB to 4 MB or more. The backward trace analysis can limit itself to the relevant recent control-flow history and avoid decoding the trace in its entirety.
We use the perf record command interface’s snapshot mode and dump the trace when required. Although a decoded trace is available, mapping it back to the source file and determining the offending code fragment is hard.
The probe analyzer must locate the affected spot in the source code. We repurpose a field in LLVM’s debug metadata that normally carries the column number of the source code location and store the respective basic block identifier in it instead. This merely simplifies our prototype implementation: LLVM’s default code generator passes the metadata through DWARF 4.0 symbols onto the resulting application binary, so we do not have to introduce a new metadata stream and write the supporting code.
We choose to mark the entire parent function that includes the probing primitive and use this for hardening.
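The backward walk over the trace can be modeled as follows. This is a simplified sketch, not the prototype's analysis: it assumes the trace has already been decoded into a list of code addresses (oldest first) and resolves them through a table of function address ranges, whereas the text describes basic block identifiers carried in debug metadata. All names, the address ranges, and the idea of skipping handler frames via an ignore set are assumptions made for illustration.

```python
import bisect


def build_symbol_index(functions):
    """functions: non-overlapping (start_address, end_address, name) tuples."""
    ordered = sorted(functions)
    starts = [f[0] for f in ordered]
    return starts, ordered


def function_at(index, address):
    """Return the name of the function containing address, or None."""
    starts, ordered = index
    i = bisect.bisect_right(starts, address) - 1
    if i >= 0:
        start, end, name = ordered[i]
        if start <= address < end:
            return name
    return None


def locate_offending_function(trace, index, ignored=frozenset()):
    """Walk the trace from newest to oldest; return the first non-ignored function."""
    for address in reversed(trace):
        name = function_at(index, address)
        if name is not None and name not in ignored:
            return name
    return None


index = build_symbol_index([
    (0x1000, 0x1100, "parse_request"),
    (0x1100, 0x1200, "copy_field"),
    (0x9000, 0x9100, "probe_signal_handler"),
])
trace = [0x1010, 0x1120, 0x1150, 0x9004]  # oldest first
culprit = locate_offending_function(
    trace, index, ignored={"probe_signal_handler"}
)
```

Because the walk starts at the newest entry, only the tail of the trace needs to be decoded, and the whole parent function is what gets marked for hardening.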
To facilitate hotpatching, we first transform the program using our LLVM compiler passes. The goal is to be able to quickly and efficiently replace each vanilla variant of a function with a different (hardened) variant of the same function at runtime. We clone all functions found in the target application’s LLVM IR and selectively invoke security-hardening instrumentation passes on specific function clones at compile time.
A switchboard (which we insert in the application) allows switching between the function variants at runtime. It contains an entry for each function in the program, controlling which of the variants to use during execution.
To deter attacks against ProbeGuard, we mark the switchboard as read-only during normal execution.
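A rough model of the switchboard is shown below. It is a sketch under stated assumptions rather than the prototype's mechanism: the class Switchboard, the variant labels "vanilla" and "hardened", and the example functions are hypothetical, and a boolean flag stands in for the read-only memory protection described above.

```python
class Switchboard:
    """Hypothetical model: one entry per function, selecting the active variant."""

    def __init__(self, variants):
        # variants: {function name: {"vanilla": callable, "hardened": callable}}
        self._variants = variants
        self._active = {name: "vanilla" for name in variants}
        self._writable = False  # stands in for read-only page protection

    def call(self, name, *args):
        # Every call is dispatched through the entry for that function.
        return self._variants[name][self._active[name]](*args)

    def hotpatch(self, name, variant="hardened"):
        self._writable = True  # unlock only for the duration of the update
        try:
            self._set(name, variant)
        finally:
            self._writable = False

    def _set(self, name, variant):
        if not self._writable:
            raise PermissionError("switchboard is read-only")
        if variant not in self._variants[name]:
            raise KeyError(variant)
        self._active[name] = variant


def copy_vanilla(buf, n):
    return buf[:n]


def copy_hardened(buf, n):
    if n > len(buf):  # extra integrity check in the hardened clone
        raise ValueError("out-of-bounds length")
    return buf[:n]


board = Switchboard({"copy": {"vanilla": copy_vanilla, "hardened": copy_hardened}})
before = board.call("copy", b"abcd", 2)
board.hotpatch("copy")
after = board.call("copy", b"abcd", 2)
```

Benign calls behave identically before and after the switch; only the flagged function pays for the extra checks.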
Arbitrary memory reads and writes. Software Fault Isolation mitigates probing attempts that use arbitrary reads and writes. It simply instruments every load or store operation in the application binary by masking the target memory location with a bit mask (47th bit of the memory pointer).
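The masking operation amounts to clearing one address bit before every access. The sketch below assumes that "47th bit" counts from one, so the cleared bit has index 46, and that the hidden region is placed where this bit is set; both are assumptions, as are the names HIDDEN_BIT, ADDRESS_MASK, and sfi_mask.

```python
HIDDEN_BIT = 46  # assumption: the "47th bit", counting from 1
ADDRESS_MASK = ~(1 << HIDDEN_BIT) & 0xFFFFFFFFFFFFFFFF  # 64-bit mask


def sfi_mask(address):
    """Clear the hidden-region bit so the access cannot reach that region."""
    return address & ADDRESS_MASK


low = sfi_mask(0x1000)            # already outside the hidden region
high = sfi_mask(0x7FFFDEADBEEF)   # bit 46 is set and gets cleared
```

Under these assumptions, an address with the bit already clear is unaffected, while any address with the bit set is redirected to the other half of the address space.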
Kernel reads and writes. We mask all pointer arguments to library calls.
Arbitrary jumps. We implemented a type-based CFI policy for forward-edge protection, and a per-thread shadow stack for backward-edge protection.
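The backward-edge part can be illustrated with a small per-thread shadow stack model. It is a conceptual sketch only: the real instrumentation operates on machine return addresses, and the names on_call, on_return, and ShadowStackViolation are hypothetical.

```python
import threading


class ShadowStackViolation(Exception):
    """Raised when a return address does not match the shadow copy."""


_state = threading.local()  # each thread gets its own shadow stack


def _stack():
    if not hasattr(_state, "stack"):
        _state.stack = []
    return _state.stack


def on_call(return_address):
    # Instrumented function entry: save a protected copy of the return address.
    _stack().append(return_address)


def on_return(return_address):
    # Instrumented function exit: the address about to be used must match.
    stack = _stack()
    if not stack or stack.pop() != return_address:
        raise ShadowStackViolation(hex(return_address))
```

A matching call and return pair passes silently, while a corrupted return address raises the violation instead of transferring control.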
Allocation oracles. We apply a whitelist-based threshold to the size arguments of library functions.
ProbeGuard’s implementation consists of the following:
A static library linked with the application: it houses a signal handler registered at startup, and helps in hotpatching to support switching between function variants at runtime.
glibc modifications to intercept mmap(), and syscalls that result in
LLVM compiler passes to generate and propagate function identifying markers onto the binary via DWARF 4.0 symbols, and function cloning to facilitate hotpatching.
A separate reactive defense server that does probe analysis by fetching Intel PT traces, using libipt to map them onto the binary by reading the markers using