Monday, December 5, 2016

[1day] [PoC with $rip] Deterministic Linux heap grooming with huge allocations

In a previous blog post, I disclosed CESA-2016-0002, an 0day vulnerability (without exploit) in the vmnc decoder of the gstreamer media subsystem, which is installed by default in Fedora.

Because a Fedora fix was somewhat slow in coming, I decided to attempt to exploit this vulnerability. This would have to be another scriptless vulnerability. My previous scriptless exploit against the FLIC decoder showed that these can be tricky, at least for me.

TL;DR: I failed to get a full exploit going before Fedora issued a fix. At the time of writing, my Fedora 25 install just received gstreamer1-plugins-bad-free-1.10.1-1.fc25, which appears to fix the bug. However, Fedora 24 appears to remain unpatched.
Before stopping I did find another instance of a Linux allocator quirk that I think needs to be properly documented, discussed and fixed.

Recap of exploitation primitives
You can refer to the original post for a fuller description, but essentially, the vulnerability is an integer overflow in canvas allocation, leading to decoder commands operating on out of bounds memory. Because one of the decoder commands is “copy within canvas”, we have a very powerful exploitation primitive -- we can set both the source and the destination of the copy to be out of bounds, so we can start resolving ASLR by copying pointers around.

The main challenge in proceeding with exploitation is heap layout. If you run my original PoC vmnc_width_height_int_oflow.avi, you’ll get a crash with mappings something like this:

555555757000-55555645b000 rw-p 00000000 00:00 0 [heap]
7fffa0000000-7fffa006e000 rw-p 00000000 00:00 0
7fffa006e000-7fffa4000000 ---p 00000000 00:00 0
7fffa4000000-7fffa4022000 rw-p 00000000 00:00 0
7fffa4022000-7fffa8000000 ---p 00000000 00:00 0

The canvas dimensions for the video are 0xffff x 0x8001 x 16bpp, giving an allocation size of 65534 bytes. The crashing dereference address for the bad write off the end of the canvas is 0x7fffa40237fe. The corresponding mapping is the 7fffa4000000-7fffa4022000 thread arena in the listing above. The immediate problem is that there’s not much of interest inside the affected thread arena. The decoder metadata object -- often a very interesting target for corruption -- is in the previous thread arena (the one of size 0x6e000). On 64-bit, we don’t have enough range in our heap corruption primitive to “wrap around” the address space and target that. On 32-bit, this is likely feasible. But we’re going after 64-bit today.

Sure, there’s a bunch of pointers inside the affected thread arena, but going after any of them with a scriptless attack is likely going to be a headache. And it may require heap grooming. Today, we decline to proceed here.

Linux heap behavior to the rescue!
One way to start making exploitation more reliable is to abuse the deterministic behavior of huge allocations in the Linux glibc allocator.

By default, the glibc allocator will fall back to using mmap() to service very large allocations, and will do so for a fairly large number of mappings if necessary. The parameters here are tunable but on 64-bit Linux, typically up to 65536 allocations will be allowed via mmap() and anything >= 128kB will use mmap().

Our integer overflow primitive is a straightforward 16-bit width x 16-bit height overflow, so it’s fairly easy to pick some values that when multiplied together result in an integer overflow but still a large allocation size.
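To make the arithmetic concrete, here is a minimal sketch of the size computation. It assumes the product is formed in 32-bit arithmetic, which is consistent with the allocation sizes quoted in this post; the function names are hypothetical and are not taken from the gstreamer source.

```c
#include <stdbool.h>
#include <stdint.h>

/* glibc's default mmap threshold is 128 kB. */
#define MMAP_THRESHOLD_DEFAULT (128u * 1024u)

/* Hypothetical model of the vulnerable computation: the product wraps
   modulo 2^32, so the high bits of the true size are silently lost. */
uint32_t canvas_alloc_size(uint16_t width, uint16_t height,
                           uint32_t bytes_per_pixel)
{
    return (uint32_t)width * (uint32_t)height * bytes_per_pixel;
}

/* The size the canvas would really need, computed without wrapping. */
uint64_t canvas_true_size(uint16_t width, uint16_t height,
                          uint32_t bytes_per_pixel)
{
    return (uint64_t)width * height * bytes_per_pixel;
}

/* Would a request of this size be serviced by mmap() under the default
   threshold? (Ignores the n_mmaps_max limit and any tuning.) */
bool uses_mmap(uint32_t size)
{
    return size >= MMAP_THRESHOLD_DEFAULT;
}
```

With 2 bytes per pixel, canvas_alloc_size(0xffff, 0x8001, 2) wraps to 65534, which is below the threshold, while canvas_alloc_size(0xffff, 0x8400, 2) wraps to 134150144, which is far above it even though the true size is over 4GB.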

So when glibc calls mmap() to service a large allocation, what happens? The code is glibc/malloc/malloc.c, sysmalloc(), with nb being the number of bytes requested:

#define MMAP(addr, size, prot, flags) \
 __mmap((addr), (size), (prot), (flags)|MAP_ANONYMOUS|MAP_PRIVATE, -1, 0)

  if (av == NULL
      || ((unsigned long) (nb) >= (unsigned long) (mp_.mmap_threshold)
          && (mp_.n_mmaps < mp_.n_mmaps_max)))
    {
      /* ... */
      size = ALIGN_UP (nb + SIZE_SZ, pagesize);
      /* Don't try if size wraps around 0 */
      if ((unsigned long) (size) > (unsigned long) (nb))
        {
          mm = (char *) (MMAP (0, size, PROT_READ | PROT_WRITE, 0));
          /* ... */
        }
    }

Simple enough, and there’s even care to avoid integer overflow :-) Of interest is the first parameter to mmap(), the address parameter, which is passed as NULL. This is telling the kernel: “figure out a suitable address yourself”.
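For experimenting outside of glibc, the same kind of request can be issued directly. This is a minimal sketch for Linux; the helper name map_like_sysmalloc is hypothetical, and only the flags mirror the MMAP macro quoted above.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Ask the kernel for an anonymous, private, read-write mapping and let it
   choose the address (addr == NULL), as the glibc excerpt above does.
   Returns MAP_FAILED on error. */
void *map_like_sysmalloc(size_t size)
{
    return mmap(NULL, size, PROT_READ | PROT_WRITE,
                MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
}
```

Calling this twice with a large size (say 128MB) and printing the two pointers is a quick way to see where the kernel places consecutive unhinted mappings on a given system. The result depends on the kernel and on what is already mapped, so treat it as an observation rather than a guarantee.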

So how does the kernel decide where to put a mapping request? There are a few corner cases and complexities but for the cases we care about, we can look at the kernel x86_64 architecture specific default handling in arch_get_unmapped_area_topdown(). The algorithm is fairly simple: it picks the first address where the requested size fits, starting at the “mmap base” and working downwards in virtual address space. The “mmap base” is some random gap below the main process initial stack.

There are typically a few holes in the top-down address space scan but if we cause a large allocation, we can make sure those holes are too small to fit, and that our allocation only fits below the most recently allocated thread heap arena. Heap arenas are 64MB on 64-bit, and the way they are allocated can often leave huge 64MB address space holes between them. So a 128MB allocation should be nearly guaranteed to be placed just below the most recently allocated thread arena.
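The hole-skipping argument can be sketched with a deliberately simplified model of a top-down first-fit search. This is not the kernel's code: it ignores guard gaps, alignment and every corner case. In the example layout, the mmap base and the two upper mappings are invented for illustration; only the two lowest arenas are taken from the listing earlier in this post.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t start, end; } mapping_t;   /* half-open [start, end) */

/* Simplified top-down first fit. `maps` must be sorted by descending start
   address and must not overlap. Returns the start address chosen for a new
   mapping of `size` bytes, or 0 if nothing fits. */
uint64_t place_topdown(const mapping_t *maps, size_t n,
                       uint64_t mmap_base, uint64_t size)
{
    uint64_t ceiling = mmap_base;
    for (size_t i = 0; i < n; i++) {
        if (maps[i].start >= ceiling)
            continue;                   /* at or above the ceiling: irrelevant */
        if (maps[i].end <= ceiling && ceiling - maps[i].end >= size)
            return ceiling - size;      /* the hole above this mapping fits */
        ceiling = maps[i].start;        /* otherwise keep scanning downwards */
    }
    return ceiling >= size ? ceiling - size : 0;
}

/* Example layout: one 64MB hole sits above the thread arenas. */
uint64_t place_in_example_layout(uint64_t size)
{
    static const mapping_t maps[] = {
        { UINT64_C(0x7fffb0000000), UINT64_C(0x7ffff7fff000) }, /* hypothetical */
        { UINT64_C(0x7fffa8000000), UINT64_C(0x7fffac000000) }, /* hypothetical */
        { UINT64_C(0x7fffa4000000), UINT64_C(0x7fffa8000000) }, /* from the listing */
        { UINT64_C(0x7fffa0000000), UINT64_C(0x7fffa4000000) }, /* from the listing */
    };
    return place_topdown(maps, sizeof maps / sizeof maps[0],
                         UINT64_C(0x7ffff7fff000), size);
}
```

In this model a 1MB request lands in the 64MB hole at 0x7fffaff00000, whereas a request of roughly 128MB (0x7ff0000 bytes) skips the hole and ends exactly where the lowest arena begins, at 0x7fffa0000000.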

Some tests and some possible exploit paths
Let’s now try a 16bpp (64k colors) file with width == 0xffff and height == 0x8400: vmnc_fault_with_large_alloc.avi. We cause an integer overflow and a 134150144 byte (~128MB) allocation and the mappings will look like this:

555555757000-55555645b000 rw-p 00000000 00:00 0 [heap]
7fff98010000-7fffa0000000 rw-p 00000000 00:00 0
7fffa0000000-7fffa006e000 rw-p 00000000 00:00 0
7fffa006e000-7fffa4000000 ---p 00000000 00:00 0
7fffa4000000-7fffa4022000 rw-p 00000000 00:00 0
7fffa4022000-7fffa8000000 ---p 00000000 00:00 0
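As a sanity check, the size of the new 7fff98010000-7fffa0000000 mapping can be reproduced from the sysmalloc() excerpt. The sketch below assumes a 4096-byte page and SIZE_SZ == 8, and for simplicity it treats nb as the raw request size, although glibc pads the request slightly first; the page rounding gives the same answer here either way. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SIZE_SZ 8u   /* sizeof(size_t) on 64-bit, assumed */

/* Round v up to a multiple of align, which must be a power of two. */
uint64_t align_up(uint64_t v, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (v + align - 1) & ~(align - 1);
}

/* Mirrors `size = ALIGN_UP (nb + SIZE_SZ, pagesize)` from the excerpt. */
uint64_t mmapped_chunk_size(uint64_t nb, uint64_t pagesize)
{
    return align_up(nb + SIZE_SZ, pagesize);
}
```

mmapped_chunk_size(134150144, 4096) is 0x7ff0000, which equals 0x7fffa0000000 - 0x7fff98010000, the length of the first new mapping in the listing.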

Very useful! Our 128MB allocation -- the 7fff98010000-7fffa0000000 mapping above -- is packed right up against a thread arena. That arena is also the one that contains the decoder metadata, so one attack is to go after this. Let’s do that. The metadata object is defined like this:

typedef struct
{
  GstVideoDecoder parent;

  gboolean have_format;

  GstVideoCodecState *input_state;

  int framerate_num;
  int framerate_denom;

  struct Cursor cursor;
  struct RFBFormat format;
  guint8 *imagedata;
} GstVMncDec;

struct Cursor
{
  enum CursorType type;
  int visible;
  int x;
  int y;
  int width;
  int height;
  int hot_x;
  int hot_y;
  guint8 *cursordata;
  guint8 *cursormask;
};
There are possibilities here. The most obvious is to copy a valid pointer to a more interesting object on top of the imagedata value, which is the canvas pointer relative to which we can corrupt. The following demos apply to Fedora 25 with the v1.10.0-1 RPM versions of the various gstreamer1 packages.

Demo 1: $rip == 0x414141414141
Demo file: vmnc_rip_414141414141.avi. This crashes as noted when run in totem under gdb. It works because the GstVMncDec decoder object is consistently allocated at offset 0xb840 into the thread arena directly after our massive allocation. Therefore, we can use constant offsets in our PoC file to:
  1. Copy GstVMncDec::parent::srcpad on top of GstVMncDec::imagedata, causing the next canvas write to be relative to a GstPad object. (Note that the GstVMncDec object and the GstPad object are in different thread heap arenas, and the address delta between the arenas is not consistent, so this is a powerful primitive.)
  2. Write 0x414141414141 on top of GstPad::finalize_hook, a function pointer that will be called later.
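The two steps can be illustrated with a toy model that stays entirely inside its own buffers. Everything here is a hypothetical stand-in: the layout, the offsets and the names do not match the real GStreamer structures, and no function pointer is ever called. It only shows why redirecting the canvas pointer turns the next canvas write into a write relative to another object.

```c
#include <stdint.h>
#include <string.h>

/* Toy layout: one flat "arena" holds a canvas followed by two
   pointer-sized fields standing in for srcpad and imagedata. */
enum {
    CANVAS_SIZE   = 64,
    OFF_SRCPAD    = CANVAS_SIZE,
    OFF_IMAGEDATA = CANVAS_SIZE + sizeof(uint8_t *),
    ARENA_SIZE    = CANVAS_SIZE + 2 * sizeof(uint8_t *),
    OFF_HOOK      = 8     /* stand-in for a hook field inside the pad */
};

/* "Copy within canvas" that never checks offsets against CANVAS_SIZE. */
void toy_copy(uint8_t *base, size_t dst_off, size_t src_off, size_t len)
{
    memmove(base + dst_off, base + src_off, len);
}

/* Read the current canvas pointer back out of the toy arena. */
uint8_t *load_imagedata(const uint8_t *arena)
{
    uint8_t *p;
    memcpy(&p, arena + OFF_IMAGEDATA, sizeof p);
    return p;
}

/* Runs both steps and returns the value that ends up in the pad's hook. */
uint64_t toy_demo(void)
{
    static uint8_t arena[ARENA_SIZE];
    static uint8_t pad[16];
    uint8_t *p;

    memset(arena, 0, sizeof arena);
    memset(pad, 0, sizeof pad);
    p = pad;   memcpy(arena + OFF_SRCPAD, &p, sizeof p);
    p = arena; memcpy(arena + OFF_IMAGEDATA, &p, sizeof p);

    /* Step 1: copy the srcpad pointer on top of imagedata. */
    toy_copy(load_imagedata(arena), OFF_IMAGEDATA, OFF_SRCPAD, sizeof p);

    /* Step 2: the next "canvas" write is now relative to the pad. */
    uint64_t marker = UINT64_C(0x414141414141);
    memcpy(load_imagedata(arena) + OFF_HOOK, &marker, sizeof marker);

    uint64_t hook;
    memcpy(&hook, pad + OFF_HOOK, sizeof hook);
    return hook;
}
```

Note that step 1 never needs to know the pad's address: it only moves an existing pointer from one slot to another, which is why the inconsistent delta between arenas does not matter.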

In the world of scriptless exploits, pointing the instruction pointer to a known static constant might look impressive, but it’s worlds away from a successful exploit. Accordingly, to prove we’ve got just a little more control than that:

Demo 2: $rip == 0x7fffa400bdf0
Demo file: vmnc_rip_is_heap.avi. This crashes in a similar way, with $rip as noted in the heading. It demonstrates that our powerful copy primitive can, to an extent, resolve ASLR. In this instance, we’ve copied a heap pointer on top of a function pointer to show the level of control we have. The crash looks like this:

Thread 18 "multiqueue0:src" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffb271d700 (LWP 3336)]
0x00007fffa400bdf0 in ?? ()
(gdb) p $rip
$10 = (void (*)()) 0x7fffa400bdf0
(gdb) x/1s $rip
0x7fffa400bdf0: "src"

We’ve pointed the instruction pointer to a heap chunk that contains a string. The crash is because my processor supports a non-execute bit :-) To proceed with an exploit, we’d need to choose a different path, but we’ve demonstrated a certain level of control beyond blindly nuking a function pointer.

Unfortunately, a reliable exploit may not be possible with this path in general. Although the GstVMncDec object is reliably placed at offset 0xb840 in its arena when the exploit is run under gdb, the arena layout jiggles around a little bit from run to run when run normally. The reason has not been investigated.

Demo 3: reliable crash in malloc_consolidate with $rbx == 0x41414141
Demo file: vmnc_malloc_consolidate_41414141.avi. In order to try and get a more reliable start to my exploit, I decided to target malloc arena metadata. This can be done very reliably because it occurs right at the beginning of an arena’s mapping. It is not subject to heap jiggle!

Thread 18 "task2" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffb271d700 (LWP 3599)]
0x00007fffefc13008 in malloc_consolidate () from /lib64/
(gdb) i r
rbx            0x41414141       1094795585
(gdb) disass $rip-20,$rip+20
Dump of assembler code from 0x7fffefc12ff4 to 0x7fffefc1301c:
=> 0x00007fffefc13008 : mov    0x8(%rbx),%rax

The above effect is achieved by writing a pointer value on top of a malloc() bin pointer. When this bin is touched, a deterministic crash is achieved. The reliability here is strong, but I didn’t get far along with an exploit. The challenge is to find a primitive that will follow pointer chains. In order to “break out” of the arenas to something more interesting, it is necessary to follow the linked list of arena pointers until you find main_arena, inside glibc. The likely path forward to do this would be to iteratively copy the glibc malloc_state->next pointer on top of something else, such as one of the bin pointers, or malloc_state->top, and then abuse side effects from malloc() and free() calls made in the decode loop. Proceeding in this manner will require evading glibc’s various internal corruption checks, but we have the ability to edit memory structures and copy pointers around, so it is well within the bounds of possibility.

Closing notes
This is not the first time that highly deterministic Linux mmap() behavior has been taken advantage of. In fact, just last week, Google Project Zero published a wonderful exploit against Android’s shared memory handling. Amongst other tricks, the deterministic behavior of mmap() placement was abused in order to get a favorable virtual memory layout. What is interesting is that this was on the 64-bit ARM architecture. Whereas an argument could be made that 32-bit address space is so limited that fragmentation is a concern preventing stronger randomization, 64-bit address spaces provide an opportunity to place mappings a little less predictably.

On platforms with decent sized user address spaces (x86_64, 47 bits and aarch64, 39 bits), I think it’s time to randomize unhinted mapping requests. There are concerns to talk through such as fragmentation, page table memory bloat and TLB impact. However, some significant software already implements virtual address mapping randomization in user space. This includes Adobe Flash, as well as Google Chrome’s main allocator, PartitionAlloc. The concept is proven.
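As a rough illustration of what user space randomization can look like, here is a minimal sketch that passes a random, page-aligned hint to mmap(). It is not how Flash or PartitionAlloc actually implement it. It assumes a 64-bit Linux system with at least 47 bits of user address space, the helper names are hypothetical, and the kernel is free to ignore the hint and place the mapping elsewhere.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Read 64 random bits from /dev/urandom; returns 0 on failure, in which
   case the caller ends up passing a NULL hint. */
uint64_t random_u64(void)
{
    uint64_t v = 0;
    FILE *f = fopen("/dev/urandom", "rb");
    if (f) {
        if (fread(&v, sizeof v, 1, f) != 1)
            v = 0;
        fclose(f);
    }
    return v;
}

/* Map `size` bytes at a randomized hint below 2^46. Without MAP_FIXED the
   address is only a suggestion. Returns MAP_FAILED on error. */
void *mmap_random_hint(size_t size)
{
    uint64_t page = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t hint = random_u64() & ((UINT64_C(1) << 46) - 1) & ~(page - 1);

    return mmap((void *)(uintptr_t)hint, size, PROT_READ | PROT_WRITE,
                MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
}
```

Scattering mappings like this has the costs mentioned above (fragmentation, extra page table memory), which is part of why it is a policy choice rather than a free win.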

Accordingly, paging Kees Cook of the kernel hardening project…. :-)

1 comment:

Grazfather said...

If you can provide dimensions that result in a small allocation but allow unbounded writes (is that how you're overwriting the malloc metadata?) you should be able to change the size of the top chunk so that it's gigantic, and so that a subsequent, also controlled allocation soaks up this space and leaves the top in a desired area (e.g. GOT being an easy target if there's no full RELRO). A third allocation will now return an address in GOT (or wherever). If you can write here you can force calls to wherever. What I'm describing here is just House of Force as described in Phrack: