Monday, November 2, 2009

A new fuzz frontier: packet boundaries

Recently, I've been getting pretty behind on executing my various research ideas. The only sane thing to do is blog the idea in case someone else wants to run with it and pwn up a bunch of stuff.

The general concept I'd like to see explored is perhaps best explained with a couple of concrete bugs I have found and fixed recently:

  1. Dimensions error parsing XBM image. Welcome to the XBM image format, a veritable dinosaur of image formats. It's a textual format and looks a bit like this:

        #define test_width 8
        #define test_height 14
        static char test_bits[] = {
        0x13, 0x00, 0x15, 0x00, 0x93, 0xcd, 0x55, 0xa5, 0x93, 0xc5, 0x00, 0x80,
        0x00, 0x60 };

    The WebKit XBM parsing code includes this line, to extract the width and height:

        if (sscanf(&input[m_decodeOffset], "#define %*s %i #define %*s %i%n",
                   &width, &height, &count) != 2)
            return false;

    The XBM parser supports streaming (making render progress before you have the full data available), including streaming in the header. That is, the above code will attempt to extract width and height from a partial XBM, and retry with more data if it fails. So what happens if the first network packet contains an XBM fragment of exactly the first 42 bytes of the above example? This looks like:

        #define test_width 8
        #define test_height 1

    I think you can see where this is going. The sscanf() sees two valid integers and returns 2, so XBM decoding proceeds for an 8x1 image, which is incorrect. The real height, 14, had its ASCII representation fractured across a packet boundary.

  2. Out-of-bounds read skipping over HTML comments. This is best expressed in terms of part of the patch I submitted to fix it:

        --- WebCore/loader/TextResourceDecoder.cpp (revision 44821)
        +++ WebCore/loader/TextResourceDecoder.cpp (working copy)
        @@ -509,11 +509,13 @@ bool TextResourceDecoder::checkForCSSCha
         static inline void skipComment(const char*& ptr, const char* pEnd)
         {
             const char* p = ptr;
        +    if (p == pEnd)
        +        return;
             // Allow <!-->; other browsers do.
             if (*p == '>') {
                 p++;
             } else {
        -        while (p != pEnd) {
        +        while (p + 2 < pEnd) {
                     if (*p == '-') {
                         // This is the real end of comment, "-->".
                         if (p[1] == '-' && p[2] == '>') {

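    As can be seen, some simple bounds checking was missing. In order to trigger, the browser would need to find itself processing an HTML fragment that ends partway through a comment. Going by the checks the patch adds, that means either no bytes at all after the point where skipComment() starts, or a fragment ending in a '-', so that the p[1] and p[2] lookahead runs past pEnd. (Where "ends" means not necessarily the end of the HTML, but the end of the HTML that we have received so far.)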
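
To make the XBM case in item 1 concrete, here is a minimal C++ sketch that feeds a partial header to the same sscanf() format string. XbmHeader, scan_header, parse_header_naive and parse_header_guarded are names invented for this illustration, and the guard is only one assumed mitigation, not necessarily the fix that went into WebKit:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Outcome of trying to read the XBM header from the bytes received so far.
// width and height are only meaningful when ok is true.
struct XbmHeader {
    bool ok = false;
    int width = 0;
    int height = 0;
};

// Same format string as the sscanf() call quoted above. Returns the number
// of integers parsed and stores the number of bytes consumed in *count.
int scan_header(const std::string& received, XbmHeader& h, int* count) {
    return std::sscanf(received.c_str(), "#define %*s %i #define %*s %i%n",
                       &h.width, &h.height, count);
}

// Naive: accept as soon as two integers parse, like the quoted code.
XbmHeader parse_header_naive(const std::string& received) {
    XbmHeader h;
    int count = 0;
    h.ok = scan_header(received, h, &count) == 2;
    return h;
}

// Assumed mitigation (hypothetical): also require at least one received
// byte after the height, which shows the digits were not cut short by the
// end of the data received so far.
XbmHeader parse_header_guarded(const std::string& received) {
    XbmHeader h;
    int count = 0;
    if (scan_header(received, h, &count) != 2)
        return h;
    assert(count > 0);  // %n is stored once both integers have matched
    h.ok = static_cast<std::size_t>(count) < received.size();
    return h;
}
```

Given the 42-byte fragment from item 1, parse_header_naive() reports an 8x1 image, while parse_header_guarded() reports "not ready yet" and only succeeds (with height 14) once the bytes after the height have arrived.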

The general principle here? Software seems to have a lot of failure conditions with partial packets! This is unsurprising when you think about it; software is frequently trying to make progress based on partial information -- whether it's image or HTML parsers trying to show progress to the user, or network servers trying to extract a useful header or state transition from a short packet.

Typical fuzzing may not be able to trigger these conditions. I've certainly fuzzed image codecs using local files as inputs, and that would never exercise partial packet code paths, since the decoder typically sees the whole file at once.

The best way to shake out these bugs is going to be to fuzz the packet boundaries at the same time as the packet data. Let me know if you find anything interesting :)
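
As a rough starting point, boundary fuzzing can be sketched as a small harness that delivers the same bytes in different chunkings and checks that the result never changes. Everything here is hypothetical scaffolding: the Target interface, split_at, random_cuts and find_boundary_bugs are made-up names, and naive_sum_target is a deliberately buggy toy standing in for a real streaming parser:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// A target consumes the chunks in order, as a streaming parser would, and
// returns some summary of what it decoded (hypothetical interface).
using Target = std::function<std::string(const std::vector<std::string>&)>;

// Split data into consecutive chunks at the given ascending offsets.
std::vector<std::string> split_at(const std::string& data,
                                  const std::vector<std::size_t>& cuts) {
    std::vector<std::string> chunks;
    std::size_t start = 0;
    for (std::size_t cut : cuts) {
        assert(start <= cut && cut <= data.size());
        chunks.push_back(data.substr(start, cut - start));
        start = cut;
    }
    chunks.push_back(data.substr(start));
    return chunks;
}

// Pick n random ascending cut offsets in [0, len]. Repeats are allowed and
// produce empty chunks, which are worth testing too.
std::vector<std::size_t> random_cuts(std::size_t len, std::size_t n,
                                     std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, len);
    std::vector<std::size_t> cuts(n);
    for (std::size_t& cut : cuts) cut = pick(rng);
    std::sort(cuts.begin(), cuts.end());
    return cuts;
}

// Try every two-chunk split and report the offsets where the target's
// output differs from what it produces for the unsplit data.
std::vector<std::size_t> find_boundary_bugs(const Target& target,
                                            const std::string& data) {
    const std::string expected = target({data});
    std::vector<std::size_t> bad;
    for (std::size_t cut = 1; cut < data.size(); ++cut) {
        if (target(split_at(data, {cut})) != expected)
            bad.push_back(cut);
    }
    return bad;
}

// Toy target with a boundary bug: sums the integers in each chunk
// separately, so a number split across two chunks is counted as two.
std::string naive_sum_target(const std::vector<std::string>& chunks) {
    long total = 0;
    for (const std::string& chunk : chunks) {
        std::istringstream in(chunk);
        long value = 0;
        while (in >> value) total += value;
    }
    return std::to_string(total);
}
```

Calling find_boundary_bugs(naive_sum_target, "8 14") flags offset 3, the split that lands between the "1" and the "4". For multi-packet runs, random_cuts() can supply the offsets to split_at(), and the data itself can be mutated between runs by whatever fuzzer you already use.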

1 comment:

Jesse Ruderman said...

One neat thing about fuzzing packet boundaries is that you're not limited to the usual negative tests: the software doesn't crash, doesn't assert, doesn't hang. You can also make a pretty strong positive claim: the pixels displayed on the screen should be exactly the same regardless of the packet boundaries.