One day a guy asked me how to print a 2D string array in C, so I coded an example for him. Out of curiosity, I then examined the assembly code. In C, string[0][1] and *(*string + 1) mean the same thing. In practice, though, the compiler emits the assembly in two different ways. With string[0][1] it moves the value directly from the stack. With the pointer form *(*string + 1) it first loads the address into a register and then dereferences that register. At first I saw this only with the MinGW GCC compiler. I compiled with the latest version on Windows at the time of writing, which is 8.2.0-3.
The assembly code on the left is from this program.
    #include <stdio.h>

    int main()
    {
        char *string[][2] = {
            {"Osanda", "Malith"},
            {"ABC", "JKL"},
            {"DEF", "MNO"},
        };
        printf("%s %s\n", string[0][0], string[0][1]);
    }
The assembly code on the right is from this program.
    #include <stdio.h>

    int main()
    {
        char *string[][2] = {
            {"Osanda", "Malith"},
            {"ABC", "JKL"},
            {"DEF", "MNO"},
        };
        printf("%s %s\n", **string, *(*string + 1));
    }
When I compiled for 64-bit, I got the same result as under MinGW for Windows. For the listings that follow, I put both printf lines in one program.
Even though the two forms do the same thing at a high level, the generated code differs at a low level. Maybe that's why there are two syntaxes for the same thing: for optimization?
If you write code for hardware or microcontrollers where every instruction matters, prefer the [][] syntax: in the listings here it takes fewer instructions than the explicit pointer arithmetic. Keep in mind that these listings look like unoptimized builds; with optimization turned on, a compiler will normally reduce both forms to the same code.
I checked this under Linux, and GCC behaves the same way on both Windows and Linux. Tested with GCC 8.3. GCC is indeed a very intelligent compiler; it knows how to optimize. This disassembly is for 32-bit Linux.
    # arr2.c:15: printf("%s %s\n", **string, *(*string + 1));
    lea eax, -36[ebp] # _1,
    add eax, 4 # _1,
    # arr2.c:15: printf("%s %s\n", **string, *(*string + 1));
    mov edx, DWORD PTR [eax] # _2, *_1
    # arr2.c:15: printf("%s %s\n", **string, *(*string + 1));
    lea eax, -36[ebp] # string.0_3,
    # arr2.c:15: printf("%s %s\n", **string, *(*string + 1));
    mov eax, DWORD PTR [eax] # _4, MEM[(char * *)string.0_3]
    sub esp, 4 #,
    push edx # _2
    push eax # _4
    lea eax, .LC6@GOTOFF[ebx] # tmp102,
    push eax # tmp102
    call printf@PLT #
    add esp, 16 #,
    # arr2.c:16: printf("%s %s\n", string[0][0], string[0][1]);
    mov edx, DWORD PTR -32[ebp] # _5, string
    mov eax, DWORD PTR -36[ebp] # _6, string
    sub esp, 4 #,
    push edx # _5
    push eax # _6
    lea eax, .LC6@GOTOFF[ebx] # tmp103,
    push eax # tmp103
    call printf@PLT #
    add esp, 16 #,
    mov eax, 0 # _17,
This disassembly is for 64-bit Linux GCC.
    # arr2.c:15: printf("%s %s\n", **string, *(*string + 1));
    lea rax, -64[rbp] # _1,
    add rax, 8 # _1,
    # arr2.c:15: printf("%s %s\n", **string, *(*string + 1));
    mov rdx, QWORD PTR [rax] # _2, *_1
    # arr2.c:15: printf("%s %s\n", **string, *(*string + 1));
    lea rax, -64[rbp] # string.0_3,
    # arr2.c:15: printf("%s %s\n", **string, *(*string + 1));
    mov rax, QWORD PTR [rax] # _4, MEM[(char * *)string.0_3]
    mov rsi, rax #, _4
    lea rdi, .LC6[rip] #,
    mov eax, 0 #,
    call printf@PLT #
    # arr2.c:16: printf("%s %s\n", string[0][0], string[0][1]);
    mov rdx, QWORD PTR -56[rbp] # _5, string
    mov rax, QWORD PTR -64[rbp] # _6, string
    mov rsi, rax #, _6
    lea rdi, .LC6[rip] #,
    mov eax, 0 #,
    call printf@PLT #
    mov eax, 0 # _17,
When you compile with Visual C, there's no special optimization like GCC's. For both forms it computes the offsets explicitly, dereferences through the registers, and prints the result.
    ; 15 : printf("%s %s\n", **string, *(*string + 1));
    mov eax, 8
    imul ecx, eax, 0
    mov edx, DWORD PTR _string$[ebp+ecx+4]
    push edx
    mov eax, 8
    imul ecx, eax, 0
    lea edx, DWORD PTR _string$[ebp+ecx]
    mov eax, 4
    imul ecx, eax, 0
    mov edx, DWORD PTR [edx+ecx]
    push edx
    push OFFSET $SG4520
    call _printf
    add esp, 12 ; 0000000cH
    ; 16 : printf("%s %s\n", string[0][0], string[0][1]);
    mov eax, 8
    imul ecx, eax, 0
    lea edx, DWORD PTR _string$[ebp+ecx]
    mov eax, 4
    shl eax, 0
    mov ecx, DWORD PTR [edx+eax]
    push ecx
    mov edx, 8
    imul eax, edx, 0
    lea ecx, DWORD PTR _string$[ebp+eax]
    mov edx, 4
    imul eax, edx, 0
    mov ecx, DWORD PTR [ecx+eax]
    push ecx
    push OFFSET $SG4521
    call _printf
    add esp, 12 ; 0000000cH
In Borland C, both forms dereference through the registers before printing, and there's no difference between the two syntaxes.