#include<stdio.h>
int main()
{
printf("hello,world\n");
return 0;
}
x86
Using the MSVC compiler
cl 1.cpp /Fa1.asm
The /Fa option makes the compiler generate an assembly listing file and sets the name of that file to 1.asm.
The contents of 1.asm are as follows:
CONST SEGMENT
$SG3830 DB 'hello,world',0AH,00H
CONST ENDS
PUBLIC _main
EXTRN _printf:PROC
; Function compile flags: /Odtp
_TEXT SEGMENT
_main PROC
push ebp
mov ebp,esp
push OFFSET $SG3830
call _printf
add esp,4
xor eax,eax
pop ebp
ret 0
_main ENDP
_TEXT ENDS
After generating 1.asm, the compiler produces 1.obj and then links it into the executable 1.exe.
CONST: the data segment
_TEXT: the code segment
The source code above is equivalent to:
#include <stdio.h>
const char $SG3830[] = "hello,world\n";
int main()
{
printf($SG3830);
return 0;
}
Notice that the compiler appended a zero byte, 00H, to the end of the string constant. This byte is the terminator that marks where the string ends.
The PUSH instruction pushes a pointer to the string onto the stack. printf() then reads that pointer from the stack, that is, the address of the string "hello,world\n".
When printf() returns, control goes back to main(). At that point the string address is still on the stack, so the stack pointer register ESP has to be adjusted to release it.
ADD ESP, 4 adds 4 to the value in the ESP register.
Why 4? Because a 32-bit x86 program describes memory addresses with 32 bits, so the pointer occupies 4 bytes. By the same reasoning, releasing one pointer on an x64 system means adding 8 to the stack pointer (RSP there).
This instruction can therefore be read as a POP into some register. The difference is that ADD ESP, 4 simply discards the value on the stack, whereas POP also stores it in the given register.
After printf() returns, main() returns 0, that is, the result of main() is 0.
The return value is passed in EAX, and the instruction XOR EAX, EAX computes it.
Hello world program generated by GCC
gcc 1.c -o 1
Assembly instructions
main proc near
var_10 = dword ptr -10h
push ebp
mov ebp,esp
and esp,0FFFFFFF0h
sub esp,10h
mov eax,offset aHelloWorld; "hello,world\n"
mov [esp+10h+var_10],eax
call _printf
mov eax,0
leave
retn
main endp
The AND ESP, 0FFFFFFF0h instruction aligns the stack address in ESP to a 16-byte boundary, so that ESP becomes an integral multiple of 16. It is part of the function's initialization. If an address is not aligned, the CPU may need two memory accesses to fetch the data from the stack. Alignment on an 8-byte boundary would already satisfy both 32-bit x86 CPUs and 64-bit x64 CPUs, but the compilation rules of mainstream compilers stipulate that "the addresses a program accesses must be aligned to 16 bytes".
SUB ESP, 10h allocates 0x10 bytes, that is, 16 bytes, on the stack. The program only uses 4 of them, but because the compiler keeps the stack address in ESP aligned to 16 bytes, it allocates stack space in 16-byte units.
The program then writes the string address directly onto the stack. Here var_10 is a local variable that serves to pass the argument to the printf() call that follows.
The final instruction, LEAVE, is equivalent to the two instructions MOV ESP, EBP and POP EBP.
Other GCC features
#include<stdio.h>
int f1()
{
printf("world\n");
}
int f2()
{
printf("hello world\n");
}
int main()
{
f1();
f2();
}
Assembly instructions
f1 proc near
s = dword ptr -1Ch
sub esp,1Ch
mov [esp+1Ch+s],offset s; "world\n"
call _puts
add esp,1Ch
retn
f1 endp
f2 proc near
s = dword ptr -1Ch
sub esp,1Ch
mov [esp+1Ch+s],offset aHello; "hello "
call _puts
add esp,1Ch
retn
f2 endp
aHello db 'hello '
s db 'world',0xa,0
When the string "hello world" is printed, the two words sit at adjacent addresses in memory. The puts() call in f2() has no idea that the string it outputs is split into two parts. The assembly listing shows why: the two strings are not actually separated, because the 'hello ' part has no terminator of its own and runs straight into 'world'.
When f1() calls puts(), the function uses the string "world" up to its terminator; puts() does not know that this string can also be joined with the one before it to form a new string. GCC makes full use of this technique to save memory.
ARM
ARM mode, optimization disabled
armcc.exe --arm --c90 -O0 1.c
main
STMFD SP!, {R4,LR}
ADR R0,aHelloWorld; "hello,world"
BL __2printf
MOV R0,#0
LDMFD SP!, {R4,PC}
aHelloWorld DCB "hello,world",0
STMFD SP!, {R4,LR} corresponds to the x86 PUSH instruction. It writes the values of the R4 register and the LR (Link Register) to the stack. The instruction first decrements SP, allocating new space on the stack, and then stores the values of R4 and LR there.
ADR R0, aHelloWorld works relative to the PC value: it adds the offset of the "hello,world" string to PC and stores the result in R0.
BL __2printf calls the printf() function. BL performs two steps:
1) It writes the address of the next instruction, that is, address 0xC, where MOV R0, #0 is located, to the LR register.
2) It then writes the address of the printf() function to the PC register, so that execution continues in that function.
MOV R0, #0 sets the R0 register to 0.
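LDMFD SP!, {R4,PC} is the counterpart of STMFD. It takes the values off the stack, assigns them to R4 and PC, and adjusts the stack pointer SP. Because the value that was saved from LR ends up in PC, this instruction also returns from main().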