This causes the compiler to dynamically align the stack to meet your specifications. However, dynamically adjusting the stack at run time may cause slower execution of your application. Volatile registers are scratch registers presumed by the caller to be destroyed across a call. Nonvolatile registers are required to retain their values across a function call and must be saved by the callee if used.
On function exit and on function entry to C Runtime Library calls and Windows system calls, the direction flag in the CPU flags register is expected to be cleared. For details on stack allocation, alignment, function types and stack frames on x64, see x64 stack usage. Every function that allocates stack space, calls other functions, saves nonvolatile registers, or uses exception handling must have a prolog whose address limits are described in the unwind data associated with the respective function table entry, and epilogs at each exit to a function.
For details on the required prolog and epilog code on x64, see x64 prolog and epilog. One constraint of the x64 compiler is that it has no inline assembler support. Certain functions are performance sensitive while others are not. Performance-sensitive functions should be implemented as intrinsic functions. The intrinsics supported by the compiler are described in Compiler Intrinsics. Executable images (both DLLs and EXEs) are restricted to a maximum size of 2 gigabytes, so relative addressing with a 32-bit displacement can be used to address static image data.
This data includes the import address table, string constants, static global data, and so on. Calling Conventions. A single argument is never spread across multiple registers. The x87 register stack is unused. It may be used by the callee, but consider it volatile across function calls. All floating point operations are done using the 16 XMM registers. Parameter passing is described in detail in Parameter passing. For prototyped functions, all arguments are converted to the expected callee types before passing.
The caller is responsible for allocating space for the callee's parameters. The caller must always allocate sufficient space to store four register parameters, even if the callee doesn't take that many parameters.
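As a rough illustration of the rule above, here is a minimal sketch in C that computes the smallest parameter area a caller has to reserve. The function name `param_area_bytes` and the constants are my own, purely for illustration, and the sketch assumes every argument occupies one 8-byte slot, with arguments beyond the fourth taking further slots after the four register slots.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of any real API. */
static const size_t SLOT_BYTES = 8;      /* assumed: one 8-byte slot per argument */
static const size_t REGISTER_SLOTS = 4;  /* slots always reserved for register parameters */

/* Minimum bytes the caller reserves for the callee's parameter area:
   four slots always, plus one slot per argument beyond the fourth. */
size_t param_area_bytes(size_t arg_count)
{
    size_t slots = arg_count > REGISTER_SLOTS ? arg_count : REGISTER_SLOTS;
    size_t bytes = slots * SLOT_BYTES;
    assert(bytes >= REGISTER_SLOTS * SLOT_BYTES);  /* never below the four-slot minimum */
    return bytes;
}
```

So a callee taking two parameters still costs the caller 32 bytes, and one taking six costs 48. Any extra padding needed to keep the stack aligned is not included here.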
For vararg or unprototyped functions, any floating point values must be duplicated in the corresponding general-purpose register. Any parameters beyond the first four must be stored on the stack after the shadow store before the call. Vararg function details can be found in Varargs. Unprototyped function information is detailed in Unprototyped functions. Most structures are aligned to their natural alignment. The primary exceptions are the stack pointer and malloc or alloca memory, which are 16-byte aligned to aid performance.
Alignment above 16 bytes must be done manually. Since 16 bytes is a common alignment size for XMM operations, this value should work for most code. For more information about structure layout and alignment, see Types and Storage. For information about the stack layout, see x64 stack usage. Leaf functions are functions that don't change any non-volatile registers. A non-leaf function may change non-volatile RSP, for example, by calling a function. Or, it could change RSP by allocating additional stack space for local variables.
To recover non-volatile registers when an exception is handled, non-leaf functions are annotated with static data. The data describes how to properly unwind the function at an arbitrary instruction. A shorter document on x64 in general, which also has some information about the calling convention, is this cheat sheet.
After the parameters for a function are computed, they are classified. The first parameter of a function has the lowest address. If there are more than 6 arguments, the stack is used. I want to note that the convention does not change that much if you are studying security, but knowing it helps when testing buffer overflows on actual machines, without having to compile code for 32-bit systems, which nowadays are a small minority.
One common case, in fact, is that the description of the stack frame used in some books, especially for the simplest cases, is completely different from what you actually get on your computer. This happens, or at least happened to me, because I learned the textbook version of arguments passed on the stack without diving into real-life cases on 64-bit architectures. The 64-bit calling convention does, in general, seem to increase the stack consumption of the program.
However, there are a couple of things that help to reduce the stack consumption. Even so, it still seems odd (at least to me) that Microsoft did not change the default stack size for applications when compiled as 64-bit: by default both 32-bit and 64-bit applications are given a 1 MB stack.
If your existing 32-bit program gets anywhere near this stack limit you may find the 64-bit equivalent needs a bigger stack (obviously this is very dependent on the exact call pattern of your program). This is a possible sequence of instructions for setting up the stack frame when foo is called in a 64-bit application. As you can see, the 64-bit code is simpler than the 32-bit code because most things are done with the mov instruction rather than using push and pop.
Since the stack pointer register rsp does not change once the prolog is completed it can be used as the pointer to the stack frame, which releases the rbp register to be a general-purpose register.
Note too that the 64-bit code only updates the relevant part of the register and memory location. This has the unfortunate effect that, if you are writing tools to analyse a running program or are debugging code for which you do not have the complete source, you cannot as easily tell the actual value of function arguments, as the complete value in the 64-bit register or memory location may include artefacts from earlier.
In the 32-bit case, when an 8-bit char was pushed onto the stack, the high 24 bits of the 32-bit value were set to zero. One other change in the 64-bit convention is that the stack pointer must (outside the function prolog and epilog) always be aligned to a multiple of 16 bytes (not, as you might at first expect, 8 bytes to match the word size).
This helps to optimise use of the various instructions that read multiple words of memory at once, without requiring each function to align the stack dynamically.
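To make the 16-byte rule a little more concrete, here is a small sketch in C of the arithmetic a prolog has to get right. It assumes the stack pointer was aligned just before the call, so the 8-byte return address leaves rsp 8 bytes off on entry, and that the prolog pushes no registers. The name `stack_adjustment` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest X >= needed such that "sub rsp, X"
   leaves rsp 16-byte aligned, assuming rsp is 8 bytes off a 16-byte
   boundary on entry and the prolog pushes nothing. */
size_t stack_adjustment(size_t needed)
{
    /* Round (needed + 8) up to a multiple of 16, then take the 8 back off. */
    size_t rounded = (needed + 8 + 15) & ~(size_t)15;
    size_t adjust = rounded - 8;
    assert(adjust % 16 == 8 && adjust >= needed);
    return adjust;
}
```

Under these assumptions a function that needs only the 32 bytes of register parameter space for its own calls ends up subtracting 40 bytes, which would explain why `sub rsp, 28h` turns up so often in 64-bit disassembly.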
Finally, note that the 64-bit convention means that the called function returns with the stack restored to its value on entry. This means function calls can be made with a variable number of arguments and the caller will ensure the stack is managed correctly. An alternative convention passes up to six 128-bit or 256-bit values using the xmm and ymm vector registers.
The 64-bit convention dictates that the first four arguments are passed in fixed registers. These registers, for integral and pointer values, are rcx, rdx, r8 and r9. For floating point values the arguments are passed in xmm0 to xmm3.
The older x87 FPU registers are not used to pass floating point values in 64-bit mode. If there is a mix of integral and floating point arguments the unused registers are normally simply skipped; for example, passing a long, a double and an int would use rcx, xmm1 and r8. However, when the function prototype uses an ellipsis (i.e. it takes a variable number of arguments), any floating point values are duplicated in the corresponding general-purpose register, as noted earlier for vararg functions. When a member function is called, the this pointer is passed as an implicit argument; it is always the first argument and hence is passed in rcx.
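The "skipping" of registers is easier to see in code. The following is a minimal sketch in C, not anything a compiler actually runs: it picks a register purely from the argument's position and whether it is a floating point value. The function name `describe_call` and the one-letter encoding ('i' for integral or pointer, 'f' for floating point) are inventions for this example, and it only covers the first four arguments of a prototyped function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. Writes the register chosen for each of up to four
   arguments into out, separated by spaces. kinds holds one character per
   argument: 'i' = integral or pointer, 'f' = floating point.
   Returns the number of arguments written. */
size_t describe_call(const char *kinds, char *out, size_t out_size)
{
    static const char *const int_regs[4]   = { "rcx", "rdx", "r8", "r9" };
    static const char *const float_regs[4] = { "xmm0", "xmm1", "xmm2", "xmm3" };
    size_t count = strlen(kinds);
    size_t used = 0;
    size_t i;

    assert(count <= 4 && out_size > 0);
    out[0] = '\0';
    for (i = 0; i < count; i++) {
        /* The register is picked by position, so the other bank's
           register for this position is simply left unused. */
        const char *reg = (kinds[i] == 'f') ? float_regs[i] : int_regs[i];
        int n = snprintf(out + used, out_size - used, "%s%s", i ? " " : "", reg);
        if (n < 0 || (size_t)n >= out_size - used)
            break;  /* output buffer too small; stop early */
        used += (size_t)n;
    }
    return i;
}
```

Calling it with "ifi" (a long, a double and an int) fills the buffer with `rcx xmm1 r8`, matching the example in the text: xmm0 and rdx are the skipped registers.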
The overall register conventions in the x64 world are quite clearly defined. The documentation [Register Usage] describes how each register is used and lists which register values must be retained over a function call and which ones might be destroyed. In the 32-bit world arguments passed by value were simply copied onto the stack, taking up as many complete 32-bit words of stack space as required. The resulting temporary variable and any padding bytes would be contiguous in memory with the other function arguments.
Note that this passing by reference (in the 64-bit convention, an argument whose size is not 1, 2, 4 or 8 bytes is copied to a temporary allocated by the caller, and a pointer to that temporary is passed instead) is transparent to the source code. Additionally, the caller must ensure that any temporary so allocated is on a 16-byte aligned address. The compiler will reserve stack space for local variables (whether named or temporary) unless they can be held in registers. However, it will reorder the variables for various reasons, for example to pack two int values next to each other.
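For the by-value versus by-reference question, a tiny sketch in C may help. It encodes the size rule as I understand it from the Microsoft documentation, which this text does not spell out, so treat the rule itself as an assumption: arguments of 1, 2, 4 or 8 bytes travel by value, anything else goes through a pointer to a caller-allocated temporary. The name `passed_by_value` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper. Nonzero when an argument of this size is passed
   by value in a register or 8-byte stack slot; zero when the caller
   instead passes a pointer to a temporary copy (assumed size rule). */
int passed_by_value(size_t size)
{
    assert(size > 0);
    return size == 1 || size == 2 || size == 4 || size == 8;
}
```

By this rule an 8-byte struct is still passed directly, while a 3-byte or 16-byte one is not, even though the source code looks the same in every case.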