A struct { a: f64, b: f64 } will be returned like this:
XMM0.lo64 = a;
XMM1.lo64 = b;
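To make the register assignment concrete, here is a minimal Python sketch. It assumes the System V AMD64 rules and only covers structs made of one or two 8-byte fields; `return_registers` and the `"f64"` / `"i64"` type tags are made-up names for illustration, not part of any ABI document or tool.

```python
# Simplified sketch of System V AMD64 return-register assignment.
# Assumption: the struct has one or two fields, each exactly 8 bytes,
# typed either "f64" (SSE class) or "i64" (INTEGER class).

def return_registers(field_types):
    """Map each 8-byte field of a small struct to the register it is returned in."""
    if not 1 <= len(field_types) <= 2:
        raise ValueError("sketch only covers structs of one or two 8-byte fields")
    sse_regs = iter(["xmm0", "xmm1"])  # SSE return registers, in order of use
    int_regs = iter(["rax", "rdx"])    # INTEGER return registers, in order of use
    assigned = []
    for ty in field_types:
        if ty == "f64":
            assigned.append(next(sse_regs))
        elif ty == "i64":
            assigned.append(next(int_regs))
        else:
            raise ValueError(f"unsupported field type: {ty}")
    return assigned


if __name__ == "__main__":
    # struct { a: f64, b: f64 }
    print(return_registers(["f64", "f64"]))
```

Larger structs, smaller fields, and packed or mixed eightbytes follow more involved classification rules that this sketch deliberately leaves out.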
What is the "red zone"?
Signals arrive asynchronously in Linux (& Unix in general), which means your function could be mid-execution when a signal arrives, which causes it to jump to a signal handler*.
Doesn't sound great, does it? Well, the signal handler reuses your stack, but it doesn't want to clobber your stack-variables.
A "leaf function" is a function that doesn't call any other function, so it can even skip allocating stack space (which is slight overhead).
A leaf function is free to use the 128 bytes just below the stack pointer (rsp-128 up to rsp-1) without moving the stack pointer. That area is the "red zone".
So if a signal arrives in the middle of a leaf function, the signal handler's frame has to be placed below rsp-128 so it doesn't touch those 128 bytes.
So now your leaf function can be minimal and your signal handler won't clobber its variables.
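The address arithmetic can be sketched in a few lines of Python. This is only an illustration: `red_zone` and `is_safe_for_signal_frame` are hypothetical helper names, and the rsp value is made up.

```python
RED_ZONE_SIZE = 128  # bytes below rsp that a leaf function may use


def red_zone(rsp):
    """Return (lowest_address, end_exclusive) of the red zone for a given rsp."""
    return (rsp - RED_ZONE_SIZE, rsp)


def is_safe_for_signal_frame(addr, rsp):
    """True if addr lies below the red zone, so writing there leaves it intact."""
    lowest, _end = red_zone(rsp)
    return addr < lowest


if __name__ == "__main__":
    rsp = 0x7FFC00001000  # made-up stack pointer value
    lowest, end = red_zone(rsp)
    print(hex(lowest), hex(end))
    print(is_safe_for_signal_frame(rsp - 8, rsp))    # inside the red zone
    print(is_safe_for_signal_frame(rsp - 256, rsp))  # well below it
```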
In the Microsoft x64 calling convention, it is the caller's responsibility to allocate 32 bytes of "shadow space" on the stack right before calling the function (regardless of the actual number of parameters used), and to pop the stack after the call.
...
Stack aligned on 16 bytes. 32 bytes shadow space on stack.
Note that space is always allocated for the register parameters, even if the parameters themselves are never homed to the stack; a callee is guaranteed that space has been allocated for all its parameters.
...
The stack will always be maintained 16-byte aligned, except within the prolog (for example, after the return address is pushed), and except where indicated in Function Types for a certain class of frame functions.
Okay, so we need at least sub rsp, 32 for "shadow space" but since a call FunctionHere will also put the return address on the stack, we'll need to sub/push another 8 bytes to the stack to align it to 16 bytes.
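To double-check that arithmetic, here is a small Python sketch that tracks rsp through the `call` and the prologue. The starting rsp is an arbitrary 16-byte aligned value and `rsp_after` is a made-up helper name.

```python
def rsp_after(rsp, steps):
    """Apply (description, bytes) stack allocations; each one lowers rsp."""
    for _description, size in steps:
        rsp -= size
    return rsp


if __name__ == "__main__":
    rsp_before_call = 0x1000  # arbitrary 16-byte aligned value, for illustration

    only_shadow = [
        ("call pushes the return address", 8),
        ("sub rsp, 32 for shadow space", 32),
    ]
    with_padding = only_shadow + [("extra 8 bytes to realign", 8)]

    # Without the padding rsp is 8 bytes off; with it, rsp is aligned again.
    print(rsp_after(rsp_before_call, only_shadow) % 16)
    print(rsp_after(rsp_before_call, with_padding) % 16)
```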
TL;DR for every 'caller':
allocate 32 bytes for "shadow space"
optional: allocate X bytes (round up to a 16-byte multiple) for function parameters that overflow onto the stack
((num_stack_params * 8) + 15) / 16 * 16 (integer division, so the result is a byte count rounded up to a multiple of 16)
This could mean just pushing an extra/random register, or allocating the stack space at the beginning of the function when allocating "shadow space" and such.
because call FunctionHere will push the return address (8 bytes) to the stack, allocate an extra 8 bytes in the new function so the stack is aligned to 16 bytes.
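Putting the TL;DR together, a rough Python sketch of the size you would pass to `sub rsp, N`. It assumes the function pushes no registers in its prologue (an odd number of pushes would already supply the realigning 8 bytes) and ignores space for local variables; `caller_frame_size` is a made-up name.

```python
SHADOW_SPACE = 32  # always reserved by the caller
REALIGN = 8        # compensates for the return address pushed by `call`


def round_up_16(n):
    """Round n up to the next multiple of 16."""
    return (n + 15) // 16 * 16


def caller_frame_size(num_stack_params):
    """Bytes to subtract from rsp for a function that calls others.

    num_stack_params counts only the parameters that overflow onto the
    stack, i.e. the 5th parameter onward.
    """
    return SHADOW_SPACE + round_up_16(num_stack_params * 8) + REALIGN


if __name__ == "__main__":
    for n in range(4):
        print(n, hex(caller_frame_size(n)))
```

With one stack parameter this gives 56 (0x38), which is the same breakdown as the `sub rsp, 0x38` in the WriteConsoleA listing.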
The options usually seen are:
    function:
        ; push a callee-saved register which this function will clobber
        ; & the push will also realign stack to 16 bytes
        push rbx
        ; allocate "shadow space" since we're going to be calling something (that needs it / non-leaf func / etc)
        sub rsp,32
        ;
        ; do stuff here like clobber the saved register & also calling something
        ;
        ; free the "shadow space"
        add rsp,32
        ; welcome back callee-saved register
        pop rbx
        ; and return
        ret
or
    function:
        ; allocate "shadow space" since we're going to be calling something
        ; & also allocate another 8 bytes so we realign the stack to 16 bytes
        sub rsp,32+8 ; sometimes you see things like `sub rsp, 5*8`
        ;
        ; do stuff here like calling something
        ;
        ; free the "shadow space" and the extra 8 bytes we used to realign the stack
        add rsp,32+8 ; sometimes you see things like `add rsp, 5*8`
        ; and return
        ret
Let's check out the beginning of kernel32!WriteFile:
    WriteFile:
        ; cache some registers in the "shadow space" (and it's interesting that they skip `[rsp+8]`)
        mov qword ptr ss:[rsp+0x10],rbx
        mov qword ptr ss:[rsp+0x18],rsi
        mov qword ptr ss:[rsp+0x20],r9
        ; cache rdi since it will be clobbered -- THIS REALIGNS THE STACK TO 16 bytes
        push rdi
        ; allocate "shadow space" (0x20) for subsequent calls & allocate an extra 0x40 bytes for stack variables
        sub rsp,0x60
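To see where those offsets come from: at function entry `[rsp]` holds the return address, and the four 8-byte shadow-space slots sit right above it, one per register parameter. A small Python sketch of that layout (the helper names are made up for illustration):

```python
HOME_REGISTERS = ["rcx", "rdx", "r8", "r9"]  # first four integer parameters


def slot_offset_at_entry(param_index):
    """Offset from rsp, at function entry, of the slot for a 0-based parameter.

    The +8 skips the return address that `call` just pushed.
    """
    return 8 + 8 * param_index


def entry_layout():
    """Return (offset, description) pairs for the return address and the shadow space."""
    layout = [(0, "return address")]
    for index, reg in enumerate(HOME_REGISTERS):
        layout.append((slot_offset_at_entry(index), f"home slot for {reg}"))
    return layout


if __name__ == "__main__":
    for offset, description in entry_layout():
        print(f"[rsp+{offset:#04x}] {description}")
```

So the slot WriteFile skips at `[rsp+8]` is the one reserved for rcx, `[rsp+0x20]` is r9's own slot, and the 5th parameter sits at `[rsp+0x28]` at entry (which is `[rsp+0x20]` from the caller's side, before the `call`).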
Here's what kernel32!WriteConsoleA looks like:
    WriteConsoleA:
        ; + 32 bytes to allocate "shadow space"
        ; + 16 bytes since we need to put an argument on the stack for
        ;   the inner-WriteConsoleA function & then round up to a 16-byte multiple
        ; + 8 bytes to realign the stack once the return address is on the stack after the `call`
        ; = 56 (0x38)
        sub rsp,0x38
        ; 5th parameter to the inner-WriteConsoleA function
        mov byte ptr ss:[rsp+0x20],0x0
        call kernelbase.7FFB6CDC6D78
        test eax,eax
        js kernelbase.7FFB6CDC6D59
        mov eax,0x1
        jmp kernelbase.7FFB6CDC6D69
        mov ecx,eax
        call qword ptr ds:[<RtlSetLastWin32ErrorAndNtStatusFromNtStatus>]
        nop dword ptr ds:[rax+rax],eax
        xor eax,eax
        ; free "shadow space" plus the 8 bytes used to realign the stack to 16 bytes
        add rsp,0x38
        ret
Assemblers (FASM, at least) often have function invocation macros that handle the "shadow space" automatically, and they can even allocate it once for a whole series of calls.
    invoke glVertex3d,float 0.6,float -0.6,float 0.0
    invoke glVertex2f,float dword 0.1,float dword 0.2

    ; The stack space for parameters are allocated before each call and freed immediately after it.
    ; However it is possible to allocate this space just once for all the calls inside some
    ; given block of code, for this purpose there are frame and endf macros provided. They
    ; should be used to enclose a block, inside which the RSP register is not altered between
    ; the procedure calls and they prevent each call from allocating stack space for parameters,
    ; as it is reserved just once by the frame macro and then freed at the end by the endf macro.

    frame ; allocate stack space just once
      invoke TranslateMessage,msg
      invoke DispatchMessage,msg
    endf
FASM will also automatically prepend sub rsp, 8 if you're using the .code macro.