When in doubt, you can always refer to the comprehensive Intel 64 and IA-32 Architectures Optimization Reference Manual, a great resource from the company behind the x86 architecture itself.
Overviews/comparisons: Agner Fog's calling convention guide. Also, x86 ABIs (Wikipedia): calling conventions for functions, including x86-64 Windows and System V (Linux).
SystemV x86-64 ABI (official standard). Used by all OSes but Windows. (This github wiki page, kept up to date by H.J. Lu, has links to 32bit, 64bit, and x32. Also links to the official forum for ABI maintainers/contributors.) Also note that clang/gcc sign/zero extend narrow args to 32bit, even though the ABI as written doesn't require it. Clang-generated code depends on it.
SystemV 32bit (i386) ABI (official standard), used by Linux and Unix. (old version).
OS X 32bit x86 calling convention, with links to the others. The 64bit calling convention is System V. Apple's site just links to a FreeBSD pdf for that.
Windows x86-64 __fastcall calling convention
Windows __vectorcall: documents the 32bit and 64bit versions
Windows 32bit __stdcall: used to call Win32 API functions. That page links to the other calling convention docs (e.g. __cdecl).
Why does Windows64 use a different calling convention from all other OSes on x86-64?: some interesting history, esp. for the SysV ABI where the mailing list archives are public and go back before AMD's release of first silicon.
Converting strings to integers is a common task.
Here we'll show how to convert decimal strings to integers.
Pseudocode to do this is:
function string_to_integer(str):
    result = 0
    for (each character in str, left to right):
        result = result * 10
        add ((code of the character) - (code of character 0)) to result
    return result
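As an illustration, the pseudocode above can be sketched in C before translating it to assembly. This is a minimal sketch, not a definitive implementation: it assumes the input contains only the characters '0' to '9', and it does no sign handling or overflow checking. It relies on the C guarantee that the codes for the decimal digits are contiguous.

```c
#include <stddef.h>

/* Convert a string of decimal digits to an unsigned integer.
 * Sketch only: assumes every character is in '0'..'9';
 * no sign handling, no overflow or validity checks. */
unsigned int string_to_integer(const char *str)
{
    unsigned int result = 0;
    for (size_t i = 0; str[i] != '\0'; i++) {
        result = result * 10;                   /* make room for the next digit */
        result += (unsigned int)(str[i] - '0'); /* code of character minus code of '0' */
    }
    return result;
}
```

An empty string yields 0, because the loop body never runs.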
Dealing with hexadecimal strings is a bit more difficult because character codes are typically not contiguous across multiple character types, such as digits (0-9) and letters (a-f and A-F). Character codes are typically contiguous within a single type of character, so here we deal only with digits, and only with environments in which the character codes for digits are contiguous.
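Although this topic sticks to decimal digits, the extra work needed for hexadecimal can be illustrated with a small C sketch. The helper name hex_digit_value is hypothetical, and the code assumes an encoding such as ASCII in which a-f and A-F each form a contiguous run. Each character type gets its own range check because the three ranges are not adjacent to one another.

```c
/* Return the value (0..15) of one hexadecimal digit, or -1 if c is not one.
 * Assumes a-f and A-F are each contiguous (true for ASCII). */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';       /* digits */
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;  /* lowercase letters */
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;  /* uppercase letters */
    return -1;                                      /* not a hexadecimal digit */
}
```

A full hexadecimal conversion would multiply the running result by 16 instead of 10 and add this value for each character.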
In order to access the LAPIC registers, a segment must be able to reach the address range starting at the APIC Base (in IA32_APIC_BASE).
This address is relocatable and can theoretically be set to point somewhere in lower memory, thus making the range addressable in real mode.
The read/write cycles to the LAPIC range are not, however, propagated to the Bus Interface Unit, thereby masking any access to the addresses "behind" it.
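To make the addressing constraint concrete, here is a hedged C sketch of the arithmetic involved. The function names are hypothetical. It assumes the LAPIC register range is 4 KiB long and, as a simplification, treats only addresses below 1 MiB as reachable from real mode (ignoring the small region above 1 MiB that becomes reachable with the A20 line enabled).

```c
#include <stdint.h>
#include <stdbool.h>

#define LAPIC_RANGE_SIZE 0x1000u    /* the LAPIC registers occupy a 4 KiB range */
#define REAL_MODE_LIMIT  0x100000u  /* 1 MiB, simplified real-mode limit */

/* True if the whole LAPIC range starting at apic_base lies below 1 MiB,
 * so that a real-mode segment (base = segment * 16) can cover it. */
bool lapic_reachable_in_real_mode(uint64_t apic_base)
{
    return apic_base + LAPIC_RANGE_SIZE <= REAL_MODE_LIMIT;
}

/* Real-mode segment value whose offset 0 maps to apic_base.
 * Only meaningful when lapic_reachable_in_real_mode(apic_base) is true. */
uint16_t lapic_real_mode_segment(uint64_t apic_base)
{
    return (uint16_t)(apic_base >> 4);
}
```

With the usual default base of 0FEE00000h the first function returns false, which is why a mode with larger segment limits (or a relocated base) is needed.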
It is assumed that the reader is familiar with unreal mode, since it will be used in some examples.
It is also necessary to be proficient with the ORG directive, which is global. Splitting the code into multiple files greatly simplifies the coding, as it makes it possible to give different sections different ORGs.
Finally, we assume the CPU has a Local Advanced Programmable Interrupt Controller (LAPIC).
If ambiguous from the context, APIC always means LAPIC (i.e. not the IOAPIC, or xAPIC in general).
References:
MSR name | Address |
---|---|
IA32_APIC_BASE | 1bh |
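
As a rough illustration of how the value of IA32_APIC_BASE is interpreted, the following C sketch decodes a 64-bit value that has already been read from the MSR (reading it requires the privileged rdmsr instruction, which is not shown). The macro and function names are hypothetical. The bit positions follow the Intel manual's layout: bit 8 is the BSP flag, bit 11 is the APIC global enable, and the base address starts at bit 12. The mask assumes the maximum physical address width of 52 bits; the real width depends on the CPU.

```c
#include <stdint.h>
#include <stdbool.h>

#define IA32_APIC_BASE_MSR   0x1Bu                  /* MSR address, as in the table above */
#define APIC_BASE_BSP        (1ull << 8)            /* set on the bootstrap processor */
#define APIC_BASE_GLOBAL_EN  (1ull << 11)           /* APIC global enable */
#define APIC_BASE_ADDR_MASK  0x000FFFFFFFFFF000ull  /* bits 12..51 (assumed maximum width) */

/* Physical base address of the LAPIC register range. */
uint64_t apic_base_address(uint64_t msr_value)
{
    return msr_value & APIC_BASE_ADDR_MASK;
}

/* True if this processor is the bootstrap processor. */
bool apic_is_bsp(uint64_t msr_value)
{
    return (msr_value & APIC_BASE_BSP) != 0;
}

/* True if the LAPIC is globally enabled. */
bool apic_is_enabled(uint64_t msr_value)
{
    return (msr_value & APIC_BASE_GLOBAL_EN) != 0;
}
```

For example, the value 0FEE00900h decodes to a base of 0FEE00000h with both the BSP and enable bits set.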
1 If paging will be used, virtual addresses also come into play.
mov is used to copy data between registers, between a register and memory, or from an immediate value into a register or memory location.