Intel x86 Assembly Language & Microarchitecture

Topics related to Intel x86 Assembly Language & Microarchitecture:

Getting started with Intel x86 Assembly Language & Microarchitecture

This section provides an overview of what x86 is, and why a developer might want to use it.

Register Fundamentals

Assemblers

Optimization

Paging - Virtual Addressing and Memory

Calling Conventions

Resources

Overviews and comparisons: Agner Fog's nice calling convention guide. Also, x86 ABIs (Wikipedia): calling conventions for functions, including x86-64 Windows and System V (Linux).


Converting decimal strings to integers

Converting strings to integers is a common task.

Here we'll show how to convert decimal strings to integers.

Pseudocode for this is:

function string_to_integer(str):
    result = 0
    for (each character in str, left to right):
        result = result * 10
        add ((code of the character) - (code of character 0)) to result
    return result

Dealing with hexadecimal strings is a bit more difficult, because character codes are typically not contiguous across multiple character types such as digits (0-9) and letters (a-f and A-F). Character codes are typically contiguous within a single character type, so here we deal only with digits, and only with environments in which the character codes for the digits are contiguous.
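The pseudocode above maps almost line for line onto C. The following is a minimal sketch, written in C rather than assembly so the loop stays readable. The function name follows the pseudocode; the NUL-terminated input and the absence of overflow and validity checks are assumptions of this sketch. C guarantees that the codes of '0' through '9' are contiguous, which matches the restriction stated above.

```c
#include <stddef.h>

/* Convert a string of decimal digits to an unsigned integer.
 * Assumptions: str is NUL-terminated and contains only '0'..'9';
 * the value fits in an unsigned int (no overflow check is made). */
unsigned int string_to_integer(const char *str)
{
    unsigned int result = 0;

    for (size_t i = 0; str[i] != '\0'; i++) {
        /* Shift the digits accumulated so far one decimal place left. */
        result = result * 10;
        /* Add the value of this digit: its code minus the code of '0'. */
        result += (unsigned int)(str[i] - '0');
    }
    return result;
}
```

An assembly version would follow the same loop: one multiply by 10 and one subtract-and-add per character.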

Real vs Protected modes

Control Flow

Multiprocessor management

To access the LAPIC registers, a segment must be able to reach the address range starting at the APIC base (held in IA32_APIC_BASE).
This address is relocatable and can theoretically be set to point somewhere in lower memory, thus making the range addressable in real mode.

However, read/write cycles to the LAPIC range are not propagated to the Bus Interface Unit, so the LAPIC masks any access to the addresses "behind" it.
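The relocation described above can be made concrete by decoding a value read from IA32_APIC_BASE. The following is a minimal sketch in C, assuming the 64-bit value has already been obtained with RDMSR (which requires ring 0 and is not shown). The function and macro names are hypothetical. The address mask assumes the reserved bits above the physical address width read as zero, and the "below 1 MiB" test is a simplification of what real mode can reach.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical names for the IA32_APIC_BASE fields used here. */
#define APIC_BASE_BSP        (1ull << 8)   /* set on the bootstrap processor */
#define APIC_BASE_ENABLE     (1ull << 11)  /* APIC global enable */
#define APIC_BASE_ADDR_MASK  0x000FFFFFFFFFF000ull /* bits 12 and up; assumes
                                                      reserved bits are zero */

/* Physical base address of the 4 KiB LAPIC register range. */
uint64_t apic_base_address(uint64_t msr_value)
{
    return msr_value & APIC_BASE_ADDR_MASK;
}

/* True if the APIC global enable bit is set. */
bool apic_globally_enabled(uint64_t msr_value)
{
    return (msr_value & APIC_BASE_ENABLE) != 0;
}

/* True if this value was read on the bootstrap processor. */
bool apic_is_bsp(uint64_t msr_value)
{
    return (msr_value & APIC_BASE_BSP) != 0;
}

/* True if the whole 4 KiB register range lies below 1 MiB, and is
 * therefore reachable with plain real-mode segment:offset addressing. */
bool apic_below_1mib(uint64_t msr_value)
{
    return apic_base_address(msr_value) + 0x1000ull <= 0x100000ull;
}
```

With the usual power-on base of 0xFEE00000 the range is far above 1 MiB, which is why the examples either relocate the base or rely on Unreal mode.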

It is assumed that the reader is familiar with Unreal mode, since it will be used in some examples.

It is also necessary to be proficient with:

  • Handling the difference between logical and physical addresses.¹
  • Real mode segmentation.
  • Memory aliasing, that is, the ability to use different logical addresses for the same physical address.
  • Absolute, relative, far, near calls and jumps.
  • NASM assembler, particularly that the ORG directive is global. Splitting the code into multiple files greatly simplifies the coding, as it makes it possible to give different sections different ORGs.
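As a quick refresher on the segmentation and aliasing items above, the following minimal C sketch computes the physical address that a real-mode segment:offset pair refers to. The function name is hypothetical, and the sketch assumes the A20 line is enabled, so addresses just above 1 MiB do not wrap around to zero.

```c
#include <stdint.h>

/* Physical address of a real-mode logical address: segment * 16 + offset.
 * Assumption: A20 is enabled, so the result is not wrapped at 1 MiB. */
uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, 07C0h:0000h and 0000h:7C00h are two different logical addresses that alias the same physical address, 7C00h.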

Finally, we assume the CPU has a Local Advanced Programmable Interrupt Controller (LAPIC).
Where the context is ambiguous, APIC always means LAPIC (and not IOAPIC, or xAPIC in general).


References:

Bitfields:

  • Spurious Interrupt Vector Register
  • Interrupt Command Register
  • Local APIC ID Register
  • IA32_APIC_BASE

MSR name          Address
IA32_APIC_BASE    1Bh

¹ If paging is used, virtual addresses also come into play.

System Call Mechanisms

Data Manipulation