When in doubt, you can always refer to the pretty comprehensive Intel 64 and IA-32 Architectures Optimization Reference Manual, which is a great resource from the company behind the x86 architecture itsself.
The obvious way to zero a register is to MOV in a 0—for example:
B8 00 00 00 00 MOV eax, 0
Notice that this is a 5-byte instruction.
If you are willing to clobber the flags (MOV never affects the flags), you can use the XOR instruction to bitwise-XOR the register with itself:
33 C0 XOR eax, eax
This instruction requires only 2 bytes and executes faster on all processors.
If the Carry (C) flag holds a value that you want to put into a register, the naïve way is to do something like this:
mov al, 1
jc NotZero
mov al, 0
NotZero:
A more direct way, avoiding the jump, is to use "Subtract with Borrow":
sbb al,al ; Move Carry to al
If C is zero, then al will be zero. Otherwise it will be 0xFF (-1). If you need it to be 0x01, add:
and al, 0x01 ; Mask down to 1 or 0
To find out if a register holds a zero, the naïve technique is to do this:
cmp eax, 0
But if you look at the opcode for this, you get this:
83 F8 00 cmp eax, 0
test test eax, eax ; Equal to zero?
Examine the opcode you get:
85 c0 test eax, eax
You can also have a look into the Q&A Question on this technique.
In 32-bit Linux, system calls are usually done by using the sysenter instruction (I say usually because older programs use the now deprecated int 0x80) however, this can take up quite alot of space in a program and so there are ways that one can cut corners in order to shorten and speed things up.
This is usually the layout of a system call on 32-bit Linux:
mov eax, <System call number>
mov ebx, <Argument 1> ;If applicable
mov ecx, <Argument 2> ;If applicable
mov edx, <Argument 3> ;If applicable
push <label to jump to after the syscall>
push ecx
push edx
push ebp
mov ebp, esp
sysenter
That's massive right! But there are a few tricks we can pull to avoid this mess.
The first is to set ebp to the value of esp decreased by the size of 3 32-bit registers, that is, 12 bytes. This is great so long as you are ok with overwriting ebp, edx and ecx with garbage (such as when you will be moving a value into those registers directly after anyway), we can do this using the LEA instruction so that we do not need to affect the value of ESP itself.
mov eax, <System call number>
mov ebx, <Argument 1>
mov ecx, <Argument 2>
mov edx, <Argument 3>
push <label to jump to after the syscall>
lea ebp, [esp-12]
sysenter
However, we're not done, if the system call is sys_exit we can get away with not pushing anything at all to the stack!
mov eax, 1
xor ebx, ebx ;Set the exit status to 0
mov ebp, esp
sysenter
To get the product of a register and a constant and store it in another register, the naïve way is to do this:
imul ecx, 3 ; Set ecx to 5 times its previous value
imul edx, eax, 5 ; Store 5 times the contend of eax in edx
leaMultiplications are expensive operations. It's faster to use a combination of shifts and adds. For the particular case of muliplying the contend of a 32 or 64 bit register that isn't esp or rsp by 3 or 5, you can use the lea instruction. This uses the address calculation circuit to calculate the product quickly.
lea ecx, [2*ecx+ecx] ; Load 2*ecx+ecx = 3*ecx into ecx
lea edx, [4*edx+edx] ; Load 4*edx+edx = 5*edx into edx
Many assemblers will also understand
lea ecx, [3*ecx]
lea edx, [5*edx]
For all possible multiplicands other them ebp or rbp, the resulting instruction lengh is the same as with using imul.
ebp or rbp it takes one byte more them using imul