x86 Golf

Recently I’ve been doing x86 golfing again, an on-and-off hobby of mine. There’s something appealing about writing the shortest program that the CPU executes directly, without compiler or interpreter. To make my point, consider one line of python relies on hundreds of lines of internal python, implemented in tens of thousands of lines of C, which compiles to millions of assembly instructions to have a working interpreter. Without linking in libc for useful things like printf, you pretty much just have the assembly instructions and kernel syscalls. To go any lower than syscalls would basically be writing a bootloader or something that runs without an OS. What else is fascinating about x86 is, despite being around 50 years old, most people’s desktop computers have an x86 processor, which while being a modern 64-bit system, still has familiar instructions dating back to the Intel 8086, operating on registers that have been extended from 8/16-bit to 32-bit to 64-bit and beyond.

What else is great about x86 is, unlike a more reasonable ISA like MIPS which has a fixed instruction length of 32 bits making decoding instructions very fast, the 8086 was designed to have variable length instructions. So there are one byte instructions that operate on any of the eight general-purpose registers, but many instructions depend on specific registers. For example, MUL multiplies EAX with any register or value in memory, and stores the result over two registers, EDX for the upper half and EAX for the lower half. If you want to shift a value in a register by a variable amount, that amount has to be in the CL register (the lowest 8 bits of the ECX register). Those are reasonable instructions. There are also arcane ones like the famous LOOP, which will decrement ECX as a counter, then short (relative) jump if the counter is not zero. (You will never see a modern compiler produce loop because it is horrendously slow, and the catch-22 is since no one uses it, Intel engineers never bothered to make it faster.) EAX is designated the accumulator, EBX the base register, ECC the counter, EDX the data register, and so on. Many simple operations are one byte shorter if they operate on EAX, some looping instructions have special forms for ECX, and some string instructions rely on ESI and EDI, standing for Source Index and Destination Index. Most instructions you can use any registers, but for efficiency purposes or to make the code as short as possible, you want to pick and juggle your registers carefully. Compilers are really good at optimizing for speed, but most simple functions can be beaten in shortness by a human willing to use obscure instructions and breaking the usual calling conventions.

In future posts I’ll go over more on what I picked up on x86.