Programming in x86 Assembly on OSX
written on July 22, 2019
categories: engineering
Recently I've decided to learn x86 Assembly.
I know, it's not the most beautiful language to program in; I've already coded an entire terminal-based game in MIPS and it caused me a lot of headaches, but it also taught me many important things about programming and optimizing code. By looking into how your code gets translated into actual machine code, you can better understand how to optimize it.
Besides, I'm studying embedded systems engineering; coding in assembly is a must-have skill, not because it's something we use every day, but because it can help us debug weird errors and make code run faster, especially on microcontrollers.
The Makefile
So here we go. First of all, I'm on a machine running OSX, so I'll need something that
can assemble x86 Assembly code for my machine in particular. By doing some looking
around, I found out that I need to use NASM to assemble the code, and then use ld to
link it. Let's set up a Makefile!
[Listing: Makefile, lines 1-21]
Let's break this down: The first two lines have to do with setting up the name of the
file that contains the source code, as well as the output file's name. I also set up a
variable called ENTRY which is basically the name of the entry point of the program;
in a normal C program that would be main, but here we can call it whatever we want.
The ?= operator means that if this variable is not set by the user when make is
called, it will default to _start.
Lines 4-7 set up our assembler and linker; as I said before, I'm using nasm to
assemble my sources, along with a couple of arguments. We need to tell it that the
output format we want is macho (the 32-bit Mach-O object format used on OSX) and that
we don't want it to optimize the output code (hence the -O0 flag). This last point is
important, as it will allow us to keep more or less the same code before and after
assembly, and will make debugging easier.
As far as ld is concerned, first of all we need to give it our entry point (it will
look for _main by default). I also add in a library called System which is,
apparently, necessary (ld throws an error if I don't include it) and I also give it a
minimum OSX version to silence ld's warnings about that.
The rest of the Makefile is pretty straightforward. All it does is basically assemble and link the source in two steps, as follows:
$ nasm -f macho -O0 hello.asm
$ ld -e _start -lSystem -macosx_version_min 10.8 hello.o -o hello
Let's look at the source code!
The source code
By doing some more looking around (visit the sources at the end of this post to learn
more about where I found all this info), I found out how to perform a basic syscall that
prints a string to the screen. If you don't know what a syscall is, here's the 30-second
version of it: whenever you're trying to do something that involves resources managed
by the operating system (like printing to stdout, aka the terminal in this case) you
need to call the kernel with a certain opcode (the number that identifies the syscall)
while also having the right values in the right registers (registers are basically
little memory banks that sit literally inside the CPU; their size is what makes your
computer 32 or 64-bit).
In the case of the write syscall, whose opcode is 0x4, the following registers need
to contain the following values:
- EDX needs to contain the string's length (in bytes);
- ECX needs to contain a pointer to the string;
- EBX needs to contain the file descriptor you're writing to (0x1 for stdout); and finally
- EAX needs to contain the syscall opcode (0x4 for write).
These registers are special in the sense that they hold values that get used during a syscall.
Before I show you the assembly source code, there's another little thing we need to
talk about (if you're not familiar with assembly code): the .data and .text sections.
Every assembly source is typically split into two sections: .data, which contains the
actual variables in the code (if you do something like int i = 0; in C, it basically
ends up in this section, but only if it's a global variable or a static string),
and .text, which contains the actual program (that manipulates whatever's in the
.data section). Now let's look at the code!
First off, we need to declare two variables, one for the string we want to print and another that will store the string's length:
[Listing: hello.asm, lines 1-4]
As you can see, the first line of this snippet declares the start of the .data
section. We then declare a bytes variable called msg (hence db, for declare
bytes), and add a newline character (0xa) at the end.
We then get the length of the msg string in bytes in a pretty clever way: the $
operator evaluates to the current position in the assembled output, which at this
point is the first byte after the end of msg; so by subtracting the msg pointer
(which points at the start of the msg string) from $ we get the actual length of
the string.
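Here is a minimal sketch of a .data section matching this description, assuming NASM syntax. The name len is my own assumption (the text only names msg), and the string contents are taken from the debugger output later in the post:

```nasm
section .data

; The bytes of the string, followed by a newline character (0xa).
msg:    db      "Hello, World!", 0xa

; $ is the current assembly position, which here is the first byte
; after the string, so $ - msg is the string's length in bytes (14).
; len is a hypothetical name; equ makes it an assemble-time constant
; rather than a value stored in memory.
len:    equ     $ - msg
```

Because len is defined with equ, an instruction like mov edx, len loads the number 14 directly as an immediate instead of reading it from the .data section.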
Let's now set up the actual program code:
[Listing: hello.asm, lines 10-27]
As before, we start by declaring the section, then we make our entry point (_start)
global, which means that it will be externally accessible; we need this in
order to be able to execute our program.
Right after that, I'm setting up a _syscall label; this behaves kind of like a small
function, in the sense that every time I call it it will perform a syscall (that's what
int 0x80 does) and it will then return to the normal flow of the program. It basically
lets me replace int 0x80 by call _syscall, which makes a bit more sense to the
programmer reading this code and requires a bit less brainwork to figure out what it does.
We then head right into _start which is, as we said, the entry point of our program
(much like a main() function). We set up the registers according to the syscall's
specifications, and we call the kernel.
The last bit of code makes sure the program exits properly, by making a syscall with the sys_exit opcode. The return value of the program is the value stored in EBX (that's why we put 0x0 in there before making the syscall, as 0 means that the program was executed successfully). Let's assemble and link it, and see if it works!
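Put together, this first attempt might look like the sketch below. It is reconstructed from the description above and from the disassembly shown in the next section, so treat it as an approximation rather than the exact listing. The name len is assumed, and the arguments are passed in registers exactly as described so far:

```nasm
; First attempt: syscall arguments passed in registers.
section .data
msg:    db      "Hello, World!", 0xa
len:    equ     $ - msg             ; hypothetical name for the length

section .text
global _start

_syscall:
    int     0x80                    ; trap into the kernel
    ret                             ; return to the caller

_start:
    mov     edx, len                ; string length in bytes
    mov     ecx, msg                ; pointer to the string
    mov     ebx, 0x1                ; file descriptor: stdout
    mov     eax, 0x4                ; syscall opcode: write
    call    _syscall

    mov     ebx, 0x0                ; intended exit status: 0
    mov     eax, 0x1                ; syscall opcode: exit
    call    _syscall
```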
Something's wrong
$ make
nasm -f macho -O0 hello.asm
ld -e _start -lSystem -macosx_version_min 10.8 hello.o -o hello
$ ./hello
$
Hmm, that's weird... nothing's getting printed to the screen. Let's check the value that the program returned:
$ echo $?
81
That's even weirder; it was supposed to return 0. Let's examine the program with lldb,
which is the equivalent of gdb for OSX:
$ lldb ./hello
(lldb) target create "./hello"
Current executable set to './hello' (i386).
(lldb) b start
Breakpoint 1: 2 locations.
(lldb) r
Process 23319 launched: '/Users/kokkonisd/code/x86nasm/hello' (i386)
Process 23319 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
frame #0: 0x00001fc9 hello`start
hello`start:
-> 0x1fc9 <+0>: movl $0xe, %edx
0x1fce <+5>: movl $0x2000, %ecx ; imm = 0x2000
0x1fd3 <+10>: movl $0x1, %ebx
0x1fd8 <+15>: movl $0x4, %eax
Target 0: (hello) stopped.
This might look weird, but all I've done is set a breakpoint in start, which will make the debugger stop the program when it gets there; that will allow us to take a good look at the register values to see why we're not printing anything to the screen.
As you can see, there's a small arrow (->) pointing to the line that's going to be run
next; here, it's the initialization of the EDX register. It's about to be set to
0xe, which is good, because 0xe in decimal is 14 (which is precisely the length of
our string in bytes). Let's advance a little bit by hitting n, and stop right before
the syscall:
(lldb) n
...
(lldb) n
Process 23319 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step over
frame #0: 0x00001fdd hello`start + 20
hello`start:
-> 0x1fdd <+20>: calll 0x1fc6 ; syscall
0x1fe2 <+25>: movl $0x0, %ebx
0x1fe7 <+30>: movl $0x1, %eax
0x1fec <+35>: calll 0x1fc6 ; syscall
Target 0: (hello) stopped.
We've now stopped right before the first syscall, which means that all the registers should contain the right values. Let's print them:
(lldb) p $edx
(unsigned int) $0 = 14
(lldb) p (char *) $ecx
(char *) $1 = 0x00002000 "Hello, World!\n"
(lldb) p $ebx
(unsigned int) $2 = 1
(lldb) p $eax
(unsigned int) $3 = 4
As you can see, everything seems to be normal. All the registers are initialized with
the correct values, so... why aren't we seeing anything in stdout?
Why it doesn't work
As it turns out, 32-bit OSX follows the BSD system call convention, which does not
pass arguments in registers; the only register the kernel reads is EAX, for the system
call number. This means that our other register values get completely ignored by the
kernel, and thus it doesn't print anything; this is also the reason why the return value
has nothing to do with what we put in EBX. What I'd found out about how the
sys_write syscall works was right, but only on Linux. So, how can we make it work on
OSX?
The solution
On OSX, the kernel uses values that have been pushed on the stack to execute a syscall; so let's do that:
[Listing: hello.asm, lines 1-33]
I've only really changed lines 23-25, 29 and 31, the rest being the same code as before plus some comments to better explain how it works. Let's test it out:
$ make
nasm -f macho -O0 hello.asm
ld -e _start -lSystem -macosx_version_min 10.8 hello.o -o hello
$ ./hello
Hello, World!
$ echo $?
0
As we can see, by pushing the arguments to the stack as 32-bit values (double words, or 4 bytes each), we now have a functioning program that performs both syscalls as intended.
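For reference, here is a minimal sketch of the stack-based version, assuming the same msg and len names as before (len is hypothetical). The add esp, 12 cleanup after the first call is my own assumption about how the stack is restored. In the 32-bit BSD convention the kernel expects one extra 4-byte slot on top of the arguments; here the return address pushed by call _syscall fills that slot:

```nasm
; Second attempt: syscall arguments pushed on the stack (32-bit BSD style).
section .data
msg:    db      "Hello, World!", 0xa
len:    equ     $ - msg             ; hypothetical name for the length

section .text
global _start

_syscall:
    int     0x80                    ; trap into the kernel
    ret                             ; return to the caller

_start:
    ; Arguments are pushed in reverse order, 4 bytes each.
    push    dword len               ; third argument: length in bytes
    push    dword msg               ; second argument: pointer to the string
    push    dword 0x1               ; first argument: file descriptor (stdout)
    mov     eax, 0x4                ; syscall opcode: write
    call    _syscall                ; call pushes the return address (extra slot)
    add     esp, 12                 ; assumed cleanup: drop the three arguments

    push    dword 0x0               ; first argument: exit status 0
    mov     eax, 0x1                ; syscall opcode: exit
    call    _syscall                ; exit does not return
```

Only EAX still carries information in a register; everything else the kernel needs is read from the stack.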
Conclusion
The point of all this is pretty simple: when coding in Assembly, you're coming by definition closer to the hardware of the machine, so the code is starting to get more and more processor-dependent. Furthermore, the code also depends on the OS; as the nice people that commented on my Stack Overflow question pointed out, you can't even be sure that a syscall will have the same opcode on Linux and OSX. Even if they're both UNIX-based, their kernels still differ and will thus need different code at some point.
Sources
Here are some sources that helped me put all of this together:
- https://asmtutor.com/#
- http://www.cs.virginia.edu/~evans/cs216/guides/x86.html
- https://stackoverflow.com/questions/47494744/how-does-work-in-nasm-exactly
- http://zathras.de/angelweb/blog-learn-nasm-on-os-x.htm
- http://web.archive.org/web/20070830122645/http://untimedcode.com/2007/5/20/learn-nasm-assembly-on-mac-os-x
- https://stackoverflow.com/questions/57142095/cant-mov-directly-to-registers-x86-assembly-on-osx?noredirect=1#comment100800395_57142095