Stack Management for ARM
The other alternative, as I hinted earlier, is to write the assembly as a stand-alone function. This same example can be written like this:
    .syntax unified
    .global atomic_add_or_fail
    .type atomic_add_or_fail, %function
atomic_add_or_fail:
    ldrex r2, [r0]
    add r2, r2, r1
    strex r1, r2, [r0]
    eor r0, r1, #1
    bx lr
This version uses different registers because the arguments are passed in r0 and r1. r2 and r3 are safe to clobber because they, along with the other argument registers, are caller-save. It also needs some directives at the top. The first, .syntax unified, will probably go in any ARM assembly that you write; it tells the assembler that you are using the new syntax, in which both ARM and Thumb-2 instruction encodings use the same mnemonics.
This version needs two more instructions or, at least, two more explicit instructions. The first, eor, is an exclusive-or, which we use with a literal 1 to perform a logical negation of the status value that strex leaves in r1. This was done in C in the other version.
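To see why an exclusive-or with 1 works as a negation here, it may help to write the same step in C. This is only a sketch: the helper name strex_status_to_success is made up for illustration, and it assumes the status value is always 0 (the store succeeded) or 1 (the store failed), which is what strex produces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the original example: models the
 * final `eor r0, r1, #1` step. strex writes 0 to its status register
 * when the store succeeds and 1 when it fails. */
uint32_t strex_status_to_success(uint32_t strex_status)
{
    /* Assumption: the status is a single bit, as strex guarantees. */
    assert(strex_status <= 1u);

    /* Flipping the low bit turns 0 into 1 (success) and 1 into 0 (failure). */
    return strex_status ^ 1u;
}
```

The XOR trick is only a logical negation because the input is restricted to 0 and 1; for arbitrary values you would need a comparison instead.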
The other instruction, bx, is a branch-and-exchange instruction. This is a peculiarity of ARM, which you won't find on most other RISCy architectures. In this version, you could probably replace it with:
mov pc, lr
This would move the value in the link register to the program counter. Both of these are general-purpose registers in 32-bit ARM, so this works as a return instruction: the bl (branch and link) and blx (branch, link, and exchange) instructions store the return address in the link register while doing the branch.
The exchange versions of these can switch the mode of the decoder. ARM supports two instruction sets: ARM and Thumb. In ARM mode, all instructions are 32 bits. In Thumb mode, they are either all 16 bits (for Thumb-1) or a mixture of 16 and 32 bits (for Thumb-2). ARM instructions are all at least two-byte aligned, so the low bit in a function pointer (or, in fact, any branch destination) is free to indicate whether the target is ARM or Thumb code. The bx and blx instructions jump to the address and switch to the correct mode.
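The low-bit trick is easy to model in C. The following is a rough sketch of what bx or blx does with a register operand, not a description of any real API: the struct and function names are invented for this example, and the model only covers the mode selection, not the link-register update that blx also performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a decoded branch destination. */
struct branch_target {
    uint32_t address;  /* where execution continues, low bit cleared */
    bool     thumb;    /* true: decode as Thumb, false: decode as ARM */
};

/* Models how bx/blx <register> interpret the value in the register. */
struct branch_target decode_interworking_target(uint32_t value)
{
    struct branch_target t;

    t.thumb   = (value & 1u) != 0;  /* low bit selects the instruction set */
    t.address = value & ~1u;        /* the bit is not part of the address */

    /* ARM-mode targets must be word aligned; the architecture leaves a
     * target with bit 0 clear and bit 1 set unpredictable. */
    assert(t.thumb || (value & 2u) == 0);

    return t;
}
```

For example, a function pointer holding 0x8001 refers to Thumb code at address 0x8000, while 0x8000 refers to ARM code at the same address.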
To call a function, you will typically put the first four arguments in r0-r3 and any others on the stack, and then use blx to jump to the address.
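As a rough illustration of that rule, here is a small C sketch that reports where a given argument would end up. It assumes every argument is a single 32-bit word (so it ignores 64-bit and floating-point arguments, which have extra rules), and the names arg_location and locate_word_argument are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of where one argument is passed. */
struct arg_location {
    bool in_register;  /* true: passed in one of r0-r3 */
    int  index;        /* register number, or byte offset from sp at the call */
};

/* Assumption: all arguments are one 32-bit word each. */
struct arg_location locate_word_argument(int arg_number)
{
    struct arg_location loc;

    assert(arg_number >= 0);

    if (arg_number < 4) {
        loc.in_register = true;
        loc.index = arg_number;            /* r0, r1, r2, r3 */
    } else {
        loc.in_register = false;
        loc.index = (arg_number - 4) * 4;  /* [sp, #0], [sp, #4], ... */
    }
    return loc;
}
```

So for a six-argument function, the fifth argument (number 4, counting from zero) sits at the address in sp when the call is made, and the sixth sits four bytes above it.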
Inside a function, you may want to save some callee-save registers so you can use them. ARM provides a very convenient pair of instructions for doing this: load and store multiple. These are actually very powerful. They allow a subset of registers to be stored at an offset from a base register and can either increment or decrement that base register by an amount indicating the stored space. These instructions have the mnemonics stm and ldm, followed by a two-character code indicating whether the base register is incremented or decremented and whether this happens before or after each store/load. Fortunately, for manipulating the stack pointer, the assembler understands push and pop as aliases for these:
    push {r4-r6}
    .save {r4-r6}
    ...
    pop {r4-r6}
What is that .save directive all about? It's used for generating unwind tables. You should use .fnstart and .fnend at the start and end of your functions if they call any other functions, along with the .save directive when you save registers to the stack. These allow the unwind library to unwind through your function, which allows things like exception propagation and stack trace generation.
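Coming back to push and pop themselves: on the stack pointer they behave like a decrement-before store multiple and an increment-after load multiple. The C sketch below models that on a toy machine to show the resulting memory layout. Everything in it (the toy_cpu struct, the function names, the 16-word stack) is invented for illustration, and it counts the stack pointer in words rather than bytes to keep the arithmetic simple.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical toy machine: 16 registers and a small descending stack. */
struct toy_cpu {
    uint32_t r[16];
    uint32_t stack[STACK_WORDS];
    uint32_t sp;  /* index into stack[], in words; STACK_WORDS means empty */
};

/* Models push {list}: bit i of mask selects register r<i>. */
void toy_push(struct toy_cpu *cpu, uint16_t mask)
{
    /* Walk from the highest register down so that the lowest-numbered
     * register ends up at the lowest address. */
    for (int i = 15; i >= 0; i--) {
        if (mask & (1u << i)) {
            assert(cpu->sp > 0);
            cpu->sp--;                        /* decrement before storing */
            cpu->stack[cpu->sp] = cpu->r[i];
        }
    }
}

/* Models pop {list}: reloads the registers in ascending order. */
void toy_pop(struct toy_cpu *cpu, uint16_t mask)
{
    for (int i = 0; i <= 15; i++) {
        if (mask & (1u << i)) {
            assert(cpu->sp < STACK_WORDS);
            cpu->r[i] = cpu->stack[cpu->sp];
            cpu->sp++;                        /* increment after loading */
        }
    }
}
```

With a mask of 0x0070 (bits 4, 5, and 6), toy_push stores r4-r6 and moves sp down by three words, and the matching toy_pop restores both the registers and sp. The order of registers inside the braces does not matter in the real instructions either: the lowest-numbered register always goes to the lowest address.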