Conditional Execution
One of the unusual things about the ARM architecture is that a lot of instructions can be suffixed with a condition code and only execute if the condition is met. This allows much denser code than if you use conditional jumps. For example, you can write code like this:
    mov   r1, #42
    teq   r0, #0
    ldrne r1, [r0]
This is equivalent to the C code:
    r1 = 42;
    if (r0 != 0) {
        r1 = *r0;
    }
It stores the immediate value 42 in register 1, then tests if register 0 contains the value 0. If it doesn't, then it loads the value from the address in register 0 into register 1.
A naive translation of the C code would use a branch, something like this:
    mov r1, #42
    teq r0, #0
    beq after_if
    ldr r1, [r0]
after_if:
This does the same thing, but by jumping over the load instruction. The disadvantage of this is that the processor has to do branch prediction. The first version, using conditional execution, allows the processor to always execute the instruction, but simply not retire the result if it ends up doing something that it shouldn't have done. As long as r1 isn't used immediately afterwards, this can be very fast.
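To experiment with this yourself, the snippet can be wrapped in a small C function and the compiler's assembly output inspected. The following is only a sketch: the names load_or_default and load_or_default_selfcheck are made up for illustration, and whether a given compiler emits the conditional load or a branch depends on the compiler, the optimization level, and the target instruction set. The comments show one possible mapping to the instructions above, not guaranteed output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 42 when p is NULL, otherwise the value
 * that p points to. The comments show how each line could map onto the
 * conditional-execution version; a real compiler may choose differently. */
int load_or_default(const int *p)
{
    int result = 42;     /* mov   r1, #42  */
    if (p != NULL) {     /* teq   r0, #0   */
        result = *p;     /* ldrne r1, [r0] */
    }
    return result;
}

/* Exercises both paths: the skipped load and the executed load. */
void load_or_default_selfcheck(void)
{
    int value = 7;
    assert(load_or_default(NULL) == 42);
    assert(load_or_default(&value) == 7);
}
```

Compiling this for an ARM target with optimization enabled and reading the generated assembly is a quick way to see which of the two forms your compiler prefers.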
Assembly or C?
Hopefully this article has given you a brief overview of ARM assembly, enough that you can just grab the ARM instruction set quick reference card and play with ARM assembly. This still leaves the question of whether you should. In general, if you can implement something in C, rather than assembly, then you should, at least at first.
The C version is useful for three reasons. The first is portability. The second is as a benchmark: If your assembly version isn't faster, then there is no point in it. The third is as a reference; a C implementation is a fairly low-level specification of what the assembly version should be doing. You can refer to it while implementing the assembly code. Once you have C and assembly versions, you can compare their performance, and use the C one if it turns out that your compiler is good enough.
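The reference and benchmark roles can be sketched as a small harness. Everything here is an assumption made for illustration: the function being optimized (a sum over an array), the names sum_reference, sum_candidate, matches_reference, and seconds_for, and the use of clock() for timing. In a real project, sum_candidate would be the routine implemented in assembly; here a hand-unrolled C loop stands in for it so that the example is self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef uint32_t (*sum_fn)(const uint32_t *data, size_t n);

/* Reference implementation in portable C: the specification. */
uint32_t sum_reference(const uint32_t *data, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}

/* Stand-in for the hand-written assembly version (an assumption:
 * here it is just the same loop unrolled by two). */
uint32_t sum_candidate(const uint32_t *data, size_t n)
{
    uint32_t a = 0, b = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        a += data[i];
        b += data[i + 1];
    }
    if (i < n) {
        a += data[i];    /* odd element left over */
    }
    return a + b;
}

/* Returns 1 if the candidate agrees with the reference on every
 * prefix of the input (lengths 0 through n), otherwise 0. */
int matches_reference(sum_fn candidate, const uint32_t *data, size_t n)
{
    for (size_t len = 0; len <= n; len++) {
        if (candidate(data, len) != sum_reference(data, len)) {
            return 0;
        }
    }
    return 1;
}

/* Rough CPU time, in seconds, for `repeats` calls of fn. */
double seconds_for(sum_fn fn, const uint32_t *data, size_t n, int repeats)
{
    volatile uint32_t sink = 0;   /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (int r = 0; r < repeats; r++) {
        sink += fn(data, n);
    }
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Checking matches_reference first and only then comparing seconds_for on both functions follows the order suggested above: the assembly version has to be correct before its speed matters, and if it is not faster, the C version wins.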
Of course, there are some situations where C is not expressive enough to do what you want. An atomic add was only possible with compiler extensions before C11, and the example from earlier that is permitted to fail has to be written with a compare and exchange, so the C version is almost as complex as the assembly one.
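For comparison, here is roughly what the two operations could look like with the C11 <stdatomic.h> interface. This is a sketch rather than a translation of the earlier assembly: the names counter_add and counter_try_add are invented for this example, and an unsigned counter is assumed so that overflow in the plain addition is well defined.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Atomic add that always succeeds; returns the value held before
 * the addition. */
unsigned counter_add(atomic_uint *target, unsigned delta)
{
    return atomic_fetch_add(target, delta);
}

/* Atomic add that is permitted to fail: it makes one attempt and
 * returns false if another thread changed the value in between, or
 * if the weak compare and exchange failed spuriously. The caller
 * decides whether to retry. */
bool counter_try_add(atomic_uint *target, unsigned delta)
{
    unsigned expected = atomic_load(target);
    return atomic_compare_exchange_weak(target, &expected, expected + delta);
}
```

A caller that must eventually succeed would loop on counter_try_add until it returns true, which is essentially what atomic_fetch_add does for you.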
Before you write assembly code, always think, “Do I really need to do this?” If you do, then go ahead, and good luck!