Master x86 assembly language from the Linux point of view! Assembly language lies at the boundary between hardware and software. While it can be described purely in terms of how it controls hardware, many of its features only make sense in the context of operating systems and programming languages. In Linux Assembly Language Programming, Bob Neveln explains all the key features of x86 assembly language in the context of the Linux operating system and the C language. The book's step-by-step, one-concept-at-a-time coverage is designed to help experienced hardware programmers move to Linux and learn how to create Linux device drivers. As developers learn new x86 assembly language skills, they also gain "under the hood" insight into how Linux works and into the way processor and software design affect each other. The book is for C programmers who want to understand more about the interactions between Linux and hardware, and for assembler programmers who want to apply their skills in the Linux environment.
1. Introduction.
The Fetch-Execute Cycle. The Linux Operating System. The Gnu C Compiler. The Edlinas Assembler. NASM. Other Assemblers.
The Decimal and Pentimal Systems. Pentimal Arithmetic. Conversion to Pentimal. The Binary System. Memory as a Rectangle of Bits. The Hexadecimal System. Base Distinguishing Notations. Fractions in Other Bases. Converting Fractions.
The NOT Gate. Boolean Operators. Logic Gates. Addition Circuits. Sequential Circuits. Negative Number Representation. Subtraction Using Negation. Placeholding Two's Complement. Memory Circuits. x86 General Registers and their Ancestry. The MOV Command. Addition and Subtraction Commands. Multiplication and Division Commands.
The Four Field Format. Computers from the CPU Standpoint. Simple Assembly Language Programs. Assembler Programs with Jumps. Assembler Programs with Loops. Signed Comparisons. Unsigned Comparisons. Linux .s files.
Assembling Simple Programs. Opcode Space. The ModRM Byte. 386 Space (0F + ...). 32-Bit vs 16-Bit Code. The 8-Bit Registers. Linux .o Files.
4-Byte Data Width. Addresses in Brackets. Operand Size Ambiguity. Labels. Immediate Storage.
Push and Pop Operations. Subprograms. Parameter Passing. Recursion.
Multitasking. Paging. Address Translation. Program Segments. Other Data Segments. Protection. Executable Files in ELF Format. Object Files in ELF Format.
Polling. External Interrupts. ISA Architecture. Internal and Software Interrupts. System Calls. Privilege Levels. Control Transfer. Scheduling.
Bitwise Logic Operations. The AND, OR, NOT, and XOR Commands. Bit Setting and Testing. Shift Instructions.
Device-Independent Files. Devices as Files. Morse Code Speaker Driver. Serial Port Digitizer Driver.
Real Mode Segmentation. Edlinas Environment Variables. Fixed Memory Areas. Real Mode Interrupts. Checking DOS Memory.
Changing to Protected Mode. Protected Mode Segmentation. Setting Up the Global Descriptor Table. Closing.
Assembly language is a language that gives the programmer direct control over the computer. That is what appeals to people about assembly language. It is like using a stick shift. Programming with other languages, high-level languages, is like using an automatic.
Many people who use computers simply run programs. To them a program is a canned software package. People who like to write programs like to be able to shape the behavior of the machine the way metalsmiths shape metal into useful mechanical tools. Amongst all the programs on a computer, there is one program which runs the machine: the operating system. It controls everything. It offers "services" to the other programs. Most operating systems force programmers to leave their programming skills behind as they approach the operating system and to use it as they would a canned software package. That is because its source code is a secret. Linux portends the end of secret code in computing. Because the Linux source code and a compiler for it are right there on the computer along with the other source code, it allows programmers to work with the operating system as they do with programs they have written.
Operating systems were once written by programmers employed by computer manufacturers. Revolutions in hardware produced corresponding revolutions in the software. When Linus Torvalds rewrote Linux so that it would run on the Alpha architecture, his goal was not to increase its hardware base from one platform to two, but to make Linux platform-independent. The subsequent ports of Linux, to everything from a Sparc to a PowerPC, demonstrate the success of his rewrite.
The chief value of this platform independence is that it gives us confidence that Linux is here to stay. We don't have to fear a PowerPC revolution coming along and forcing us to dump all of our old software.
Assembly code, on the other hand, is intrinsically platform-dependent and is justifiably regarded with caution for just this reason. It will have to be redone when the next hardware revolution takes place. Furthermore, people who compare the machine language of the 386 with other machine languages, both real and ideal, inevitably end up regarding the 386 language as a historical accident. On the other hand, the genetic code is sometimes referred to as a frozen accident. The term is based on the idea that the genetic code ceased its evolution when the number of proteins whose code would be "broken" by a mutation in the genetic code became so large that such mutations became lethal, and so the code became fixed. It remains to be seen whether 386 machine code has been "frozen" into place by the size of its software base. The threat of a PowerPC revolution has passed. On the other hand, many Linux enthusiasts anticipate an Alpha revolution.
But the Alpha revolution has not happened and it may not happen. The 386 language has been around for a long time. With many RISC machines now emulating the 386 architecture, isn't it time to consider programming in 386 assembly language? Assembly language is more work but it has its advantages. A very nice feature of assembly language code, which it shares with Linux itself incidentally, is that from a crass performance standpoint, it functions beautifully. Relying on compilers to produce good code is usually justifiable as a time saving measure.
But to get the best possible code, there is still no better option than to use assembly language. When high-level languages were still a novelty and referred to as automatic programming, many programmers were greatly offended by them. They were convinced that no compiler program could write code as well as they could. They were right of course. Compilers produce cheaper code but not better code. To get the full measure of speed and grace that a machine is capable of, there is no substitute for assembly language.
Furthermore, even if the Alpha revolution arrives on schedule tomorrow, there will remain in the world millions of processors running a 386 language that work beautifully and need to be put to a socially responsible use.
Computers can be programmed to report on our buying habits or to send off nuclear missiles. But they can also be programmed to communicate with privacy or to support medical research. As siliconsmiths, our job is to shape the behavior of the machine towards a human agenda.
This book assumes that the reader has some knowledge of C, but it makes no other assumptions.
Starred sections of the book are not needed subsequently and may be skipped when they are not of intrinsic interest.
Many errors undoubtedly remain. To those readers who notify me of them at neveln@cs.widener.edu I shall be grateful.