Debunking the Myth of High-level Languages
- What Is a High-Level Language?
- Portable Abstractions
- Virtual Machines and Other Overhead
- The Wrong Abstraction
- The Process of Programming
The closer to the metal you can get while programming, the faster your program will run, or so conventional wisdom would have you believe. In this article, I will show you that high-level languages like Java aren’t slow by nature, and that low-level languages may in fact compile to less efficient code.
What Is a ‘High-Level’ Language?
A computer language is a way of representing a program that can be translated into something that a computer can execute. A language is described as low-level if it’s close to the instructions executed by the hardware.
The lowest-level language that can be used is raw machine code: a string of numbers representing instructions and operands understood by the CPU. Note that in most modern microprocessors this is still one layer of abstraction away from the "real" instructions. A modern x86 CPU, for example, will split each of these instructions into one or more micro-operations (μOps) and then execute the μOps individually.
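To make "a string of numbers" concrete, here is a minimal sketch of an interpreter for a made-up accumulator machine. The opcodes (1 for load, 2 for add, 3 for halt) and the `run` function are assumptions invented for this illustration; they do not correspond to any real CPU.

```python
# Hypothetical opcodes for a toy accumulator machine (not a real CPU).
LOAD, ADD, HALT = 1, 2, 3

def run(program):
    """Execute a flat list of numbers as opcodes followed by operands."""
    acc = 0  # the machine's single accumulator register
    pc = 0   # program counter: index of the next number to fetch
    while True:
        opcode = program[pc]
        if opcode == LOAD:
            acc = program[pc + 1]   # operand replaces the accumulator
            pc += 2
        elif opcode == ADD:
            acc += program[pc + 1]  # operand is added to the accumulator
            pc += 2
        elif opcode == HALT:
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode} at address {pc}")

# The "program" is nothing but numbers: load 40, add 2, halt.
machine_code = [1, 40, 2, 2, 3]
print(run(machine_code))  # prints 42
```

Nothing in the list `[1, 40, 2, 2, 3]` says which numbers are instructions and which are operands; only the machine's decoding rules give them meaning, which is what makes raw machine code so hard to read and write by hand.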
The next layer up consists of assembly languages, which are semantically equivalent to raw machine code: one assembly language statement translates directly to one machine instruction. Assembly languages are slightly easier to read than machine code, because they substitute mnemonics for numbers. They often have some syntactic sugar, such as the ability to define macros (code segments that are reused frequently) and insert them by name. They also let you define jump targets symbolically, rather than having to change an address everywhere in your program when you insert an instruction before a jump.
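The mnemonic substitution and symbolic jump targets described above can be sketched with a tiny two-pass assembler. The instruction set (`LOAD`, `ADD`, `HALT`, `JMP`), its opcode numbers, and the `assemble` function are all hypothetical; real assemblers handle far more, but the principle is similar.

```python
# Hypothetical mnemonics for a toy machine (not a real CPU).
# Each mnemonic maps to (opcode, number of operands).
OPCODES = {"LOAD": (1, 1), "ADD": (2, 1), "HALT": (3, 0), "JMP": (4, 1)}

def assemble(source):
    """Translate toy assembly text into a flat list of numbers."""
    # Strip ';' comments and blank lines.
    lines = [ln.split(";")[0].strip() for ln in source.splitlines()]
    lines = [ln for ln in lines if ln]

    # Pass 1: record the address that each label refers to.
    labels, address = {}, 0
    for ln in lines:
        if ln.endswith(":"):
            labels[ln[:-1]] = address
        else:
            mnemonic = ln.split()[0]
            address += 1 + OPCODES[mnemonic][1]

    # Pass 2: emit one opcode (plus its operands) per statement.
    code = []
    for ln in lines:
        if ln.endswith(":"):
            continue  # labels produce no machine code of their own
        mnemonic, *operands = ln.split()
        opcode, arity = OPCODES[mnemonic]
        if len(operands) != arity:
            raise ValueError(f"{mnemonic} expects {arity} operand(s)")
        code.append(opcode)
        for op in operands:
            # A symbolic operand is replaced by the label's address.
            code.append(labels[op] if op in labels else int(op))
    return code

program = """
    LOAD 40
    JMP done      ; jump target given by name, not by address
    ADD 100       ; skipped at run time
done:
    ADD 2
    HALT
"""
print(assemble(program))  # prints [1, 40, 4, 6, 2, 100, 2, 2, 3]
```

Here `done` resolves to address 6. If you insert another instruction before the jump, the first pass recomputes that address, so the `JMP done` line itself never needs editing.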
Moving slightly further up, we get to languages like C. Early versions of UNIX were written in assembly language, but this proved to be a hindrance when porting the system to new platforms, because assembly languages are machine-specific. Existing high-level languages, such as LISP, provided too much abstraction for implementing an operating system, so a new language was created. This language, C, was a very slightly abstracted form of PDP-11 assembly language. There is almost a 1:1 mapping between C semantics and PDP-11 machine code, making it very easy to compile C for the PDP-11 (the target machine of UNIX at the time).
For a long time, LISP was the archetypal high-level language. It provides a very flexible syntax in which many complex design patterns can be represented in a reusable fashion. LISP is generally categorized as a functional language, but this isn’t entirely accurate (although it does support functional programming).
The definition of a high-level language is a moving target. Languages that were considered high-level when I learned to program are now considered low-level. In general, a programming language provides a midway point between how you think about a program and how a computer executes the program. Languages that are closer to you than to the computer are considered high-level, while others are considered low-level.
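As a rough illustration of that midway point, the sketch below writes the same computation twice in one language. The function names are made up for this example. The first version spells out the index and loop bookkeeping that a machine would perform; the second states only the intent.

```python
def total_low_level(values):
    """Sum a list by managing the index and running total by hand."""
    total = 0
    i = 0
    while i < len(values):
        total = total + values[i]
        i = i + 1
    return total

def total_high_level(values):
    """Sum a list by stating the intent and leaving the steps to the language."""
    return sum(values)

data = [3, 1, 4, 1, 5]
print(total_low_level(data), total_high_level(data))  # prints 14 14
```

Both functions return the same result; they differ only in how much of the machine's step-by-step view the programmer has to carry around.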