The Web Bestiary
This section contains a lot of acronyms and definitions. Much of the descriptive material is taken from Wikipedia. In a very real sense, Wikipedia represents the current usage and understanding of these terms by the Web community. I've listed them in roughly decreasing order of importance, that is, by how likely they are to come up in casual conversation. This list is by no means complete.
- HTML (HyperText Markup Language) The predominant markup language for web pages. It provides a means to create structured documents using semantic tags for such things as headings, paragraphs, lists, links, quotes, and other items. It lets you embed images and other media objects and can be used to create interactive forms.
- CSS (Cascading Style Sheets) The language for describing the presentation (that is, the formatting and layout) of an HTML document. CSS is designed to enable the separation of document content from the details of how it should be presented, including the typography, positioning, colors, and margins. This separation improves content accessibility and provides more flexibility in controlling presentation characteristics.
- JavaScript An object-oriented scripting language. Although JavaScript has other uses, we are concerned here with client-side JavaScript: the version that runs inside a user's browser and manipulates HTML page elements. JavaScript code can be embedded within the HTML elements of a web page or imported from a separate file. Not all web pages have JavaScript components, and users can turn off their browser's JavaScript engine if they want to. Robots generally ignore JavaScript code as they examine web pages.
- HTTP (HyperText Transfer Protocol) The set of rules governing how user agents (web browsers and the like) send requests to a web server and how the web server responds to each request. The web server returns a status code along with data, or just the status code when something goes wrong. The familiar 404 error code is returned when the web server cannot find what you are looking for. There are two primary HTTP request methods. A GET request is typically sent by your browser when you click a link with the intention of going to another web page. A POST request is typically sent when you click a form's submit button, essentially asking that the web server do something with your input.
- CGI (Common Gateway Interface) A protocol for dynamically generating web pages in response to a GET request or form submission. The term is typically used as an adjective to indicate a server-side process, as in "CGI script." CGI programs are typically written in a scripting language such as Perl, Ruby, or Python, although compiled languages such as C and Visual Basic can also be used. Many websites are entirely driven by CGI processes, although the relative number of such sites has probably been declining as newer technologies, such as AJAX and PHP, have become popular.
- AJAX (Asynchronous JavaScript and XML) The most recent versions of JavaScript and other client-side scripting languages contain features that a developer can use to create web pages that can make independent HTTP requests to the server while the page is loading or anytime thereafter. AJAX is the set of techniques used to create web pages with elements that can be independently updated with new content in response to a user's mouse click or some other event without having to reload the entire page. This is how many widgets work.
- XML (eXtensible Markup Language) A set of rules for marking up documents that emphasizes generality and global usability. It is widely used to transmit arbitrarily structured data in mixed client/server environments. XML and HTML are compatible members of a family of markup languages called Standard Generalized Markup Language (SGML). HTML is an SGML language with a specific Document Object Model (DOM) focused on describing hypertext documents. The two technologies are combined in the XHTML specification.
- JSON (JavaScript Object Notation) Although based on JavaScript, JSON is a language-independent system for representing data objects. It is simpler than XML and is often used as an alternative to XML in AJAX applications to transfer data objects between a server and a script running in a user's browser.
- CMS (Content Management System) An application program or a package of software tools that facilitates the creation of web pages and automates their maintenance using a Web-based interface for authoring, editing, and administration. The term has broader use beyond the Web. For our purposes, it refers to any site or software that generates web pages from content stored in a database and provides a means of creating, editing, and managing that content without requiring knowledge of HTML, CSS, or FTP. A good CMS permits you to directly enter HTML with the content for finer control of web page presentation. Blogs are a form of content management system.
- Flash (Adobe Flash, formerly Macromedia Flash) A popular multimedia platform for adding animation and interactivity to web pages. Flash is commonly used to create animations, advertisements, and various interactive components; to integrate video into web pages; and to develop rich Internet applications. Some websites are done entirely in Flash. However, this is now considered a poor practice, partly because the content of a Flash site is generally inaccessible to robots.
- PHP (PHP: Hypertext Preprocessor) PHP originally stood for Personal Home Page. The PHP Group, the informal organization that currently oversees the development of the language, decided to expand the meaning of PHP a few years ago and gave us the current recursive acronym. PHP is a server-side technology for dynamically generating websites. It is powerful and easy to write but often difficult to read. A PHP file intermixes program logic (PHP statements enclosed in special tags) with HTML markup. When a request is sent to a web server for a file ending with the .php extension, the web server preprocesses the coded file, executes the PHP instructions, and returns an HTML document to the user's browser. Many modern Web applications, such as the popular blogging software WordPress, are written in PHP.
- FTP (File Transfer Protocol) An Internet protocol for transferring files from one computer to another, usually using a stand-alone application. Web browsers and page editors also use FTP to upload and download files. Dozens of FTP clients are available. One of the most popular is FileZilla, a free, open-source program that runs on Windows, Macintosh, and UNIX computers.
- jQuery (JavaScript Query Language) A library of JavaScript functions (often called a framework) that simplifies the development of dynamic, interactive web pages. It provides a language for selecting DOM elements and giving them complex behaviors. jQuery takes care of cross-browser differences in the DOM and facilitates the use of AJAX. In much the same way that CSS does with web page presentation, jQuery encourages the separation of semantic HTML markup from the descriptions of how HTML elements should respond to events. jQuery makes Web programming fun.
- RSS (Really Simple Syndication) An XML protocol for distributing content. Such distributed content from a website is called a feed and provides an alternative means for users to access the content. Users can subscribe to feeds using a number of stand-alone newsreaders or by using the feed-reading facilities incorporated into their browsers and email clients. Feeds from one website can also be embedded into web pages on another site in a syndicated publishing model. RSS is quite popular but evolved in an ad hoc way and is not a recognized standard. A newer feed protocol called Atom is more robust and follows the applicable standards.
- DNS (Domain Name System) A system for assigning names to computers connected to the Internet or a private network. It translates domain names meaningful to humans into the numerical addresses associated with networking equipment for the purpose of locating these devices worldwide. The Domain Name System can be thought of as the "phone book" for the Internet.
- DOM (Document Object Model) A dictionary and grammar for interpreting HTML. A DOM describes HTML elements and their attributes and properties and how they are used to create web pages. DOMs are published in a form that can be read by both humans and machines. Every web browser has at least one DOM, and most modern browsers conform to DOMs published by the W3C. Yet there are still some differences in browser behavior arising from coding bugs, DOM misinterpretation, and edge conditions where browser behavior is not fully defined.
In this book, whenever you encounter the term DOM, it means the W3C's draft specification for HTML5 as interpreted by your favorite browser. Your browser may or may not support this or that new HTML5 element when you experiment with the examples given. The same is true of any particular editing tool or environment you like to use. My aim is to present HTML that works reliably across all modern browsers and is pleasing to all user agents.
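A few of the entries above are easier to grasp with something concrete in hand. The following is a minimal sketch, written in Python purely for illustration, of the two HTTP request methods described under HTTP and of the same small record expressed in both JSON and XML. The address http://example.com/search, the helper names, and the sample record are all assumptions made up for this example. Nothing is actually sent over the network; the code only builds the requests and the encoded text.

```python
import json
import xml.etree.ElementTree as ET
from http import HTTPStatus
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical address, used only for illustration; no request is sent.
BASE_URL = "http://example.com/search"


def build_get(params):
    """Build a GET request: the parameters travel in the URL's query string."""
    return Request(BASE_URL + "?" + urlencode(params))


def build_post(fields):
    """Build a POST request: the form input travels in the request body."""
    return Request(BASE_URL, data=urlencode(fields).encode("ascii"))


def as_json(record):
    """Encode a flat dictionary as JSON text."""
    return json.dumps(record)


def as_xml(record, tag="entry"):
    """Encode the same flat dictionary as XML, one child element per key."""
    root = ET.Element(tag)
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    get_req = build_get({"q": "bestiary"})
    post_req = build_post({"q": "bestiary"})
    print(get_req.get_method(), get_req.full_url)
    print(post_req.get_method(), post_req.data)

    # The status code a server returns when it cannot find a page.
    print(HTTPStatus.NOT_FOUND.value, HTTPStatus.NOT_FOUND.phrase)

    record = {"term": "DNS", "meaning": "Domain Name System"}
    print(as_json(record))
    print(as_xml(record))
```

Comparing the last two lines of output side by side gives a feel for why JSON is often preferred to XML for passing small data objects between a server and a script in the browser: the JSON form carries the same information with less markup.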