Understanding the Types of Variables
The first step, before any calculations or plotting of data, is to decide what type of variable or variables you're working with. There are a number of typologies, but one that has proven useful is provided in the following table. The basic distinction is between quantitative variables (for which you ask "how much?") and categorical variables (for which you ask "what type?").
Quantitative variables can be continuous or discrete. In theory, continuous variables such as weight can take any value within a given range. Discrete variables take only distinct, separate values (typically whole numbers) that can be counted or observed directly. Examples of discrete variables include the number of employees in a department within the company, the number of boxes of batteries sold, and the number of different models of a product.
Categorical variables are either nominal (unordered) or ordinal (ordered). You learned about this difference when you read about data types and variables in Chapter 2, "Understanding and Organizing Business Data." Some simple examples of nominal variables are male/female, alive/dead, regional area, and product style.
For nominal variables with more than two categories, the order does not matter. For example, one cannot say that people in the Western regional sales group rank above or below those in the Southern regional sales group. There is no natural ranking among named regions. However, sometimes people provide ordered responses, such as a grade of product quality, or a choice among agree, neither agree nor disagree, and disagree in response to some statement. In these cases the order does matter, and it usually is important to account for it.
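The nominal/ordinal distinction can be sketched in Python. This is only an illustration: the class names `Agreement` and `Region` and the sample data are hypothetical, and the numeric codes on the ordinal scale carry nothing beyond the ordering.

```python
from collections import Counter
from enum import Enum, IntEnum
from statistics import median_low


class Agreement(IntEnum):
    """Ordinal: the numeric codes carry only the ordering."""
    DISAGREE = 1
    NEUTRAL = 2
    AGREE = 3


class Region(Enum):
    """Nominal: labels with no natural ranking."""
    WESTERN = "Western"
    EASTERN = "Eastern"
    SOUTHERN = "Southern"


# Hypothetical survey responses and sales-region assignments.
responses = [Agreement.AGREE, Agreement.DISAGREE, Agreement.NEUTRAL,
             Agreement.AGREE, Agreement.NEUTRAL]
regions = [Region.WESTERN, Region.SOUTHERN, Region.WESTERN, Region.EASTERN]

# Order matters for ordinal data, so a middle (median) response is meaningful.
middle_response = median_low(responses)

# For nominal data, only counts per category make sense.
region_counts = Counter(regions)

# Asking whether one region is "less than" another is not defined:
# a plain Enum does not support ordering comparisons.
try:
    Region.WESTERN < Region.SOUTHERN
    regions_are_ordered = True
except TypeError:
    regions_are_ordered = False

print(middle_response.name, dict(region_counts), regions_are_ordered)
```

Using `IntEnum` for the ordered scale and a plain `Enum` for the regions is one way to make a program refuse comparisons that have no meaning for nominal data.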
Typology of Variables and Data
| Quantitative Variables: Continuous Data | Quantitative Variables: Discrete Data | Categorical Variables: Ordinal Data (ordered categories) | Categorical Variables: Nominal Data (unordered categories) |
|---|---|---|---|
| Height, weight, age | Number of batteries sold | Product quality | Gender (male/female) |
| Salary from $1 to infinity | Number of product defects | Better, same, worse | Styles of Jaguar cars (XJS, S Type, XJ8) |
| | | Disagree, neutral, agree | Sales region (Western, Eastern, Southern, Northern, Midwestern) |
Variables shown at the left of the preceding table can be converted to those farther to the right by using cutoff points. For example, salary can be turned into an ordinal variable by defining "high salary" as an annual salary of more than $200,000, "moderate salary" as more than $75,000 and less than or equal to $200,000, and "low salary" as less than or equal to $75,000. Similarly, height (continuous) can be converted into short, average, or tall (ordinal).
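The salary example can be sketched in a few lines of Python. The cutoffs come from the text; the function name `salary_band` and the sample salaries are hypothetical, chosen only to show how the boundary values are handled.

```python
from bisect import bisect_left

# Cutoffs taken from the text; labels are listed from lowest to highest band.
CUTOFFS = [75_000, 200_000]
LABELS = ["low salary", "moderate salary", "high salary"]


def salary_band(annual_salary):
    """Return the salary category for an annual salary in dollars."""
    # bisect_left counts the cutoffs strictly below the salary, so a salary
    # exactly equal to a cutoff stays in the lower band
    # ("less than or equal to").
    return LABELS[bisect_left(CUTOFFS, annual_salary)]


# Hypothetical salaries, including both boundary values.
for salary in (50_000, 75_000, 75_001, 200_000, 250_000):
    print(salary, "->", salary_band(salary))
```

Keeping the cutoffs in one list makes it easy to see, and to change, exactly where each category begins and ends.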
In general, categorical variables are easier to summarize, so quantitative variables are often converted to categorical ones for descriptive purposes. However, categorizing a continuous variable reduces the amount of information available. Statistical tests are generally more sensitive (that is, they have more power to detect a real effect) when applied to the continuous variable than to the corresponding categorized one, although more assumptions might have to be made about the data.
Therefore, categorizing data often is useful for summarizing results, but not typically useful for statistical analysis. The choice of appropriate cutoff points can be difficult and different choices can lead to different conclusions about a set of data.
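To see how the choice of cutoff points can change the picture, the following sketch summarizes one set of salaries under two different cutoff schemes. The salaries, the second set of cutoffs, and the helper `band_counts` are all hypothetical and serve only as an illustration.

```python
from bisect import bisect_left
from collections import Counter

LABELS = ["low", "moderate", "high"]


def band_counts(salaries, cutoffs):
    """Count how many salaries fall into each band for the given cutoffs."""
    # A salary equal to a cutoff is placed in the lower band.
    counts = Counter(LABELS[bisect_left(cutoffs, s)] for s in salaries)
    return {label: counts[label] for label in LABELS}


# Hypothetical annual salaries in dollars.
salaries = [40_000, 70_000, 80_000, 90_000, 150_000, 210_000]

# Scheme A uses the cutoffs from the text; scheme B raises the lower cutoff.
counts_a = band_counts(salaries, [75_000, 200_000])
counts_b = band_counts(salaries, [100_000, 200_000])

print(counts_a)  # {'low': 2, 'moderate': 3, 'high': 1}
print(counts_b)  # {'low': 4, 'moderate': 1, 'high': 1}
```

With the first scheme most of these employees appear moderately paid; with the second, most appear low paid, even though the underlying salaries are identical.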
The definitions of types of data and variables in this section are not unique, nor are they mutually exclusive; they are provided to help you create or read a report that uses statistics, and to decide how to display and analyze the data. You should never debate too long about the typology of a particular variable in your analysis!