- Choosing a Chart Type
- Understanding a Date-Based Axis Versus a Category-Based Axis
- Using a Chart to Communicate Effectively
- Adding an Automatic Trendline to a Chart
- Showing a Trend of Monthly Sales and Year-to-Date Sales
- Understanding the Shortcomings of Stacked Column Charts
- Shortcomings of Showing Many Trends on a Single Chart
- Using a Scatter Plot to Show a Trend
Understanding a Date-Based Axis Versus a Category-Based Axis
Excel offers two types of horizontal axes in a trend chart. Having the proper setting can ensure that your message is accurate.
If the spacing of events along the time axis is uniform, it does not matter whether you choose a date-based axis or a text-based axis. The results will be the same. In this case, it is fine to allow Excel to automatically choose the type of axis.
However, if the spacing of events along the time axis is haphazard, you definitely want to make sure that Excel is using a date-based axis.
Usually, if your data contains dates, Excel defaults to a date-based axis. However, you should explicitly check to make sure that Excel is using the correct type of axis. A number of potential problems force Excel to choose a text-based axis instead of a date-based axis, such as dates that are stored as text in a spreadsheet and dates represented by numeric years. (See the list following Figure 3.7 for other potential problems.)
Figure 3.7 You can explicitly choose an axis type rather than letting Excel choose the default.
To explicitly choose an axis type, you follow these steps:
- Right-click the horizontal axis and choose Format Axis.
- In the Format Axis dialog box that appears, choose the Axis Options category.
- Choose either Text Axis or Date Axis, as appropriate, from the Axis Type section (see Figure 3.7).
A number of complications that require special handling can occur with your date fields. The following are some of the problems you might encounter:
- Dates stored as text—If your dates are stored as text dates instead of real dates, a date-based axis will never work. You have to use date functions to convert the text dates to real dates.
- Dates represented by numeric years—All your trend charts may have category values of 2005, 2006, 2007, and so on. Excel doesn't naturally recognize these as dates, but you can trick it into doing so.
- Dates before 1900—If your company has been around long enough that you are charting historical trends before January 1, 1900, you are sunk. In Excel's world, there are no dates before this time period.
- Dates that are really time—It is not difficult to imagine charts in which the horizontal axis contains periodic times throughout a day. You might want to use such a chart to show the number of people entering a bank. For such a chart, you need a time-based axis, but Excel will group all of the times from a single day into a single point. See "Using a Workaround to Display a Time-Scale Axis" for the rather complex steps needed to plot data by periods smaller than a day.
- Dates that you need to appear as text in order to draw in a decorative element—The case study, "Using a Decorative Element in a Chart," later in this chapter, shows a chart by designer Kyle Fletcher in which the dates are forced to be text.
Each of these situations is discussed in the following sections.
Converting Text Dates to Dates
If your cells contain text that looks like dates, the date-based axis will not work. In Figure 3.8, the data came from a legacy computer system. Each date was imported as text instead of as dates.
Figure 3.8 These dates are really text, as indicated by the apostrophe before the date in the formula bar.
This is a frustrating problem because text dates look exactly like real dates. You may not notice that they are text dates until you see that changing the axis to a date-based axis has no effect on the axis spacing.
If you select a cell that looks like a date cell, look in the formula bar, and see an apostrophe before the date, you know you have text dates (refer to Figure 3.8). This is Excel's arcane code to indicate that a date or number should be stored as text instead of a number.
Figure 3.9 Many groups on the ribbon have this tiny More icon in the lower-right corner. Clicking this icon leads to the legacy dialog box.
Understanding How Excel Stores Dates and Time
On a Windows PC, Excel stores dates as the number of days since January 1, 1900. For a date such as 9/15/2007, Excel actually stores the value 39,340, but it formats the date to show you a value such as 09/15/2007.
On a Mac running Mac OS, Excel stores the dates as the number of days since January 1, 1904. The original designers of the Mac OS were trying to squeeze the OS into 64K of ROM. Since every byte mattered, it seemed unnecessary to add a couple lines of code to handle the fact that 1900 is not a leap year. Excel for the Mac adopted the 1904 convention. Excel for Windows, which needed to be compatible with Lotus 1-2-3, adopted the 1900 convention. As you will read in the next case study, the 1900 convention incorrectly made 1900 a leap year.
Excel provides a complete complement of functions to deal with dates, including functions that convert data from text to dates and back.
Excel stores times as decimal fractions of days. For example, you can enter noon today as =TODAY()+0.5. You can enter 9 a.m. as =TODAY()+0.375. Again, the number format handles converting the decimals to the appropriate display.
Converting Text Dates to Real Dates
The DATEVALUE function converts text that looks like a date into the equivalent serial number. You can then use the Format Cells dialog to display the number as a date.
The text version of a date can take a number of different formats. Say that your international date settings call for a month/day/year arrangement of the dates; Figure 3.10 shows a number of valid text formats that can be converted with the DATEVALUE function.
Figure 3.10 The DATEVALUE function can handle any of the date formats in column J.
After using the DATEVALUE function, you need to format the result as a date in order to display the numbers as dates.
Figure 3.11 shows a column of text dates. You follow these steps to convert them to real dates:
Figure 3.11 You need to convert these text dates to real dates before creating a chart.
- Insert a blank column B by selecting cell B1 and then choosing Insert, Insert Sheet Columns on the Home ribbon. (Alternatively, use the Excel 2003 shortcut Alt+I+C.)
- In cell B2, enter the formula =DATEVALUE(A2). Excel displays a number in the 39,000 range in cell B2. You are halfway to the result (see Figure 3.12).
Figure 3.12 The result of the DATEVALUE function is a serial number. You still have to format the result as a date.
- Select cell B2. On the Home ribbon, select the drop-down at the top of the Number group and choose either Short Date or Long Date. Excel displays the number in cell B2 as a date (see Figure 3.13). (Alternatively, press Ctrl+1 and select any date format from the Number tab.)
Figure 3.13 You can choose a date format from the Number drop-down on the Home ribbon.
- Double-click the fill handle in the lower-right corner of cell B2. (The fill handle is the square dot in the lower-right corner of the active cell indicator.) Excel copies the formula from cell B2 down to your range of dates.
- If some of the dates appear as #######, you need to make the column wider. To do so, double-click the border between the column B and column C headings.
- To convert the live formulas in column B to be static values, while the range of dates in column B is selected, press Ctrl+C to copy and then on the Home ribbon, select Paste, Paste Values to convert the formulas to values. (Alternatively, you could use this shortcut: Right-click the right border of the selected range and hold the mouse button down as you drag right one column and then back to the original location. When you release the mouse button, choose Copy Here as Values Only from the contextual menu.)
- Delete the original column A.
After you have converted the text dates to real dates, you can insert a line chart with markers. Excel automatically formats the chart with a date-based axis. In Figure 3.14, the top chart reflects cells that contain text dates. The bottom chart uses cells in which the text dates have been converted to numeric dates.
Figure 3.14 When your original data contains real dates, Excel automatically chooses a more accurate date-based axis. The bottom chart reflects a date-based axis.
Converting Bizarre Text Dates to Real Dates
When you rely on others for your source data, you are likely to encounter dates in all sorts of bizarre formats. While gathering data for this book, for example, I found a dataset where each date was listed as a range of dates. Each date was in the format 2/4-6/06, indicating that the data was collected from February 4 through the 6, 2006.
A number of functions, used in combination, can be useful when you're converting strange text dates to real dates:
- =DATE(2006,12,31)—This returns the serial number for December 31, 2006.
- =LEFT(A1,2)—This returns the two leftmost characters from cell A1.
- =RIGHT(A1,2)—This returns the two rightmost characters from cell A1.
- =MID(A1,3,2)—This returns the third and fourth characters from cell A2. (You read the function as "return the middle characters from A1, starting at character position 3, for a length of 2.")
- =FIND("/",A1)—This finds the position number of the first slash within A1.
You follow these steps to convert the text date ranges shown in Figure 3.15 to real dates:
- Because the year is always the two rightmost characters in column A, in cell B2, enter the formula =RIGHT(A2,2).
- Because the month is the leftmost one or two characters in column A, ask Excel to find the first slash and then return the characters to the left of the slash. Enter =FIND("/",A2) to indicate that the slash is in second character position. Use =LEFT(A2,FIND("/",A2) to get the proper month number.
- For the day, you can either choose to extract the first or last date of the range. To extract the first date, ask for the middle characters, starting one position after the slash. The logic to figure out if you then need one or two characters is a bit more complicated. You need to find the position of the dash, subtract the position of the slash, and then subtract 1. Therefore, use this formula in cell D2:
=MID(A2,FIND("/",A2)+1,FIND("-",A2)-FIND("/",A2)-1)
- Use the DATE function, as follows, in cell E2 to produce an actual date:
=DATE(B2,C2,D2)
Figure 3.15 A mix of LEFT, RIGHT, MID, and FIND functions parse this text to be used in the DATE function.
Dates Not Recognized as Dates: Numeric Years
If you are plotting data where the only identifier is a numeric year, Excel does not automatically recognize this field as a date field.
In Figure 3.16, for example, data is plotted once a decade for the past 50 years and then yearly for the past decade. Column A contains four-digit years, such as 1955, 1965, and so on. The default chart shown in the top of the figure does not create a date-based axis. You know this to be true because the distance from 1955 to 1965 is the same as the distance from 1995 to 1996.
Figure 3.16 Excel does not recognize years as dates.
There are two solutions to this problem:
- Convert the years in column A to dates by using =YEAR(A2,12,31). Format the resulting value with a yyyy custom number format. Excel then displays 2005 but actually stores the serial number for December 31, 2005.
- Convert the horizontal axis to a date-based axis. Excel then thinks that your chart is plotting daily dates from May 8, 1905, through June 27, 1905. Because no date format has been applied to the cells, they show up as the serial numbers 1955 through 2005. Excel displays the chart properly, even though the settings show that the base units are days.
Dates Not Recognized as Dates: Dates Before 1900
In Excel 2007, dates from January 1, 1900, through December 31, 9999, are recognized as valid dates. If you happen to be a company that was founded more than a demisesquicentennial before Microsoft was founded, however, you will potentially have company history going back before 1900.
Figure 3.17 shows a dataset stretching from 1787 through 1959. The accompanying chart would lead the reader to believe that the number of states in the United States grew at a constant rate, a statement that would cause Mr. Kessel, my eighth-grade geography teacher, to give me an F for this book.
Figure 3.17 Dates from before 1900 are not valid Excel dates. A date-based axis is not possible in this case.
Formatting the chart to have a date-based axis does not work because Excel does not recognize dates before 1900 as valid dates. The next two sections discuss possible workarounds.
Using a Date-Based Axis with Dates Before 1900 Spanning Less Than 100 Years
In Figure 3.18, the dates in column A are text dates from the 1800s. Excel cannot automatically deal with dates from the 1800s, but it can deal with dates from the 1900s.
Figure 3.18 Transforming the 1800s dates to 1900s dates and using clever formatting allows Excel to plot this data with a date axis.
One solution is to transform the dates to be dates in the valid range of dates that Excel can recognize. You can use a date format with two years and a good title on the chart to explain that the dates are from the 1800s.
To create the chart in Figure 3.18, you follow these steps:
- Insert a blank column B to hold the transformed dates.
- Enter the formula =DATE(100+RIGHT(A4,4),LEFT(A4,2),MID(A4,4,2)) in cell B4. This formula converts the 1836 date to a 1936 date.
- Select cell B4. Press Ctrl+1 to open the Format Cells dialog. Choose the date format 3/14/01 from the Date category on the Number tab. This formats the 1936 date as 6/15/36. (You will later add a title to indicate that the dates are in the 1800s.)
- Double-click the fill handle in cell B4 to copy the formula down to all cells.
- Select the range B3:C17.
- From the Insert ribbon, choose Charts, Line, 2-D Line, Line.
- From the Layout ribbon, choose Legend, No Legend.
- Right-click the vertical axis along the left side of the chart and choose Format Axis from the context menu.
- In the Format Axis dialog that appears, on the Axis Options page, choose the Fixed option button next to Minimum and enter a fixed value of 20.
- Without closing the Format Axis dialog, click the dates in the horizontal axis in the chart. Excel automatically switches to formatting the horizontal axis, and the settings in the Format Axis dialog redraw to show the settings for the horizontal axis. In the Axis Type section, choose Date Axis. Click Close to close the dialog box.
- From the Layout ribbon, choose Chart Title, Centered Overlay Title.
- Click the State Count title. Type the new title Westward Expansion<enter>During 1845-1875 Added 13<enter>New States to the Union. Click outside the title to exit Text Edit mode.
- Click the title once. You should have a solid selection rectangle around the title. On the Home ribbon, click the Decrease Font Size button. Click the Left Align button.
- Carefully click the border of the title. Drag it so it appears in the top-left corner of the chart.
- Select the dates in B4:B17. Press Ctrl+1 to access the Format Cells dialog. On the Number tab, click the Custom category. Type the custom number format 'yy. This changes the values shown along the horizontal axis from m/d/yy format to show a two-digit year preceded by an apostrophe.
The result is the chart shown in Figure 3.18. The reader may believe that the chart is showing dates in the 1800s, but Excel is actually showing dates in the 1900s.
This method fails when you are trying to display more than 100 years of data points.
Using a Date-Based Axis with Dates Before 1900 Spanning More Than 100 Years
If you attempt to use the technique described in the preceding section on a chart that contains more than 100 years' worth of dates, the technique will fail.
Microsoft Excel 2007 doesn't do well with large datasets that span 100+ years. While I managed to create a date-based axis covering 630 years with 10 data points, a dataset covering 102 years and 40 points cannot display a date-based axis. As Figure 3.19 shows, however, it is possible to create this chart. To do so, you must transform your date axis into a scale that shows months, hide the axis, and add your own axis, using text boxes. These steps are not for the faint of heart.
Figure 3.19 This chart appears to show a date-based axis that spans 200+ years.
You first need to transform the dates from the 1800s to the 1900s. You then transform the dates spanning 172 years into a range where each month in real time is represented by a single day. This results in a time span of six years. You then need to use care to completely hide the labels along the horizontal axis and to replace them with text boxes showing the centuries. You then add a new data series to draw vertical lines at the change of each century.
To create the chart in Figure 3.19, you follow these steps:
- Insert new columns B and C.
- In cell B4, enter the formula =DATE(113+RIGHT(A4,4),LEFT(A4,2),MID(A4,4,2)). This transforms the dates from 1787 to a valid Excel date in 1900. Format this cell with a short date format.
- In cell C4, type the formula =(YEAR(B4)-1899)*12+MONTH(B4) to calculate a number of months. Format this cell as a short date. This formula now reduces 172 years into 172x12 into 2,064 days, where each day represents 1 month of real time.
- Select cells B4:C4 and double-click the fill handle to copy the formula down to your range of data. The dates in column B span 1900 to 2072. The dates in column D span 1900 to 1907. Although the relative position of the data points is correct, you have to hide the axis labels that Excel draws in for the horizontal axis. It would therefore be helpful to draw in vertical lines to show where the axis switches from the 1700s to the 1800s and another line to show where the axis switches from the 1800s to the 1900s.
- Insert a new column E to hold the data for the second series. This series contains just two nonzero points: one at 1800 and one at 1900. Enter the heading Divide Line in cell E3.
- Look through the dates in column A. Insert a new row before the first date in the 1800s. In this new row, enter 01/01/1800 in column A. Copy the formulas in columns B and C. In column D, copy the point from the row above. In column E, enter the value 50. This draws a single vertical bar from the horizontal axis up to a height of 50.
- Repeat step 6 to add a new data point for January 1, 1900, and January 1, 2000.
- Select C4:E55.
- From the Insert ribbon, choose Charts, Line, Line.
- On the Layout ribbon, choose Legend, None.
- Right-click the numbers along the vertical axis and choose Format Axis. Change the Maximum option button to Fixed and enter the value 50. This changes the vertical axis to show from 0 to 50.
- On the Layout ribbon, use the Current Selection drop-down to select Series. There are now only two data points selected in the chart.
- On the Design ribbon, choose Change Chart Type. Select the first icon in the column section—for a clustered column chart. This draws narrow columns—actually lines—at 1800 and 1900 on the chart. Note that the chart type change affects only the second series because you selected the Divide Line series in step 12.
- Click the labels along the horizontal axis. These labels show wrong dates such as 1/23/02. On the Home ribbon, from the Font Color drop-down choose a white font. This causes the axis labels to disappear.
- On the Insert ribbon, click the Text Box icon. On the chart, draw a text box from the 1800 line to the 1900 line, just below the horizontal axis. The mouse pointer changes into a crosshairs as you draw. You can make sure the vertical line in the crosshairs corresponds to the vertical dividing lines. After you create the text box, a flashing cursor appears inside the text box.
- Type 1800s. Click the edge of the text box to change it from a dashed line to a solid line.
- While the text box is selected, choose Center Align from the Home ribbon. Choose Vertical Center Align. Choose Increase Font Size from the Home ribbon.
- While the text box is still selected, choose Format, Shape Outline, Black on the Layout ribbon in order to outline the text box.
- Click the text box and start to drag to the right. After you start to drag, hold down the Shift key to constrain the movement to the right. Hold down the Ctrl key to make an identical copy of the text box. When the left edge of the new text box is aligned with the vertical line at 1900, release the mouse button. (Note that you must start dragging before you hold down the Ctrl+Shift keys. Microsoft interprets Ctrl+Click as the shortcut to select an object's container.)
- Click in the text box and change the text from 1800s to 1900s.
- On the Layout ribbon, choose Chart Title, Centered Overlay Title. The title Chart Title appears, and it is selected.
- Click inside the Chart Title text area to enter Text Entry mode. Overwrite the default text in the title by typing Growth of USA, pressing Enter, typing by # of States, pressing Enter, and typing 1787-1999.
- Click on the border of the chart title to exit Text Entry mode.
- Drag the chart title to a new location in the lower-right corner of the chart.
The result is a chart that appears to show a line chart that spans 217 years. The line is appropriately scaled, using a date-based axis.
Using a Workaround to Display a Time-Scale Axis
The developers who create Microsoft Excel are careful in the Format Axis dialog box to call the option a date axis. The technical writers who write Excel Help refer to a time-scale axis. The developers get a point here for accuracy because Excel absolutely cannot natively handle an axis that is based on time.
The data in Figure 3.20 is used to analyze queuing times. In column A, it logs the time that customers entered a busy bank. Times range from when the bank opened at 10 a.m. until the bank closed at 4:00 p.m.
Figure 3.20 Excel cannot show a time-series axis that contains times.
After you enter planned staffing levels in column C, the model calculates when the customer will move from the queue to an open teller window and when he or she will leave, based on an average of three minutes per transaction.
Data in columns I:M record the number of people in the bank every time someone enters or leaves. This data is definitely not spaced equally. Only a few customers arrive in the 10:00 hour, while many customers enter the bank during the lunch hour.
The top chart in Figure 3.20 plots the number of customers on a text-based axis. Because each customer arrival or departure merits a new point, the one hour from noon until 1 p.m. takes up 41% of the horizontal width of the chart. In reality, this one-hour period merits only 16% of the chart. It sounds like a perfect use for a time-series axis, right?
The bottom chart is an identical chart where the axis is converted to show the data on a date-based axis. This is a complete disaster. In a date-based axis, all time information is discarded. The entire set of 300 points is plotted in a single vertical line.
The solution to this problem involves converting the hours to a different time scale (similar to the 1800s date example in the preceding section). Perhaps each hour could be represented by a single year. The 10:00 hour could be represented by 2010, and the 3:00 hour could be represented by 2015 (because 3:00 is the 15:00 hour on a 24-hour clock).
In this example, you manipulate the labels along the vertical axis using a clever custom number format. A few new settings on the Format Axis dialog ensure that an axis label appears every hour.
You follow these steps to create a chart that appears to have a time-based axis:
- In cell L2, enter the following formula to translate the time to a date:
=ROUND(DATE(HOUR(I2)+2000,1,1)+MINUTE(I2)/60*364,0)
Because each hour will represent a single year, the years argument of the DATE function is =HOUR(I2)+2000. This returns values from 2010 through 2013. The other arguments in the date function are 1 and 1 to return January 1 of the year. Outside the date function, the minute of the time cell is scaled up to show a value from 1 to 365, using MINUTE(I2)/60*36. The entire formula is rounded to the nearest integer because Excel would normally ignore any time values. - Select cell L2. Double-click the fill handle to copy this formula down to all the data points. The results of this formula ranges from January 1, 2010 (representing the customer who walked in at 10 a.m.), to 12/25/2015 (representing the customer who walked in at 3:57 p.m.).
- Select cells L1:M303.
- From the Insert ribbon, choose Charts, Line, Line with Markers.
- On the Layout ribbon, choose Legend, None. (I hope that after studying Software Quality Metrics (SQM) data for Excel 2007, Microsoft finally realizes that 500 million people instantly turn off the legend in every chart that has a single data series.)
- Right-click the labels along the horizontal axis and choose Format Axis to display the Format Axis dialog box, where you make the following selections:
- In the Axis Type section, choose Date Axis.
- For Major Unit, choose Fixed, 1 Years.
- For Minor Unit, choose Fixed, 1 Days.
- For Base Unit, choose Fixed, Days.
- Return to the transformed dates in column L. Select L2:L303.
- Press Ctrl+1 to display the Format Cells dialog. On the Number tab, choose the Custom category. A custom number format of yy would display 10 for 2010 and 15 for 2015. Instead, use a custom number format of yy":00". This causes Excel to display 10:00 for 2010 and 15:00 for 2015. This is fairly sneaky, eh?
As you can see in Figure 3.21, the chart now allocates one-sixth of the horizontal axis to each hour. This is an improvement in accuracy over either of the charts in Figure 3.20. The additional chart in Figure 3.21 uses a similar methodology to show the wait time for each customer who enters the bank. If my bank offered 12-minute wait times, I would be finding a new bank.
Figure 3.21 These charts show the number of customers in the bank and their expected wait times.
Converting Dates to Text to Add a Decorative Chart Element
There are times when you want to force the category axis to be a text-based axis so that you have a bit more control.
In the following case study, professional designer Kyle Fletcher used Adobe Illustrator to produce a chart for a music industry publication. Kyle needed to separate sales before and after a specific event. Using Illustrator, Kyle drew his chart and created a gap between the pre-event and post-event columns. In order to replicate this effect in Excel, you need to use a text-based axis.