Logo: TechTrax...brought to you by MouseTrax Computing Solutions

Scatter Chart or Line Chart?

by Jon Peltier, MVP

One of the most confusing issues to an Excel charting newcomer is the distinction between Line charts and XY Scatter charts. The major difference between Line charts and XY Scatter charts is in their treatments of the category (X) data. A Line chart treats the input as non-numeric, categorical information, equally spaced along the X axis. This is appropriate for categorical data, such as text labels, but can produce unexpected results when the X values consist of numbers. A Scatter chart treats the X values as continuously variable numeric data.

Illustration of Excel's Line and Scatter Chart Types
Let's explore the differences between Line and Scatter charts with the following data. The X values in A2:A6 consist of the numbers 1 through 5, out of order. There is a label, Y, in the cell above the Y values, but only a blank cell above the X values. This blank cell indicates to Excel that we want to use the first column for X values and the first row for series names. This lightly documented use of blank cells is a very helpful feature, once one gets over the fear of not labeling a column.

   A B
1    Y
2 1 7
3 4 6
4 3 5
5 2 3
6 5 4


Line Chart—Category Axis
Select the range A1:B6, including the top row with the blank cell, start the Chart Wizard by selecting Chart from the Insert menu, or by pressing this button , and create a Line chart:

Notice the category labels across the X axis. Though apparently numeric, they are not listed left to right in numerical order, but instead in the order they appear in the worksheet. The tick marks and labels are evenly spaced, also regardless of any numerical values the labels may have. These labels are treated as non-numeric attributes, and we may as well have used attributes like Red, Orange, and Blue. You cannot change the limits of the X axis scale, either. You can reverse the order of the labels. You can also uncheck the Y Axis Crosses Between Categories option in the Format Axis dialog, to produce the following chart appearance in which the markers start and end at the extremes of the X axis:

Scatter Chart—Value Axis
Select the same range as above and create an XY Scatter chart:

The X axis is a true value axis, and each point is positioned horizontally according to its actual X value. The tick marks are uniformly spaced, as expected, but they appear in increasing numerical order (or decreasing order, if that's how we formatted the axis scale). You can adjust the minimum and maximum of this type of axis, as well as the major and minor tick mark spacing. There does not have to be a label for every plotted point, or a point plotted at every X value. The data points do not simply line up like mindless robots above the next available tick label position. The data are plotted directly above the appropriate value of the X axis, and the connecting lines zigzag as needed to connect the points.

Line Chart—Time Scale Axis
If the Category (X) Values are recognized by Excel, these are used in a special kind of axis in a Line chart. The following data is used to show a Time-Scale axis, which is only available in a Line chart.

   A B
1    Y
2 1-Jan 7
3 2-Jan 6
4 6-Jan 5
5 7-Jan 3
6 12-Jan 4

If this data were applied to a regular line chart, such as we created above, we would see the data uniformly spaced across the X axis, even though the points were separated by different lengths of time:

Fortunately, Excel recognizes the X values as dates, and automatically uses a Time-Scale category (X) axis, which spaces the points according to the elapsed time:

The Time-Scale axis is a very handy axis to use, because it allows more scaling flexibility than even an XY Scatter chart value axis. For example, you can place axis ticks at the first day of each month, and the ticks will be closer for a 30–day month than for a 31–day month.

Unfortunately, the Time-Scale axis behaves more like a Date-Scale axis. Excel treats date and time values as a whole number signifying the day plus a decimal fraction indicating the fraction of a day since midnight. The Time-Scale axis only interprets the whole number part of the X value, or the date. Any values during a given date are all placed at midnight at the beginning of that date. For example, the following data should produce a straight line, because the Y value decreases by 1 unit for every 16 hours that pass.

   A B
1    Y
2 1/1/2003 12:00:00 AM 7
3 1/1/2003 04:00:00 PM 6
4 1/2/2003 08:00:00 AM 5
5 1/2/2003 11:59:00 PM 4
6 1/3/2003 04:00:00 PM 3

This straight line shows up as expected in a Scatter chart:

However, when plotted on a Line chart, even with the convenient Time-Scale axis, we see how the data is improperly spaced:

Chart Types
Let's take a short tour through a few of Excel's chart types. You see the Chart Type dialog as the first step of the Chart Wizard, and you can also see the Chart Type dialog if you select a chart or series and choose Chart Type from the Chart menu. On the left of the Chart Type dialog is a list of Excel's chart types:

The Chart Type list

Notice a couple things about the Chart Type list:

  1. The Line chart type is listed third, while the XY (Scatter) chart type is listed fifth. I would have put Scatter higher on the list, because I think it is the more important of the two. With the arrangement above, someone unfamiliar with Excel's chart types is likely to stumble across the Line chart type first, when they probably want a Scatter chart.

  2. The design of the buttons for Line and XY (Scatter) types imply that a Line chart provides points with connecting lines, while a Scatter chart only provides scattered points.

A closer look at the chart sub-types shows that the implied formatting observed above is not strictly true. When a chart type is selected from the list, the available sub-types are shown to the right of the chart type list. The Line type provides sub-types with and without markers, as shown below. The data can be shown as is (left column), stacked (middle), and stacked to a total of 100%. The stacked types can be handy in specific instances, but you will likely stick with the unstacked types. You can even select a line chart with 3-D effects, where the series appear as ribbons on the chart; you should only use this chart sub-type if your intent is to distort your data and confuse the viewer.

The Line Chart sub-types

The XY Scatter sub-types are not as extensive (see below). You can choose markers only, markers with lines, and lines only. In addition, if you want connecting lines, you can select straight line segments between points, or smoothed lines.

The Scatter Chart sub-types

In fact, there are more extensive possibilities than the chart types and subtypes. Once you have created a chart, you can select each series and format it independently. In both Line and Scatter chart series the connecting lines can be straight, smoothed, or removed entirely. For both types of series, the marker shape can be selected from among the choices or it can be removed. In a Line chart, you can stack series values, because the category (X) values of the first series are used for all series in the chart; in a Scatter chart each series can have its own X values, so stacking is not supported. Line charts support some rudimentary 3-D effects; fortunately Scatter charts are not capable of such distortion.

Finding Out About—and Changing—Your Chart
What Type of Series is That?

There are two ways to find out what kind of series is plotted in your chart:

  • Select the series, choose Chart Type from the Chart menu, and see what choice is highlighted.

  • Right click on the series, choose Chart Type from the pop up menu, and see what is highlighted.

Note that you can have more than one type of series in the same chart.

What Type of Axis is That?
There are two ways to find out what kind of X-axis your chart has:

  • Select the axis, choose Selected Axis from the Format menu (or press Ctrl-1), and click on the Scale tab.

  • Right click on the axis, choose Format Axis from the pop up menu, and and click on the Scale tab.

The scaling options are dependent upon the axis type, which is indicated at the top of the dialog:

Value (X) axis scale — Category (X) axis scale — Time (X) axis scale

While only a Scatter chart can have a Value X axis, many charts can have a Category or Time X axis. These include Column, Line, Area, and Stock charts. Bar charts have a Category or Time X axis, oriented vertically along the left edge of the chart. A Surface or Contour chart has two Category axes.

How do I Change the Chart Type?
To change the chart type of an entire chart, select the chart and choose Chart Type from the Chart menu, or right click on the chart, and choose Chart Type from the pop up menu. Select one of the types and subtypes, and all series in the chart will change to that type.

To change the chart type of a single series, select the series and choose Chart Type from the Chart menu, or right click on the series, and choose Chart Type from the pop up menu. Select one of the types and subtypes, and just the selected series will change to that type.

How do I Change the Axis Type?
If you have a Line chart, you can easily switch your X axis between Category and Time Scale. Right click on the chart and choose Chart Options from the pop up menu, or select the chart and choose Chart Options from the Chart menu, click on the Axes tab, and check the option button for Category or Time-scale. You could also check the Automatic option button, but I usually override this, so Excel doesn't misread the data in the X values range.

Switching between a Value axis and a Category or Time scale axis requires you to change the chart type of one or more series, or even the entire chart. I find it easier to change the entire chart's type, then change selected series back.

Which Type of Chart Should You Use?
The following questions are examples of those asked by Excel users who have created a Line chart when they should have made an XY Scatter chart:

"Can I force scaling of the X-axis to be proportional?"

"How can I scale the X-axis like I can with the Y-axis?"

"I added a series, but the chart uses the X values from the first series. How do I make the chart use the X values from the new series?"

"Why can't I get a logarithmic scale on the X-Axis?"

If you require the precision of a Time-Scale axis, or if your data has non-numeric categories for the X values, you should use a Line Chart. 11=sFor almost every other application, you should use an XY Scatter Chart, with its Value axis. You can format the lines and markers of either type of chart identically, but you have greater flexibility using the Value axis of a Scatter chart. Data points are located along the X axis according to their X values, not the order they are listed in the worksheet.

If you need to compute a trendline for your data, you should not use a Line chart, because the regression analysis does not properly interpret the numerical values of the Line chart's category labels. If the X values are dates, without a time component, you can safely use a Line chart with a Time-Scale axis, but the best option is to use an XY Scatter chart. In addition, if you need a logarithmic scale on your X axis, you can only get it with a Value axis.

The table below summarizes the features and behaviors of Excel's Scatter and Line charts.

Comparison of Features of Scatter Charts and Line Charts

   Scatter Chart Line Chart
(Category Axis)
Line Chart
(Time Scale Axis)
Formats Supported Markers
Lines
Lines and Markers
Smoothed Lines
Smoothed Lines and Markers
Markers
Lines
Lines and Markers
Smoothed Lines
Smoothed Lines and Markers
Markers
Lines
Lines and Markers
Smoothed Lines
Smoothed Lines and Markers
Axis Value Variable Type Continuously Variable Numeric Values Non-Numeric Attributes (even if numbers are used) Whole Numbers (Days);
Fractions (Times) are Ignored
Axis Scale Limits Adjustable Fixed Adjustable
Logarithmic Scale Supported Not Supported Not Supported
Smoothed Lines Supported Supported Supported
Stacked Series Not Supported Supported Supported
Value (Y) Axis Crosses Between Categories Not Supported Supported Supported
Horizontal Spacing of Points Proportional to X Values Uniform Proportional to Date;
Times Clustered at Respective Date
Vertical Lines Possible Cannot Chart Vertical Lines Vertical Lines Possible
Independent of Order within Sheet
(no change if sheet sorted)
Left-to-Right per Order within Sheet
(changes if sheet sorted)
Independent of Order within Sheet
(no change if sheet sorted)
Lines Connect Points per Order within Sheet
(lines rearrange if sheet sorted)
Lines Connect Points per Order within Sheet
(Left-to-Right within Chart)
Lines Connect Points Left-to-Right within Chart
(no change if sheet sorted)
Trendlines Computes Y=f(X) Computes Y=f({1, 2, 3,...})
Not mathematically legitimate unless categories represent numbers that actually do vary by a uniform amount
Computes Y=f(INT(X))
Accurate for dates, wrong if times are truncated
Subsequent Series' Category Values Each Series has its own Independent X Values All Series Use Same X Values All Series Use Same X Values
Treatment of Multiple Column (or Row) Source Data Range First Column is used for X Values First Column Numeric: Each Column Used as Y Value; no X Values Used
First Column Categorical (Text): First Column Used as Category Values
First Column Dates: First Column Used as Date Values

 

Go up to the top of this page.

This site powered by the Logical Web Publisher™: Content management by Logical Expressions, Inc.