Business Statistics Unit 2

Organizing, Presenting and Evaluating Data

In this unit we will examine several methods to organize and present data.  We also discuss the methods you should use when you evaluate statistical data.  Remember to obtain a copy of the learning objectives for this unit by going to the Unit 2 link under Course Documents in Blackboard, and clicking on the link for Learning Objectives, Unit 2.

FREQUENCY DISTRIBUTIONS


The first topic we are going to discuss is the frequency distribution.  A frequency distribution is a method of organizing data into classes that shows the number of observations within each class.  The classes are mutually exclusive, meaning that each observation can be assigned to only one class.  We construct frequency distributions so that we can look at data in a more organized way, extract information, and analyze the data.  Often a frequency distribution is the first method used to analyze data after it has been collected from a sample or a population.

For example, a survey was recently conducted in Detroit regarding the prices paid for a selected group of consumer products.  One item that was included in the survey was the price of pre-recorded music CDs.  The table below shows the price paid at each store included in the survey for the new Tim McGraw CD.

Table 2.1  Prices paid in dollars for the Tim McGraw CD at Metro Detroit Stores.

$14.95 $16.95 $11.95 $14.95 $13.95
  18.95   16.25   14.99   15.55   16.65
  15.45   15.10   20.25   11.50   17.00
  12.99   19.95   19.30   16.95   18.40
  11.99   17.50   13.65   12.88   17.75

The data presented in Table 2.1 is not organized into any specific order or group.  This is referred to as raw or ungrouped data.  If we look through the table we can find out some basic information.  The highest price paid was $20.25 while the lowest price paid was $11.50.  A frequently used method to examine data is to calculate the range.  The range is simply the distance between the highest and lowest values in a set of data, and it is calculated by subtracting the lowest value from the highest value.  So the range for our data is $20.25 - $11.50 = $8.75.  The range is a crude measure of dispersion, or how our data is spread out.  Beyond that, it is very difficult to obtain any other type of information about the prices paid for the CD from the raw data.  One way to clarify the data is to organize it into a frequency distribution.  To properly develop a frequency distribution, several rules must be followed.

  1. Number of classes - The number of classes can vary depending upon the number of observations being grouped and how you prefer to have your data presented.  If you are trying to present your data in comparison with data that has already been presented in a frequency distribution, it is normal to use the same classes.  Usually there are between 5 and 10 classes, and in some cases up to 20 classes have been used.  Although some statisticians have developed formulas to determine the number of classes, often your best judgment or past experience can guide you.  As a general rule, 5 to 7 classes is probably the most popular choice.  There must be enough classes to cover all of the data.  For our example from table 2.1 we will use 5 classes.
  2. Class width - The class width or interval should be the same amount (width) for all classes.  All of the classes in total must include all of the raw data.  No data should be left out of a class.  As a general rule to determine the class width, subtract the lowest value from the highest value and divide by the number of classes you are going to use.  This will give you an approximate idea of what your class width should be.  Normally the class values are rounded to a number that is easy to read, usually a multiple of 5, 10, 100, or 1000.  In our example in table 2.1, the highest value ($20.25) minus the lowest value ($11.50) equals $8.75.  If we divide $8.75 by 5, which is the number of classes we have selected, we get $1.75.  This becomes our suggested class width.  In order to make the data more presentable, we will use a class width of $2.00.  Many statisticians prefer a class width that is an odd number; however, it is not required.  If you are looking at a frequency distribution and need to determine the width being used, subtract the lower (or upper) class limit of one class from the lower (or upper) class limit of the very next class to find the actual width being used.
  3. Avoid open class intervals - An open class interval uses the words or symbols for less than or greater than (<, >) with no limit on the other side.  Open intervals are sometimes found at either end of a frequency distribution.  For example, our first class interval for the data in table 2.1 is $11.00 - 12.99.  If we had an open class interval here it might say $12.99 or less.  It is OK to use greater than or less than symbols or words within a class interval to keep the classes mutually exclusive.  For example, for the first class interval you could say $11.00 to less than 13.00 instead of $11.00 - 12.99.
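The arithmetic in rules 1 and 2 can be reproduced with a few lines of Python.  This is only a sketch for readers who like to check their work with code; the variable names are our own.  The final step of rounding $1.75 up to $2.00 remains a presentation choice and is not something the code decides.

```python
from decimal import Decimal

# Prices from Table 2.1, stored as Decimal so dollars and cents stay exact.
prices = [Decimal(p) for p in """
14.95 16.95 11.95 14.95 13.95
18.95 16.25 14.99 15.55 16.65
15.45 15.10 20.25 11.50 17.00
12.99 19.95 19.30 16.95 18.40
11.99 17.50 13.65 12.88 17.75
""".split()]

number_of_classes = 5  # chosen under rule 1

# Range = highest value minus lowest value.
price_range = max(prices) - min(prices)

# Suggested class width = range divided by the number of classes.
suggested_width = price_range / number_of_classes

print("Range:", price_range)
print("Suggested class width:", suggested_width)
```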

NOTE:  If you are preparing a frequency distribution that compares similar data from different data sets, you should always keep the number of classes and the class widths identical for all data sets.  This would also apply to the data analysis project for this class.

Using the data from table 2.1, the following frequency distribution could be constructed using 5 classes and an interval width of $2.00.

Table 2.2.  Frequency Distribution of CD Data.

Class              Number of observations
$11.00 - 12.99     5
$13.00 - 14.99     5
$15.00 - 16.99     7
$17.00 - 18.99     5
$19.00 - 20.99     3

The data is now in a format from which more information can be readily extracted.
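The tally behind a frequency distribution can also be sketched in Python.  The snippet below is a minimal illustration and not part of the Excel workflow used in this course.  The function name tally and the variable names are our own, and it assumes the five $2.00-wide classes starting at $11.00 chosen above.  Running it is a handy cross-check on a hand count.

```python
# Prices from Table 2.1.
prices = [
    14.95, 16.95, 11.95, 14.95, 13.95,
    18.95, 16.25, 14.99, 15.55, 16.65,
    15.45, 15.10, 20.25, 11.50, 17.00,
    12.99, 19.95, 19.30, 16.95, 18.40,
    11.99, 17.50, 13.65, 12.88, 17.75,
]

first_lower_limit = 11.00
class_width = 2.00
number_of_classes = 5


def tally(values, lower, width, classes):
    """Count how many values fall in each class [lower, lower + width)."""
    counts = [0] * classes
    for value in values:
        index = int((value - lower) // width)
        if not 0 <= index < classes:
            raise ValueError(f"{value} is not covered by any class")
        counts[index] += 1
    return counts


counts = tally(prices, first_lower_limit, class_width, number_of_classes)

# Print one line per class: the class limits followed by the count.
for i, count in enumerate(counts):
    low = first_lower_limit + i * class_width
    high = low + class_width - 0.01
    print(f"${low:.2f} - {high:.2f}  {count}")
```

Because every price lands in exactly one class, the counts must add up to the 25 observations in Table 2.1.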

Two other computations can be made from a frequency distribution.  The first is called the class midpoint.  The class midpoint is simply the value that is halfway between the lower and upper limits of a class.  To find the midpoint value, simply add the lower limit and the upper limit of a class, and divide the result by 2.  In table 2.2, for the first class which is $11.00 - 12.99, we add $11.00 + 12.99, which is $23.99.  Then we divide $23.99 by 2 and we find our midpoint value is equal to $11.995.
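As a quick check, the midpoint rule can be applied to every class at once.  The sketch below uses Python's decimal module so that the half-cent in $11.995 is kept exactly; the function name midpoint is our own.

```python
from decimal import Decimal

# Lower and upper limits of each class in Table 2.2.
class_limits = [
    (Decimal("11.00"), Decimal("12.99")),
    (Decimal("13.00"), Decimal("14.99")),
    (Decimal("15.00"), Decimal("16.99")),
    (Decimal("17.00"), Decimal("18.99")),
    (Decimal("19.00"), Decimal("20.99")),
]


def midpoint(lower, upper):
    """Add the lower and upper class limits and divide by 2."""
    return (lower + upper) / 2


midpoints = [midpoint(lower, upper) for lower, upper in class_limits]

for (lower, upper), mid in zip(class_limits, midpoints):
    print(f"${lower} - {upper}  midpoint ${mid}")
```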

The second computation that can be made from a frequency distribution is a class boundary.  The class boundary is the value that falls between the upper limit of one class and the lower limit of the next class.  To find the class boundary, add the upper limit of one class to the lower limit of the next class and divide by 2.  In our example in table 2.2 the class boundary between the first class and the second class is found by adding $12.99 + $13.00 which is $25.99.  Then we divide $25.99 by 2 and we find our boundary is at $12.995.  By now you are wondering why it is important to know the class boundary.  In our example it is unlikely that any CD price will fall between $12.99 and $13.00.  But if we were looking at a different set of data, one measured to more decimal places, we might need the class boundary in order to properly assign one of our observations.
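The same calculation can be repeated for each pair of neighboring classes.  The following Python sketch, with variable names of our own choosing, pairs each class with the next one and averages the upper limit of the first with the lower limit of the second.

```python
from decimal import Decimal

# Lower and upper limits of each class in Table 2.2.
class_limits = [
    (Decimal("11.00"), Decimal("12.99")),
    (Decimal("13.00"), Decimal("14.99")),
    (Decimal("15.00"), Decimal("16.99")),
    (Decimal("17.00"), Decimal("18.99")),
    (Decimal("19.00"), Decimal("20.99")),
]

# Boundary = (upper limit of one class + lower limit of the next class) / 2.
boundaries = [
    (this_class[1] + next_class[0]) / 2
    for this_class, next_class in zip(class_limits, class_limits[1:])
]

for boundary in boundaries:
    print(f"Class boundary at ${boundary}")
```

Five classes share four boundaries, so the list has one fewer entry than the number of classes.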

Often graphs are constructed from frequency distributions to illustrate the data visually.  The proper graph for most frequency distributions is a histogram (bar graph) that is either vertical or horizontal.  Graphs will be discussed later in this presentation in more detail.

PERCENTAGE DISTRIBUTIONS


Percentage distributions, also known as relative frequency distributions, are similar to frequency distributions and they follow the same set of rules.  The difference is that a percentage is shown instead of, or in addition to, the number of observations for each class.

Let us take a look at table 2.2 again.

Table 2.2.  Frequency Distribution of CD Data.

Class              Number of observations
$11.00 - 12.99     5
$13.00 - 14.99     5
$15.00 - 16.99     7
$17.00 - 18.99     5
$19.00 - 20.99     3

To convert this table into a percentage distribution, we simply need to determine the percentage of observations within each class.  To do this, we take the number of observations for each class and divide that number by the total number of observations.  Then we convert the decimal number to a percentage.  Let's do this for Table 2.2.  For the first class, $11.00 - 12.99, the number of observations is 5.  The total number of observations for all classes is 25 (5+5+7+5+3).  Therefore we divide 5 by 25 and we get 0.20.  We convert the decimal to a percent so our percentage value for the first class is 20%.  For the third class, $15.00 - 16.99, the number of observations is 7.  Therefore we take 7 and divide it by 25 and we get 0.28.  Converting this number to a percent we get 28%.  Table 2.3 provides us with a complete percentage distribution for our CD data.

Table 2.3.  Percentage Distribution of CD Data.

Class              Number of Observations     Percentage
$11.00 - 12.99     5                          20%
$13.00 - 14.99     5                          20%
$15.00 - 16.99     7                          28%
$17.00 - 18.99     5                          20%
$19.00 - 20.99     3                          12%

Once again by constructing a percentage distribution we provide additional information about our data.
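The conversion from counts to percentages is easy to script.  The sketch below is one possible way to do it in Python, using names of our own choosing: it counts the Table 2.1 prices in each class, divides each count by the total, and multiplies by 100.

```python
# Prices from Table 2.1.
prices = [
    14.95, 16.95, 11.95, 14.95, 13.95,
    18.95, 16.25, 14.99, 15.55, 16.65,
    15.45, 15.10, 20.25, 11.50, 17.00,
    12.99, 19.95, 19.30, 16.95, 18.40,
    11.99, 17.50, 13.65, 12.88, 17.75,
]

class_lower_limits = [11.00, 13.00, 15.00, 17.00, 19.00]
class_width = 2.00

# Number of observations in each class.
counts = [
    sum(1 for price in prices if lower <= price < lower + class_width)
    for lower in class_lower_limits
]

# Percentage = class count divided by total count, times 100.
total = sum(counts)
percentages = [100 * count / total for count in counts]

for lower, count, pct in zip(class_lower_limits, counts, percentages):
    upper = lower + class_width - 0.01
    print(f"${lower:.2f} - {upper:.2f}  {count:>2}  {pct:.0f}%")
```

A useful sanity check on any percentage distribution is that the percentages add up to 100%.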

GRAPHICAL REPRESENTATION OF DATA


Statistical data often contains a significant amount of information.  Using frequency and percentage distributions helps explain the data in an organized manner.  Another method often used to explain statistical data is a graph.  There are many types of graphs that can be used with statistical data, and some types of graphs are better than other types depending upon the type of data you are presenting.

Why should you use a graph?  The best reason is that a graph can quickly explain a relationship and it will provide the reader with a quick understanding of the data.  For this reason, you must be careful about the type of graph you use and how it explains the data.

HISTOGRAMS/BAR CHARTS


One of the most popular types of graphs is the histogram, also called a column or bar graph.  A histogram is simply a graph that uses bars to illustrate the quantity in each class.  Most histograms are vertical, but they can also be horizontal.  Histograms are frequently constructed from frequency or percentage distributions.  Histograms are a great way to compare data.  However, if a large number of data values are plotted, histograms can be confusing.  In that case, especially with historical data over time, we often use frequency polygons instead.  As a general rule, if you are looking at data over time and have more than 10 data values, you should use a frequency polygon instead of a histogram.  There are exceptions to this, but it generally is a good rule to follow.  Let's take a look at a histogram for the data presented in table 2.2.

Figure 2.1.  Histogram of CD Sales by Price Group.

Graph of CD sales by price category.

As you examine the histogram, notice that the height of each bar corresponds to the number of observations for each price class.  The data is organized much like the frequency distribution, but by using a histogram, you can visually see the relationship between the different price classes or ranges.  For a vertical histogram, the x-axis is the horizontal axis, which contains each class or group of data.  The y-axis is the vertical axis and contains the number of observations or frequencies for each class or group.  Both the x-axis and the y-axis should be labeled, and the graph should also contain a title.

One important item should be mentioned at this point.  Any graph that you create can illustrate your data differently simply by changing its scale.  Scale refers to the range of values shown on the y-axis and the distance between them.  By changing the scale, you can change how the graph appears visually and change the perception of the differences between the classes or groups.  In Figure 2.1 above, the scale had an interval of 1 unit and ran from 0 to 7.  What would happen if the scale ran from 0 to 20?  Let's examine this by looking at Figure 2.2.

Figure 2.2.  Revised Histogram of CD Sales by Price Group - Scale to 20

Histogram of CD sales - revised scale to 20

Did you notice anything different?  Do the differences between the classes (price ranges) seem smaller in Figure 2.2 than in Figure 2.1?  Most people would say yes.  This is because the scale has changed.  Although it may not make any sense to present the CD price data in this manner, statistical data is often manipulated by changing the scale.  Why?  Usually to make the differences appear smaller or larger than they really are.

Let's look at an example of using a small scale to make differences look larger.  Take a look at the ad below.

The histogram in the ad illustrates the weekly drop in cholesterol.  Notice the scale.  It runs from 196 to 210.  That is only a 14-point range.  The histogram makes it look like the weekly drop in cholesterol is substantial.  Yet over a four-week period, the drop is only 14 points.
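A little arithmetic shows how much a narrow scale can exaggerate a change.  The Python sketch below is illustrative only.  It assumes a first reading of 210 and a last reading of 196, taken from the endpoints of the scale described above, since the individual weekly readings are not given.  The function name drawn_height is our own.

```python
def drawn_height(value, axis_min, axis_max):
    """Fraction of the plot height that a bar for `value` occupies."""
    return (value - axis_min) / (axis_max - axis_min)


# Assumed first and last readings (the endpoints of the ad's scale).
start, end = 210, 196

# Actual change, as a percentage of the starting value.
percent_drop = 100 * (start - end) / start

# On an axis that starts at 0, the last bar is nearly as tall as the first.
ratio_full_axis = drawn_height(end, 0, 210) / drawn_height(start, 0, 210)

# On an axis that runs from 196 to 210, the last bar shrinks to nothing.
ratio_narrow_axis = drawn_height(end, 196, 210) / drawn_height(start, 196, 210)

print(f"Actual drop: {percent_drop:.1f}%")
print(f"Last bar vs. first bar, axis from 0:   {ratio_full_axis:.3f}")
print(f"Last bar vs. first bar, axis from 196: {ratio_narrow_axis:.3f}")
```

Under these assumptions the same 14-point drop looks like a small dip on one axis and a collapse on the other, which is why you should always read the scale before you read the bars.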

Histograms are excellent graphs for showing counts or percentages by classes or groups.  You can easily compare the classes and determine which ones have the highest and lowest counts.  Let's take a look at some new data and prepare a histogram of the data using Microsoft Excel®.  To follow along with this practice exercise, you should open Microsoft Excel in another window.

Practice Exercise 2.1 - Drawing a Histogram Using Excel 2007 or 2003.


For this example we are using data obtained from a survey of laptop computer users.  This survey was conducted to determine the brand preferences of consumers who own laptop computers.  One thousand consumers were surveyed and the survey results are given below.

Manufacturer    Number of Consumers Owning this Brand    Percent of Ownership
Apple           125                                      12.5
Dell            200                                      20.0
Gateway         100                                      10.0
HP/Compaq       150                                      15.0
IBM             175                                      17.5
Sony            75                                       7.5
Toshiba         50                                       5.0
All Others      125                                      12.5

The table provides us with the basic information we need to construct a histogram (bar chart) using Microsoft Excel.  Before copying the table into Excel, make sure that your column widths are sufficiently wide to accommodate the data and headings.  Then simply copy the entire table into a blank worksheet starting at cell A1.  If you copied the table correctly, column A contains the manufacturer names, column B contains the number of consumers owning each brand, and column C contains the percent of ownership.  Once you have verified that everything is OK, move your cursor to cell C10 in the worksheet.

NOTE:  Excel 2007 calls vertical bar charts (histograms) column charts and horizontal bar charts (histograms) bar charts.  The differences in terms can be confusing.

Instructions for Excel 2007


From the main menu in Excel 2007, click on the Insert tab.  Click on the Column chart type option.  You will see several options for different column chart types.  Click on the first chart shown in the 2-D Column section.  Notice that your chart is drawn automatically and you have several new options.  If the chart does not contain the right set of data, click on Select Data and use your mouse to specify the data range.  Make sure you include the labels in your data range.

Click on the Layout tab.  A new set of menu options appears, including Chart Title, Axis Titles, Legend, and Data Labels.  Click on Chart Title.  Then select the Centered Overlay Title option.  Notice that the words Chart Title appear in your chart.  Click in the box where Chart Title appears and type in Consumer Laptop Preferences.  Now click on Axis Titles.  You will need to select both the horizontal and vertical axis titles.  For the horizontal title, use Brand.  For the vertical axis title use Quantity.  Now click on Legend.  For this chart using a legend is not necessary so click on None.  You are done!

Chart created in Excel 2007 - Menu Options

You should experiment using Excel to create a variety of chart types, and explore the options for modifying charts in Excel.  For example, you may want to add different labels for each series of data, or change the chart title.  The easiest way to modify an existing chart is to right click with your mouse just inside the outer border.  Then select the option to change the chart type or data series.

Instructions for Excel 2003

Your next step is to click on the Chart Wizard icon in Excel, or click on the command Insert, then Chart.  This brings up the Chart Wizard.  Using the Chart Wizard you can select a variety of different types of graphs to construct.  We are going to select a column chart, which is a vertical bar chart in Excel.  Select column as your type.  The first type of column chart, which is the type we are choosing, is already highlighted.  Then click on Next.  Excel then asks you for the data range, which is the area of your spreadsheet that contains the data and labels you wish to use.  Make sure your cursor is blinking in the data range box in the Chart Wizard, then click on cell A1 in your spreadsheet, hold the left mouse button down, and drag to the bottom of column B (B9).  Your data range should be A1:B9.  For our graph, we are not using the percentage data.  Finally, verify that in the Chart Wizard, the box next to Series, Columns is checked.  Now click on Next.

You should now be at step 3 in the Chart Wizard where you are asked to enter the chart title, the X axis label, and the Y axis label.  For the chart title, erase what Excel has entered for you, and type in:  Consumer Laptop Preferences.  For the X axis label, type in:  Brand.  For the y axis label type in:  Quantity.  Next you need to click on the Legend tab in the Chart Wizard.  You should then see a box marked:  Show Legend.  Uncheck this box to remove the legend which is not needed for this type of graph.  Now click on Next again.  You are almost done!  At this step (#4) in the Chart Wizard you are given the option of placing your chart in a new sheet or as an object in the existing spreadsheet.  Go ahead and select the option which lets you place the chart as an object into your existing sheet (may be called sheet1).  Then click on Finish.  You should have a chart inserted into your spreadsheet that looks something like the one below.

Histogram of consumer laptop preferences

Excel makes the production of graphs very easy.  In addition you may have noticed that you can change the fonts of the text, the colors of the bars and background, and the size of the chart.  You should experiment with the different options until you are comfortable with using Excel to create graphs.
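If you would like to check the survey table outside of Excel, the short Python sketch below recomputes the percent of ownership from the counts and prints a rough text version of the bar chart.  It is only an illustration with names of our own choosing; the scale of one # per 25 consumers is an arbitrary choice.

```python
# Number of consumers owning each brand, from the survey table.
ownership = {
    "Apple": 125,
    "Dell": 200,
    "Gateway": 100,
    "HP/Compaq": 150,
    "IBM": 175,
    "Sony": 75,
    "Toshiba": 50,
    "All Others": 125,
}

total = sum(ownership.values())

# Percent of ownership = brand count divided by total surveyed, times 100.
percent = {brand: 100 * count / total for brand, count in ownership.items()}

for brand, count in ownership.items():
    bar = "#" * (count // 25)  # one '#' per 25 consumers
    print(f"{brand:<10} {bar:<8} {count:>4} ({percent[brand]:.1f}%)")
```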


FREQUENCY POLYGONS/LINE CHARTS


A frequency polygon, also known as a line chart, is another way to present data in a graphical format.  In all cases, one or more lines are used to connect data points on a chart.  Frequency polygons are most commonly used to show changes in data over time, although there are other uses as well.  That means these graphs show trends or changes well.  They are often used for historical data like sales history.

Let's go back to our favorite item to purchase - CDs.  Take a look at Table 2.4 below.  It provides us with CD sales data for the past 15 years for a popular music store location.

Table 2.4.  Total CD Unit Sales By Year, 1990 - 2004.

Year 2004 2003 2002 2001 2000 1999 1998 1997 1996 1995 1994 1993 1992 1991 1990
Unit Sales 10,252 11,554 12,857 13,995 15,465 15,105 14,252 13,500 12,963 12,332 11,506 10,121 9,857 9,005 8,197

Since our table only covers 15 years and the sales at one store, even without a graph the sales trend can be easily determined.  Sales initially grew from 1990 to 2000 and then declined.  A frequency polygon enables us to quickly see the sales trend without specifically looking at the data.  Figure 2.3 gives us the same data in a frequency polygon chart.

Figure 2.3.  Frequency Polygon of CD Unit Sales by Year, 1990 - 2004.

Frequency polygon of CD unit sales

Notice that the graph in Figure 2.3 shows the most recent year on the left first, with earlier years to the right.  The graph could have been constructed with 1990 on the left and 2004 on the right.  Often sales graphs portray the most recent data first with historical data listed or displayed following the recent data.

A quick glance at this graph immediately tells you something about CD sales:  since the year 2000 CD sales have fallen but prior to that time sales were increasing at a significant rate.  The graph does not answer why sales have changed.  Can you guess why sales have changed?  Could it be a new music store opened across the street from this store, or is it a problem with MP3 players and downloaded music?  Hmmm.....
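The trend that the graph shows at a glance can also be confirmed with a short calculation.  The Python sketch below, using names of our own choosing, finds the peak year in the CD unit sales table and the change in unit sales from each year to the next.

```python
# Years run from 2004 down to 1990, in the same order as the sales table.
years = list(range(2004, 1989, -1))
unit_sales = [10252, 11554, 12857, 13995, 15465, 15105, 14252, 13500,
              12963, 12332, 11506, 10121, 9857, 9005, 8197]
sales_by_year = dict(zip(years, unit_sales))

# The peak is the year with the largest unit sales.
peak_year = max(sales_by_year, key=sales_by_year.get)

# Year-over-year change: this year's sales minus the previous year's.
changes = {
    year: sales_by_year[year] - sales_by_year[year - 1]
    for year in sorted(sales_by_year)
    if year - 1 in sales_by_year
}

print("Peak year:", peak_year)
for year, change in changes.items():
    direction = "up" if change > 0 else "down"
    print(f"{year}: {direction} {abs(change):,} units")
```

Like the graph, the numbers describe what happened to sales but not why.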

Now that we know what a frequency polygon or line chart is, what it is commonly used for, and what it looks like, we can now confidently construct one together.  In the practice exercise below, we will use Excel to draw a frequency polygon.  So if you have not already done so, please open Excel in another window.

Practice Exercise 2.2. - Drawing a Frequency Polygon Using Excel 2007 or 2003.


For this exercise, we will take a look at the changes in the price of Stryker Corporation stock over a one year time period using monthly data.  In the table below you will find the monthly closing price for Stryker Corporation stock for the time period July 31, 2003 - June 30, 2004.

Month Closing Price
July $38.31
August $38.32
September $37.66
October $40.56
November $41.03
December $42.51
January $44.37
February $44.37
March $44.27
April $49.47
May $50.85
June $55.00

Historical stock data provides us with an excellent example of using a frequency polygon to show a trend related to the stock price.

Excel 2007

Copy the table into a blank worksheet starting at cell A1.  You may need to adjust your column widths.  Leave your cursor at cell B14.

From the main menu in Excel 2007, click on the Insert tab.  Click on the Line chart type option.  You will see several options for different line chart types.  Click on the first chart shown in the 2-D Line section.  Notice that your chart is drawn automatically and you have several new options.  If the chart does not contain the right set of data, click on Select Data and use your mouse to specify the data range.  Make sure you include your labels in the data range.

Click on the Layout tab.  A new set of menu options appears, including Chart Title, Axis Titles, Legend, and Data Labels.  Click on Chart Title.  Then select the Centered Overlay Title option.  Notice that the words Chart Title appear in your chart.  Click in the box where Chart Title appears and type in Stryker Corporation Closing Stock Price.  Now click on Axis Titles.  You will need to select both the horizontal and vertical axis titles.  For the horizontal title, use Month.  For the vertical axis title use Price.  Now click on Legend.  For this chart using a legend is not necessary so click on None.  You are done!

Excel 2003

If you have not already done so, open another window for Excel on your computer.  Before copying the table into Excel, make sure that your column widths are sufficiently wide to accommodate the data and headings.  Then simply copy the entire table into a blank worksheet starting at cell A1.  If you copied the table correctly, column A contains the month and column B contains the stock price.  Once you have verified that everything is OK, move your cursor to cell B14 in the worksheet.

Your next step is to click on the Chart Wizard icon in Excel, or click on the command Insert, then Chart.  This brings up the Chart Wizard.  Using the Chart Wizard you can select a variety of different types of graphs to construct.  We are going to select a line chart.  Select line as your type, then click on the first type of line chart in the sub-type section, and then click on Next.  Excel then asks you for the data range, which is the area of your spreadsheet that contains the data and labels you wish to use.  If Excel has already selected the data range A1:B13, don't change it.  If no range is specified, make sure your cursor is blinking in the data range box in the Chart Wizard, then click on cell A1 in your spreadsheet, hold the left mouse button down, and drag to the bottom of column B (B13).  Your data range should be A1:B13.  Finally, verify that in the Chart Wizard, the box next to Series, Columns is checked.  Now click on Next.

You should now be at step 3 in the Chart Wizard where you are asked to enter the chart title, the X axis label, and the Y axis label.  For the chart title, erase what Excel has entered for you, and type in:  Stryker Corporation Closing Stock Price.  For the X axis label, type in:  Month.  For the y axis label type in:  Price.  Next you need to click on the Legend tab in the Chart Wizard.  You should then see a box marked:  Show Legend.  Uncheck this box to remove the legend which is not needed for this type of graph.  Now click on Next again.  You are almost done!  At this step (#4) in the Chart Wizard you are given the option of placing your chart in a new sheet or as an object in the existing spreadsheet.  Go ahead and select the option which lets you place the chart as an object into your existing sheet (may be called sheet1).  Then click on Finish.  You should have a chart inserted into your spreadsheet that looks something like the one below.

Frequency polygon of Stryker stock prices

Once again by using Excel we can easily create a line graph and modify colors, fonts, backgrounds, and text as needed.


Frequency polygons are often used to compare two or more sets of data.  Looking at our previous example of CD unit sales for a store location, two or more locations could be included with each set of data having its own line.  The trends for all sets of data could then be compared simply by visually examining the graph.

A special type of frequency polygon used to display cumulative frequencies is called a cumulative frequency graph or ogive.  Essentially this graph illustrates the cumulative frequencies from a frequency distribution that is organized into classes.
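The numbers plotted on an ogive are running totals of the class frequencies.  As a rough sketch, the Python snippet below builds them for the CD price classes using itertools.accumulate from the standard library.  The counts are the ones obtained by tallying the Table 2.1 prices into $2.00 classes, and the variable names are our own.

```python
from itertools import accumulate

# Upper limit of each class and the number of Table 2.1 prices in that class.
upper_limits = [12.99, 14.99, 16.99, 18.99, 20.99]
counts = [5, 5, 7, 5, 3]  # tallied from the Table 2.1 prices

# Cumulative frequency = this class's count plus all counts before it.
cumulative_counts = list(accumulate(counts))

# Cumulative percentage = cumulative count divided by the total, times 100.
total = cumulative_counts[-1]
cumulative_percents = [100 * c / total for c in cumulative_counts]

for limit, cum, pct in zip(upper_limits, cumulative_counts, cumulative_percents):
    print(f"${limit:.2f} or less: {cum:>2} observations ({pct:.0f}%)")
```

An ogive plots each cumulative value against the upper limit of its class, so the line never falls and always ends at the total number of observations (or 100%).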

PIE CHARTS


It is probably safe to assume that everyone taking this course has seen a pie chart sometime during their life.  They are a popular way to display data and we see them often in newspapers, magazines, television news, and on the Internet.  Although histograms are also popular, there is just something about a pie chart...

OK, what is a pie chart?  A pie chart is simply a circle divided into slices, with each slice indicating a portion or percentage of the total circle or pie.  Pie charts are often used to show percentages.  They are often used in conjunction with budgetary data, production by product type, and market share data.  Let's take a look at a pie chart that is used to present budgetary data for a business.

Figure 2.4.  Pie Chart of Expenses for Fred's Mowing Service

Pie chart of Fred's Mowing Service expenses

Pie charts will generally break down data into categories and/or percentages.  In Figure 2.4, the cost data is divided into categories and the percentage for each category is shown within the pie chart.  Notice that you can readily determine which costs are the highest percentage for the company.  Instead of percentages, actual dollar amounts could be shown.

You should never use a pie chart to show historical data over time.  It is also a good idea not to use a pie chart for data in a frequency distribution.

Excel really does a great job in drawing pie charts.  In the practice exercise below, we will use Excel to draw a pie chart using some sample data.

Practice Exercise 2.3.  Drawing a Pie Chart Using Excel 2007 or 2003.


The table below contains data from the U.S. federal government.  For the 2003 fiscal year, total federal government receipts are shown by revenue source.  Source:  Federal Budget for Fiscal Year 2005 Historical Tables.

Revenue Source Amount (in millions of dollars)
Individual Income Taxes 793,699
Corporation Income Taxes 131,778
Social Insurance and Retirement Taxes 712,978
Excise Taxes 67,524
Other 76,363

Using the data contained in this table, we will construct a pie chart using Excel.

Excel 2007

Copy the table into a blank worksheet starting at cell A1.  You may need to adjust your column widths.  Leave your cursor at cell B7.

From the main menu in Excel 2007, click on the Insert tab.  Click on the Pie chart type option.  You will see several options for different pie chart types.  Click on the first chart shown in the 2-D pie section.  Notice that your chart is drawn automatically and you have several new options.  If the chart does not contain the right set of data, click on Select Data and use your mouse to specify the data range.  Make sure you include your labels in the data range.

Click on the Layout tab.  A new set of menu options appears, including Chart Title, Axis Titles, Legend, and Data Labels.  Click on Chart Title.  Then select the Centered Overlay Title option.  Notice that the words Chart Title appear in your chart.  Click in the box where Chart Title appears and type in Federal Tax Receipts by Source - 2003.  A pie chart has no axes, so you can skip Axis Titles.  Leave the legend in place so that each slice can be matched to its revenue source.  Now click on Data Labels and choose More Data Label Options, then check the box next to Percentage so that Excel places a percentage value next to each slice of the pie.  You are done!

Excel 2003

If you have not already done so, open another window for Excel on your computer.  Before copying the table into Excel, make sure that your column widths are sufficiently wide to accommodate the data and headings.  Then simply copy the entire table into a blank worksheet starting at cell A1.  If you copied the table correctly, column A contains the revenue source and column B contains the dollar amount.  Once you have verified that everything is OK, move your cursor to cell A7 in the worksheet.

Your next step is to click on the Chart Wizard icon in Excel, or click on the command Insert, then Chart.  This brings up the Chart Wizard.  Using the Chart Wizard you can select a variety of different types of graphs to construct.  We are going to select a pie chart.  Select pie as your type, then click on the first type of pie chart pictured in the sub-type section, and then click on Next.  Excel then asks you for the data range, which is the area of your spreadsheet that contains the data and labels you wish to use.  The table occupies the heading row plus five rows of data, so the range is A1:B6.  If Excel has already selected the data range A1:B6, don't change it.  If no range is specified, make sure your cursor is blinking in the data range box in the Chart Wizard, then click on cell A1 in your spreadsheet, hold the left mouse button down, and drag to the last row of data in column B (B6).  Your data range should be A1:B6.  Finally, verify that in the Chart Wizard, the box next to Series, Columns is checked.  Now click on Next.

You should now be at step 3 in the Chart Wizard where you are asked to enter the chart title.  For the chart title, erase what Excel has entered for you, and type in:  Federal Tax Receipts by Source - 2003.  Now click on the Data Labels tab.  We are going to add some additional labels to our chart.  Click inside the box next to Percentage under the Label Contains section.  This will enable Excel to calculate and place a percentage value next to each slice of the pie.  Now click on Next again.  You are almost done!  At this step (#4) in the Chart Wizard you are given the option of placing your chart in a new sheet or as an object in the existing spreadsheet.  Go ahead and select the option which lets you place the chart as an object into your existing sheet (may be called sheet1).  Then click on Finish.  You should have a chart inserted into your spreadsheet that looks something like the one below.

Pie chart of Federal Tax receipts by source

Excel automatically selects different colors for each pie slice, produces a legend, and indicates the percentage for each tax source.  You can change pie slice colors or designs, the type of pie chart, and the order of the items.
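The percentages that Excel places on the pie are simply each revenue source's share of the total.  The Python sketch below, with names of our own choosing, shows the calculation, along with the angle each slice would occupy out of the 360 degrees in a circle.

```python
# Fiscal year 2003 receipts by source, in millions of dollars.
receipts = {
    "Individual Income Taxes": 793_699,
    "Corporation Income Taxes": 131_778,
    "Social Insurance and Retirement Taxes": 712_978,
    "Excise Taxes": 67_524,
    "Other": 76_363,
}

total = sum(receipts.values())

# Share of the pie = amount divided by total, times 100.
shares = {source: 100 * amount / total for source, amount in receipts.items()}

# Each percentage point of the pie corresponds to 3.6 degrees of the circle.
angles = {source: 3.6 * share for source, share in shares.items()}

print(f"Total receipts: {total:,} million dollars")
for source in receipts:
    print(f"{source:<40} {shares[source]:5.1f}%  {angles[source]:6.1f} degrees")
```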

SYMMETRICAL AND SKEWED DISTRIBUTIONS OF DATA


When we examine a series of data values, we often look to see if the series of values has a specific shape.  In order to determine the shape of a series of data values, we often construct a line chart and plot the data values on the chart.

Looking at Figure 2.5 below, the data on this chart is symmetrical.  A symmetrical distribution has the same number of values above and below the mean, which is represented by the peak of the curve.  The mean and median in a symmetrical distribution are equal.

Figure 2.5.  A Symmetrical Distribution of Data.

Symmetrical distribution

Sometimes we have a distribution of data that contains extreme values.  When this occurs we have a skewed distribution.  When the extreme values are above the center or mean value, we have a positively skewed distribution.  The shape of the distribution is illustrated in Figure 2.6.  Notice that positively skewed distributions have a tail to the right, where the extreme values lie, while most of the values are located on the left side of the distribution.  In this situation, the mean is greater than the median.

Figure 2.6.  A Positively Skewed Distribution of Data.

Picture of positively skewed data

If the extreme values are below the center or mean value then we have a negatively skewed distribution.  The shape of this distribution is illustrated in Figure 2.7.  Notice that for a negatively skewed distribution, the extreme values on the left side produce a left-sided tail, while most of the values are located on the right side of the distribution.  In this situation, the mean is less than the median.

Figure 2.7.  A Negatively Skewed Distribution of Data.

Picture of negatively skewed data

There are several statistical methods using formulas that can be used to determine the skewness of a set of data.  Excel can automatically calculate the skewness of a set of data using the Descriptive Statistics option.  It is commonly reported along with the mean, median, mode, and range.  A positive value indicates that the data values are positively skewed.  A negative value indicates that the data values are negatively skewed.  We will examine this in a later unit.
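To give a feel for what such a formula does, here is a minimal Python sketch of one common sample skewness formula, the adjusted version usually documented for Excel's SKEW function.  Treat the match with Excel as an assumption to verify rather than a guarantee.  The function name sample_skewness is our own, and the three small data sets are made-up values chosen only to show each shape.

```python
from statistics import mean, median, stdev


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three values are needed")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


# Made-up illustrative data sets, not taken from the course data.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 3, 4, 15]    # one extreme value above the center
left_tail = [1, 12, 13, 14, 15]  # one extreme value below the center

for name, data in [("symmetric", symmetric),
                   ("right tail", right_tail),
                   ("left tail", left_tail)]:
    print(f"{name:<10} mean={mean(data):>4}  median={median(data):>4}  "
          f"skewness={sample_skewness(data):.3f}")
```

In these examples the sign of the skewness agrees with the comparison of the mean and the median described above: positive when the mean exceeds the median, negative when the mean is below it, and essentially zero when the two are equal.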


Summary


In this unit we have discussed the use of frequency and percentage distributions, along with the use of histograms, frequency polygons, and pie charts, to present statistical data.  You should be aware that there are other methods and types of graphs that can be used in addition to the ones we have examined.  The graphs and methods we discussed are the most commonly used by business, government, and researchers today.  We can visually look at a distribution of data and determine if it is symmetrical or skewed.  We can also find this using Excel.

Assignment


The homework problems for unit 2 can be found by going to the Course Documents link in Blackboard, and clicking on the link for Unit 2.  Look for the Unit 2 - Assignment link.  Once you have completed your homework assignment, you will need to post your answers on Blackboard®.  Once you have signed into Blackboard, simply go to the Unit 2 - Assignment 2 - Post Answers Here link and post your answers.  Immediate feedback is provided once you have completed the posting of all of your answers and clicked on submit.  Make sure you print the entire submitted homework assignment to assist you with quizzes and tests.

 

©2008, 2007  by E.H. McKay, III.  Version 5.0

Some Images © 2006, 2004 by Clipart.com.