Histogram and Box Plot
Histograms and box plots are used to graphically summarize data. Select either Histogram or Box Plot to view a given data set.
Histogram
A histogram divides the range of values in a data set into intervals. A block or rectangle whose area represents the percentage of data values in the interval is placed over each interval.
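The area rule above can be sketched in a few lines of Python. This is only an illustration, not the code behind this page: the function name histogram_blocks, the choice of half-open intervals [left, right), and the sample values are all assumptions made for the example, and the sample values are made up rather than taken from the Old Faithful data.

```python
def histogram_blocks(values, width, start=None):
    """Return one (left, right, percent, height) tuple per interval.

    Assumes equal-width, half-open intervals [left, right) and that
    start is no larger than the smallest value.
    """
    if start is None:
        start = min(values)
    n = len(values)
    # Number of intervals needed so that the maximum value is covered.
    count = int((max(values) - start) // width) + 1
    tallies = [0] * count
    for v in values:
        tallies[int((v - start) // width)] += 1
    blocks = []
    for i, tally in enumerate(tallies):
        left = start + i * width
        percent = 100.0 * tally / n
        # Height is percent per unit of width, so height * width = percent.
        blocks.append((left, left + width, percent, percent / width))
    return blocks


# Made-up sample values, for illustration only.
sample = [50, 55, 58, 62, 75, 78, 80, 81, 84, 90]
for left, right, percent, height in histogram_blocks(sample, width=10, start=50):
    print(f"[{left}, {right}): {percent:.0f}% of values, height {height:.1f}")
```

Because each height is the percentage divided by the interval width, the areas of all the rectangles add up to 100 no matter which width is chosen.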
The data set initially displayed comes from observations of Old Faithful, the famous geyser in Yellowstone National Park. There are 112 measurements of times between eruptions. Notice that there are two main clusters. This happens because when Old Faithful has a particularly short, less impressive eruption, the waiting time until the next eruption tends to be short. The waiting time after a spectacular eruption is usually longer than average.
You can add your own data values to the Old Faithful data, or you can create your own data set from scratch. Data values must be numeric. Use the keyboard to delete a data value or type a new one. When data values are added or deleted, the histogram and the descriptive statistics are updated.
Click and drag the slider below the histogram to make cell widths larger or smaller.
The Average for all data is shown in red at the bottom of the histogram. Place the mouse cursor over an endpoint of an interval to display its value.
Click on a rectangle to display the percentage of data values in the corresponding interval.
Box Plot
To construct a box plot, you need five numbers: the minimum data value, the lower quartile, the median, the upper quartile, and the maximum data value. Mark these five statistics on a number line and draw a box from the lower quartile to the upper quartile. Mark the median inside the box.
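The five numbers can be computed with a short Python sketch. Quartile conventions differ between textbooks and software, and this page does not say which one it uses, so the rule below is an assumption: each quartile is taken as the median of the lower or upper half of the sorted data, leaving out the middle value when the number of values is odd. The function name five_number_summary and the sample values are also made up for illustration.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Assumed quartile rule: median of each half of the sorted data,
    excluding the middle value when the count is odd.
    Needs at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value for odd n; split evenly for even n.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Made-up sample values, for illustration only.
sample = [50, 55, 58, 62, 75, 78, 80, 81, 84, 90]
print(five_number_summary(sample))
```

For the sample values this gives (50, 58, 76.5, 81, 90): the box would run from 58 to 81 with the median marked at 76.5. A different quartile rule could move the ends of the box slightly.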
You can add your own data values to the Old Faithful data, or you can create your own data set from scratch. Data values must be numeric. Use the keyboard to delete a data value or type a new one. When the data are changed, the box plot and the descriptive statistics are updated.
View the Old Faithful data as a box plot and then as a histogram. Compare the two graphical summaries. What are the advantages and disadvantages of each display?