Geog 353 Lecture Outline: Map Generalization and Classification
We will review chapter 8 (Map Generalization and Classification) from the Making Maps book. Additional information and examples can be gleaned from the material below.

Data Classification


Recent Lectures: Issues concerning map symbolization: choosing visual marks to effectively represent the point, line, and area data of our base maps and thematic data

Effective representation of intellectual hierarchy with a visual hierarchy

Visual variables

Next: ways to logically match the dimensions of your data (point, line, area) to symbols on your map

This requires understanding:

1. Data Classification

Data is usually classified - put into some categories or groups - before it can be displayed

Different ways of classifying data will lead to different patterns on the map

Classification is a form of cartographic generalization which reduces the complexity of a set of thematic data

Classification: start by differentiating between

Categorical (qualitative, nominal) data classifications

Dealing with nominal (qualitative) data, or with ordinal data: data that is ordered but without a measurable range (rare as a type of mappable thematic data)

There are no absolute rules for this kind of classification, just general guidelines

Numerical data classifications

Ordered data with a measurable range: quantitative data

Two big issues involved in the classification of numerical or quantitative data

Number of classes:

Most maps for presentation purposes should have four to six classes

As you change the number of classes you may very well see different patterns:

Important to vary the number of classes and see what happens before you make a final choice
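The effect of changing the number of classes can be sketched in a few lines of Python. This is only an illustration: the data values are hypothetical, the helper name class_counts is made up for this example, and equal-width classes are assumed purely to keep the arithmetic simple.

```python
# Hypothetical data values (e.g. a percentage recorded for 12 counties)
values = [2, 3, 5, 8, 9, 12, 15, 21, 22, 30, 41, 58]

def class_counts(data, n_classes):
    """Count how many values fall into each of n_classes equal-width classes."""
    lo, hi = min(data), max(data)
    counts = [0] * n_classes
    for v in data:
        # Position of v within the data range, scaled to a class index;
        # the maximum value is placed in the last class
        idx = min(int((v - lo) * n_classes / (hi - lo)), n_classes - 1)
        counts[idx] += 1
    return counts

# Try several class counts before making a final choice
for k in (4, 5, 6):
    print(k, "classes:", class_counts(values, k))
```

With these values, four classes put seven of the twelve observations in the lowest class, while six classes put only five there, so the mapped pattern would shift noticeably.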

Number of Classes in ArcView

Number of Classes in ArcGIS

Also important is the way you divide up data: classification schemes

Data classification schemes

Histogram: graph relating data distribution and frequency

Some classification schemes take into account the distribution of the data, and others do not.
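A histogram like the one mentioned above can be approximated as text. The sketch below assumes hypothetical data values and a bin width of 10; the function name text_histogram is invented for this example.

```python
def text_histogram(data, bin_width):
    """Return {bin_start: count}, where each bin covers bin_start up to bin_start + bin_width."""
    bins = {}
    for v in data:
        start = (v // bin_width) * bin_width  # lower edge of the bin containing v
        bins[start] = bins.get(start, 0) + 1
    return dict(sorted(bins.items()))

# Hypothetical data values
values = [2, 3, 5, 8, 9, 12, 15, 21, 22, 30, 41, 58]
hist = text_histogram(values, 10)

# One row per bin, one '#' per observation
for start, count in hist.items():
    print(f"{start:>3}+ {'#' * count}")
```

The long bar at the low end and the thin tail toward high values show a skewed distribution, which matters when choosing among the schemes that follow.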

1. Exogenous schemes: class boundaries defined by criteria external to distribution of data

Advantage: map can be matched to external criteria

Disadvantage: does not take into account the data distribution
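As a rough sketch of an exogenous scheme, the class breaks below are fixed in advance and never computed from the data. The thresholds 10, 20, and 40 are hypothetical stand-ins for some external criterion (a policy cut-off, for instance), and the helper name classify is made up for this example.

```python
from bisect import bisect_right

# Hypothetical external thresholds, assumed for illustration only
external_breaks = [10, 20, 40]

def classify(value, breaks):
    """Return the class index: 0 if value < breaks[0], 1 if breaks[0] <= value < breaks[1], and so on."""
    return bisect_right(breaks, value)

# Hypothetical data values
values = [2, 3, 5, 8, 9, 12, 15, 21, 22, 30, 41, 58]
classes = [classify(v, external_breaks) for v in values]
print(classes)
```

Because the breaks ignore the data, nothing prevents most values from landing in a single class, which is the disadvantage noted above.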

Exogenous schemes in ArcView

Exogenous schemes in ArcGIS

2. Arbitrary schemes: class boundaries are set by arbitrary criteria

Equal Intervals: class boundaries are defined by rounded numbers or regular divisions

Often chosen because the classification looks tidy

Simple to do by hand: divide the data range (maximum minus minimum) by the number of classes to get the class width


Disadvantage: not sensitive to the data distribution (a problem unless the distribution is rectangular, i.e. values are spread evenly across the range)
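An equal-interval calculation can be sketched as follows. The data values are hypothetical and the function name equal_interval_breaks is an assumption for this example, not a GIS software function.

```python
def equal_interval_breaks(data, n_classes):
    """Return the interior class boundaries for n_classes equal-width classes."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_classes  # class width = data range / number of classes
    return [lo + i * width for i in range(1, n_classes)]

# Hypothetical data values: range is 58 - 2 = 56, so four classes are 14 wide
values = [2, 3, 5, 8, 9, 12, 15, 21, 22, 30, 41, 58]
breaks = equal_interval_breaks(values, 4)
print(breaks)
```

For tidier legends the breaks are often rounded afterwards, at the cost of slightly unequal classes.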

Equal Interval schemes in ArcView

Equal Interval and Defined Interval schemes in ArcGIS

3. Idiographic schemes: class boundaries defined by the shape of the data distribution

Idiographic schemes take more effort because they are chosen based on some characteristics of the data distribution itself

3a. Natural Breaks: attempt to find natural breaks in the data, classifying values into groups that are somewhat distinct from each other. This can be done by hand: plot a cumulative frequency graph (or graphic array), look for natural breaks in the data, and put class breaks at those points

A good default method: good to start with this and see if it works

How to do it: start by creating a histogram
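The by-hand approach of looking for gaps in the ordered data can be imitated with a short sketch. Note that this is a deliberate simplification: it places breaks at the widest gaps between neighbouring values, and it is not the Jenks optimization that GIS software uses. The data values and the function name largest_gap_breaks are hypothetical.

```python
def largest_gap_breaks(data, n_classes):
    """Place n_classes - 1 breaks in the middle of the widest gaps between sorted values."""
    ordered = sorted(data)
    # Gap between each pair of neighbouring values, stored with its position
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    # Keep the n_classes - 1 widest gaps
    widest = sorted(gaps, reverse=True)[: n_classes - 1]
    # Put each break midway across its gap
    return sorted((ordered[i] + ordered[i + 1]) / 2 for _, i in widest)

# Hypothetical data with four visible clusters
clustered = [2, 3, 4, 11, 12, 14, 25, 26, 27, 40, 42]
natural = largest_gap_breaks(clustered, 4)
print(natural)
```

When the data have no obvious gaps, this simple rule tends to isolate single extreme values, which is one reason to check the result against the histogram.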



Natural Breaks in ArcView

Natural Break (Jenks) scheme in ArcGIS

3b. Quantiles: put an equal number of values in each class

Easy to calculate
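A quantile classification can be sketched by sorting the values and cutting the sorted list into groups of equal size. The data values and the function name quantile_classes are hypothetical, and the sketch does not treat tied values specially, so identical values could end up in different classes.

```python
def quantile_classes(data, n_classes):
    """Split the sorted data into n_classes groups with (nearly) equal counts."""
    ordered = sorted(data)
    n = len(ordered)
    groups = []
    for k in range(n_classes):
        start = k * n // n_classes        # index of the first value in class k
        end = (k + 1) * n // n_classes    # index just past the last value in class k
        groups.append(ordered[start:end])
    return groups

# Hypothetical data values: 12 observations, so 4 classes of 3 values each
values = [2, 3, 5, 8, 9, 12, 15, 21, 22, 30, 41, 58]
groups = quantile_classes(values, 4)
for g in groups:
    print(g)
```

The top class here runs from 30 to 58, which shows how quantiles can group quite different values together when the data are skewed.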



Quantiles in ArcView

Quantile scheme in ArcGIS

4. Serial schemes: class boundaries are defined by statistical or mathematical functions

Standard Deviation

Class boundaries determined by the mean and standard deviation

Normal distribution: values near the mean occur more often

Other distributions: not normal, with values more dispersed

Standard deviation: a measure of how dispersed a set of data is
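A standard deviation classification can be sketched with Python's statistics module. Placing breaks at the mean and at one standard deviation on either side is an assumption made for this example (other spacings, such as half standard deviations, are also used), and the data values and function name std_dev_breaks are hypothetical.

```python
from statistics import mean, pstdev

def std_dev_breaks(data, multiples=(-1, 0, 1)):
    """Return class boundaries at mean + k * standard deviation for each k in multiples."""
    m = mean(data)
    sd = pstdev(data)  # population standard deviation of the mapped values
    return [m + k * sd for k in multiples]

# Hypothetical data values with mean 5 and standard deviation 2
values = [2, 4, 4, 4, 5, 5, 7, 9]
breaks = std_dev_breaks(values)
print(breaks)
```

The resulting classes describe how far each value lies from the mean, so the scheme is easiest to interpret when the data are roughly normally distributed.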



Standard Deviation schemes in ArcView

Standard Deviation scheme in ArcGIS

5. Unclassified Schemes

"Unclassed" choropleth maps: the number of categories is equal to the number of data values

Each value has a unique symbol
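An unclassed scheme can be sketched by giving each value a shade in proportion to its position in the data range. The gray scale from 255 (white) to 0 (black), the data values, and the function name gray_level are all assumptions for illustration; with only 256 gray levels, values that are very close together may still share a shade.

```python
def gray_level(value, lo, hi):
    """Map a value linearly to a gray level: 255 (white) at lo, 0 (black) at hi."""
    fraction = (value - lo) / (hi - lo)
    return round(255 * (1 - fraction))

# Hypothetical data values
values = [2, 3, 5, 8, 9, 12, 15, 21, 22, 30, 41, 58]
lo, hi = min(values), max(values)
shades = [gray_level(v, lo, hi) for v in values]
print(shades)
```

Higher values get darker shades, and no class breaks are involved at all.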



Unclassified Schemes in ArcView or ArcGIS

Summary: data classification

1) categorical (nominal, ordinal) vs numerical (interval, ratio) data

2) number of classes

3) dividing up data: numerical classification

Change the classification scheme or number of classes and you get a different map

If more than one classification scheme is appropriate for the data distribution, select the scheme that best represents what you know about the actual data distribution.

