Jenks Natural Breaks Classification

The Jenks Natural Breaks Classification (or Optimization) system is a data classification method designed to optimize the arrangement of a set of values into "natural" classes. A natural class is the best class range found "naturally" in a data set: a range composed of items with similar characteristics that form a "natural" group within the data.

This classification method seeks to minimize the average deviation of each value from its class mean while maximizing the deviation of each class mean from the means of the other classes. In other words, the method reduces the variance within classes and maximizes the variance between classes. The quality of a classification is measured by the goodness of variance fit (GVF), which is the difference between SDAM (the sum of squared deviations from the array mean) and SDCM (the sum of squared deviations from the class means), divided by SDAM: GVF = (SDAM - SDCM) / SDAM. GVF ranges from 0 (worst fit) to 1 (perfect fit).
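As an illustration, the goodness of variance fit can be computed directly from its definition, GVF = (SDAM - SDCM) / SDAM. The following is a minimal sketch in Python. The function names and the sample values are hypothetical and chosen only for this example, and the code assumes the values are not all identical (otherwise SDAM is zero).

```python
def sum_squared_deviations(values):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def goodness_of_variance_fit(classes):
    """GVF = (SDAM - SDCM) / SDAM for a list of classes (each a list of numbers)."""
    all_values = [v for cls in classes for v in cls]
    sdam = sum_squared_deviations(all_values)                    # deviations from the array mean
    sdcm = sum(sum_squared_deviations(cls) for cls in classes)  # deviations from the class means
    return (sdam - sdcm) / sdam                                  # undefined when sdam == 0


# Hypothetical data: the same nine values grouped in two different ways.
good = [[1, 2, 3], [10, 11, 12], [50, 51, 52]]
poor = [[1, 2, 3, 10], [11, 12, 50], [51, 52]]

print(goodness_of_variance_fit(good))  # close to 1: breaks fall in the large gaps
print(goodness_of_variance_fit(poor))  # noticeably lower: classes straddle the gaps
```

A grouping whose breaks coincide with the large gaps in the data yields a GVF near 1, while a grouping that splits or merges the natural groups scores lower.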

Background
George Frederick Jenks was a professor at the University of Kansas from 1949 to 1986. He developed the Cartography Department there, and he primarily developed statistical methods for choropleth mapping. His "Natural Breaks" method attempts to group data into classes in the most accurate way. It is used broadly by cartographers when depicting ordinal data with about seven or fewer breaks, or classifications. Although the computation can become very long for large data sets, the method is successful at reducing the amount of deceptive information on a map.

The Jenks scheme determines the best arrangement of values into classes by iteratively comparing the sums of squared differences between the observed values within each class and the class means. The best classification identifies the breaks in the ordered distribution of values that minimize the within-class sum of squared differences.
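The objective described above, choosing breaks in the sorted values so that the total within-class sum of squared differences is as small as possible, can be sketched in Python. Note that this sketch is not Jenks's original iterative procedure: it swaps in an exhaustive dynamic-programming search over break positions, which reaches the same minimum for small data sets. The function names and sample data are hypothetical.

```python
def within_class_ssd(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the mean for the sorted values data[i:j]."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def natural_breaks(values, n_classes):
    """Split values into n_classes contiguous groups minimizing within-class SSD."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and the number of values")

    # Prefix sums allow the SSD of any contiguous run to be computed in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for v in data:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    inf = float("inf")
    # cost[k][j]: smallest total SSD when the first j values form k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):  # the last class is data[i:j]
                c = cost[k - 1][i] + within_class_ssd(prefix, prefix_sq, i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Walk the recorded split points backwards to recover the classes.
    classes, j = [], n
    for k in range(n_classes, 0, -1):
        i = split[k][j]
        classes.append(data[i:j])
        j = i
    classes.reverse()
    return classes


# Hypothetical, unsorted sample values.
print(natural_breaks([50, 1, 11, 2, 52, 10, 3, 51, 12], 3))
```

The search takes time proportional to the number of classes times the square of the number of values, which is consistent with the observation that the computation becomes long for large data sets.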

Jenks’ goal in developing this method was to create a map that was absolutely accurate in its representation of the data’s spatial attributes. By following this process, Jenks claims, the “blanket of error” can be uniformly distributed across the mapped surface. He developed the method with the intention of using relatively few data classes (fewer than seven) because that was the limit when using monochromatic shading on a choropleth map. In a publication in the Annals of the Association of American Geographers, Jenks and Caspall state that "readers are unable to discriminate between patterns when more than ten or eleven are used on a choroplethic representation"; thus, Jenks and Caspall used five classes on their maps, making it easier for readers to differentiate between the classes. Also, unlike the optimal method, which uses a numerical measurement to separate data classes objectively, the Jenks natural breaks method classifies data subjectively.

Using Jenks Classification in GIS
Cartographers and map makers can utilize the Jenks method to identify logical break points in a data set by grouping similar values that "minimize differences between data values in the same class and maximize the differences between classes." The features are divided into classes whose boundaries are set where there are relatively big jumps in the data values. When making choropleth maps, the Jenks classification method can be advantageous because it identifies real classes within the data. Choropleth maps that use this method will accurately portray trends found in the data. However, users should note that the Jenks classification is not recommended for data that have a low variance.
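Once the break points are known, each map feature is assigned to the class whose range contains its value. The following Python sketch shows one way to do this with the standard library's bisect module. The feature names, values, and class upper bounds are hypothetical, and the sketch assumes each class includes its upper bound.

```python
from bisect import bisect_left


def assign_classes(values_by_feature, upper_bounds):
    """Map each feature name to the index of the class that contains its value.

    upper_bounds must be sorted in ascending order; class i covers values
    greater than upper_bounds[i - 1] and up to and including upper_bounds[i].
    A value above the last bound gets the index len(upper_bounds).
    """
    return {
        name: bisect_left(upper_bounds, value)
        for name, value in values_by_feature.items()
    }


# Hypothetical percentages for four regions and three class upper bounds.
regions = {"A": 2.0, "B": 11.5, "C": 51.0, "D": 12.0}
print(assign_classes(regions, [3, 12, 52]))
```

The resulting class index can then be used to look up the shade for each feature on the choropleth map.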

In the legend of the example map, note the variance in the range of percentage values of the groups in the map. The Jenks natural breaks are used to provide a more meaningful visualization of map data based on the "natural breaks" in the data identified by the iterative process. Other methods of data classification used in GIS include Natural Breaks (without Jenks Optimization), Equal Interval, Defined Interval, Geometric Interval, Quantile, and Standard Deviation.