An ecological fallacy, often called an ecological inference fallacy, is an error in the interpretation of statistical data in an ecological study, whereby inferences about the nature of specific individuals are based solely upon aggregate statistics collected for the group to which those individuals belong. This fallacy assumes that individual members of a group have the average characteristics of the group at large. Stereotyping is one form of ecological fallacy, because it assumes that groups are homogeneous. For example, if a particular group of people is measured to have a lower average IQ than the general population, it is an error to assume that all or most members of that group have a lower IQ than the general population. For any given individual from that group, the group average alone cannot tell us whether that person has a below-average, average, or above-average IQ compared to the general population.
Suppose a study shows that people from City A score higher on the SAT, on average, than people from City B. Assuming that a randomly selected individual from City A would have scored higher on the SAT than a randomly selected individual from City B would be an ecological fallacy. Since the SAT scores given in the study were averages (means) and not medians, we know nothing about the distribution of the scores in the two cities. It could well be that the randomly selected individual from City A scored lower than the individual from City B.
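The point can be made concrete with a small sketch. The scores below are invented for illustration and are not real SAT data, and the helper name `prob_first_higher` is hypothetical. City A has the higher mean, yet a randomly chosen individual from City A outscores a randomly chosen individual from City B in only a minority of pairings.

```python
from itertools import product
from statistics import mean, median

# Hypothetical scores, invented purely for illustration (not real SAT data).
# City A: most people score low, a few score very high.
city_a = [900] * 7 + [1600] * 3
# City B: everyone scores the same moderate value.
city_b = [1050] * 10


def prob_first_higher(xs, ys):
    """Exact probability that a random member of xs outscores a random member of ys."""
    wins = sum(1 for x, y in product(xs, ys) if x > y)
    return wins / (len(xs) * len(ys))


print("mean   A, B:", mean(city_a), mean(city_b))      # A has the higher mean
print("median A, B:", median(city_a), median(city_b))  # A has the lower median
print("P(A individual > B individual):", prob_first_higher(city_a, city_b))
```

Under these assumed numbers, the higher group average for City A comes entirely from three high scorers, so the average says little about what to expect from any one person.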
If a particular sports team is described as performing poorly, it would be fallacious to conclude that each player on that team performs poorly. Because the performance of the team depends on each player, one excellent player and two terrible players may average out to three poor players. This does not diminish the excellence of the one player.
In the United States presidential elections of 2000, 2004 and 2008, wealthier states (states with higher per capita incomes) tended to vote Democratic and poorer states tended to vote Republican. Yet wealthier voters tended to vote Republican and poorer voters tended to vote Democratic. For example, in 2004, the Republican candidate, George W. Bush, won the 15 poorest states, and the Democratic candidate, John Kerry, won 9 of the 11 wealthiest states. Yet 62% of voters with annual incomes over $200,000 voted for Bush, whereas only 36% of voters with annual incomes of $15,000 or less did.
The ecological fallacy was discussed in a court challenge to the 2004 Washington gubernatorial election, in which a number of illegal voters were identified after the election; how they had voted was unknown because the vote was by secret ballot. The challengers argued that illegal votes cast in the election would have followed the voting patterns of the precincts in which they had been cast, and thus adjustments should be made accordingly. An expert witness said this approach was like trying to figure out Ichiro Suzuki's batting average by looking at the batting average of the entire Seattle Mariners team, since the illegal votes were cast only by males and the overall precinct votes included both males and females. The judge determined that the challengers' argument was an ecological fallacy, and rejected it.
Origin of concept
The term comes from a 1950 paper by William S. Robinson. For each of the 48 states in the US as of the 1930 census, he computed the literacy rate and the proportion of the population born outside the US. He showed that these two figures were positively correlated, with a correlation of 0.53; in other words, the greater the proportion of immigrants in a state, the higher its average literacy. However, when individuals were considered, the correlation was −0.11: immigrants were on average less literate than native citizens. Robinson showed that the positive correlation at the level of state populations arose because immigrants tended to settle in states where the native population was more literate. He cautioned against drawing conclusions about individuals on the basis of population-level, or "ecological", data.
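The reversal Robinson described can be reproduced with a toy data set. The sketch below does not use his 1930 census figures; the three regions and their counts are hypothetical, chosen so that immigrants are less literate than natives within every region while regions with more immigrants have higher overall literacy. The `pearson` helper is a plain implementation of the Pearson correlation coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Hypothetical regions (not Robinson's data). Each tuple holds counts of:
# (foreign-born literate, foreign-born illiterate, native literate, native illiterate)
regions = {
    "X": (4, 6, 63, 27),
    "Y": (18, 12, 63, 7),
    "Z": (40, 10, 50, 0),
}

foreign, literate = [], []               # one 0/1 entry per individual
share_foreign, share_literate = [], []   # one entry per region

for fl, fi, nl, ni in regions.values():
    total = fl + fi + nl + ni
    foreign += [1] * (fl + fi) + [0] * (nl + ni)
    literate += [1] * fl + [0] * fi + [1] * nl + [0] * ni
    share_foreign.append((fl + fi) / total)
    share_literate.append((fl + nl) / total)

ecological_r = pearson(share_foreign, share_literate)  # region-level correlation
individual_r = pearson(foreign, literate)              # person-level correlation

print("ecological correlation:", round(ecological_r, 2))
print("individual correlation:", round(individual_r, 2))
```

With these assumed counts the region-level correlation is strongly positive while the individual-level correlation is negative, which is the same qualitative pattern as in Robinson's paper.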
Hasty generalization is a fallacy that is the direct opposite of the ecological fallacy. It occurs when one makes assumptions about a whole group based on insufficient data. Insufficient data often means a small sample that is applied to a large population, though non-random samples can also lead to hasty generalization. For example, if one made a generalization about an entire state based only on a few cities of that state, then one would be making a hasty generalization. In GIS, hasty generalization can occur when an entire enumeration unit is represented by only a few data points, which may not represent the entire polygon.
Ecological Fallacy in GIS
Ecological fallacies can be generally grouped into a few divisions. They include:
- Confusion between aggregate correlations and individual correlations
- Confusion between the group average and the likelihood of the individuals to reflect that average
- Simpson's paradox
- Confusion between a group having a higher average of something and that group having a higher likelihood of it occurring in the future
Though ecological fallacies can fall into several different groups, in GIS the main issue results from the second problem. It is easy to look at a map and deduce that an individual within a group (e.g. one house in a city) will show the same value or trend as the average of all the individuals in that group. This issue arises often because most maps use aggregate data. The most common aggregate data used in GIS is census data. Census data is easy to find and easy to map, but many overlook the fact that it is very susceptible to ecological fallacy. Thus, when using GIS to make or analyze maps, it is important to remember that a pattern shown in one area of the map does not mean that every individual in that area will exhibit the same pattern; it is merely a generalized reflection of group trends as a whole. Many individuals viewing maps are not likely to understand or know about this fallacy. Consequently, it is necessary to keep the audience in mind when designing a map. Using the smallest groupings available of the variable being mapped is a way to minimize variation within each area and to discourage readers from adopting fallacious interpretations based on a map. The dependence of mapped patterns on the size and shape of the units chosen is known as the Modifiable Areal Unit Problem (MAUP).
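The effect of using smaller groupings can be illustrated with a short sketch. The household values and the unit names (`tract_1`, `block_a`, and so on) are hypothetical, and `within_unit_variance` is an illustrative helper, not a standard GIS function. It computes the size-weighted average variance inside each unit, that is, how far individuals typically sit from the value their unit would show on a map.

```python
from statistics import pvariance

# Hypothetical household values, nested as city -> tracts -> blocks.
city = {
    "tract_1": {"block_a": [20, 22, 25], "block_b": [60, 65, 70]},
    "tract_2": {"block_c": [30, 32, 35], "block_d": [90, 95, 110]},
}


def within_unit_variance(units):
    """Size-weighted average of the population variance inside each unit."""
    total = sum(len(u) for u in units)
    return sum(len(u) * pvariance(u) for u in units) / total


# The same households aggregated at three different levels.
blocks = [vals for tract in city.values() for vals in tract.values()]
tracts = [[v for vals in tract.values() for v in vals] for tract in city.values()]
whole_city = [[v for t in tracts for v in t]]

v_city = within_unit_variance(whole_city)
v_tracts = within_unit_variance(tracts)
v_blocks = within_unit_variance(blocks)

for name, v in [("city", v_city), ("tracts", v_tracts), ("blocks", v_blocks)]:
    print(f"{name:>6}: within-unit variance = {v:.1f}")
```

In this assumed data the within-unit variance shrinks as the units get smaller, so a block-level map misrepresents individual households less than a city-level one. Smaller units reduce the variation hidden by each mapped value; they do not remove it.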
See also
- Ecological correlation
- Modifiable areal unit problem
- Prosecutor's fallacy
- Sampling (statistics)
- Simpson's paradox
- Statistical discrimination
References
- Gelman, Andrew; Park, David; Shor, Boris; Bafumi, Joseph; Cortina, Jeronimo (2008). Red State, Blue State, Rich State, Poor State. Princeton University Press. ISBN 978-0-691-13927-2.
- Howland, George Jr. (May 18, 2005). "The Monkey Wrench Trial: Dino Rossi's challenge of the 2004 election is on shaky legal ground. But if he prevails, watch litigation become an option in close races everywhere". Seattle Weekly. http://www.seattleweekly.com/features/0520/050518_news_election.php.
- Borders et al. v. King County et al., transcript of the decision by Chelan County Superior Court Judge John Bridges, June 6, 2005, published: June 8, 2005.
- Robinson, W.S. (1950). "Ecological Correlations and the Behavior of Individuals". American Sociological Review 15: 351–357. doi:10.2307/2087176.
- Diamond, L. (2013). Geographic Data Assumptions: MAUP and Ecological Fallacies. GIS Collective. Retrieved from: http://giscollective.org/geographic-data-assumptions-maup-and-ecological-fallacies/
- Tranmer, M.; Steel, D.G. (May 1998). "Using Census Data to Investigate the Causes of the Ecological Fallacy". Environment and Planning A 30: 817–831.