By many measures, Mexico City is in a terrible location. Wedged between volcanic mountains in a highland basin on top of a filled lakebed, the geography of the city leaves it vulnerable to earthquakes and volcanoes, as well as to cycles of flooding and drought. The city was once called the “most polluted city in the world,” in part because its topography traps smog in the urban core, leading to unhealthy levels of air pollution. The city is literally sinking, by as much as eight meters over the last century in some locations, because of a rapidly depleting underground aquifer that continues to be chronically overdrawn due to water scarcity. At least one journalist characterized the location as “the worst possible place to build a megacity.”
Why is Mexico City located in such a challenging environment? The standard explanation is historical persistence. What we now call Mexico City was built over the city of Tenochtitlan, the political capital of the Triple Alliance (Aztec Empire). When it was settled, the city’s location along a network of lakes was advantageous, enabling long-distance trade in food and other goods that could not be transported easily over mountainous terrain. Following the Conquest, the Spanish repurposed the pre-existing political and economic institutions of the Triple Alliance to consolidate colonial rule. The city became the center of Spanish colonial power in the Americas and, centuries later, the political capital of an independent Mexico. The lakes are now gone and the area’s geography has been transformed in numerous ways, but Mexico City remains the economic, political, and cultural center of the country.
These sorts of long-run persistence arguments are common in historical political economy (HPE) research. The Mexico City example illustrates the complex role that physical geography (an area’s climate, topography, soil type, access to water resources, and other locational features) plays in the evolution of past and present development patterns. Certain initial conditions led people to settle in the Valley of Mexico. Over time, both the physical geography of the Valley itself and its consequences for political and economic development changed. This physical geography shaped (and continues to shape) political institutions and outcomes over time, providing an empirical challenge for those of us who work on HPE research in this region.
In this post, I highlight a couple of the challenges faced by HPE researchers in addressing physical geography in empirical work. Adam Slez tackled some important geographical issues related to units of analysis, the boundary problem, and spatial dependence in a series of earlier posts. My focus here is narrower: When and why should we consider physical or natural geography in HPE research? How reliable are standard measures of physical geography when applied to the recent or distant past? Can we reasonably take physical geography to be exogenous?
Geography as confound, geography as cause
Even scholars who have no direct interest in geography have to contend with it in applied work as a potential confound. Nearly every political or economic outcome bears some relationship with geography. The standard concern is that one or more geographic characteristics of a location can independently affect both the explanatory variable and the outcome. For example, if we are interested in how past or present state centralization affects economic development, we need to consider how geographic fundamentals (for example the climate or river patterns that determine agricultural production technology) may have influenced both the incentives for state centralization and the subsequent pattern of economic growth.
The potential issues for inference are highlighted in a working paper by Morgan Kelly on studying historical persistence (which Adam referenced in his most recent post). In addition to the fundamental problem of spatial dependence that Adam discussed, Kelly shows that parameter estimates in many of the classic works on historical persistence are very sensitive to the inclusion of standard geographic controls. For example, the estimated negative effect of slave exports on present-day income in Nunn (2008) shrinks by nearly two-thirds when a measure of malaria vulnerability, which plausibly affects development directly, is included. Similarly, Kelly finds that the size of the estimated effect of pre-colonial institutions on contemporary development in Michalopoulos and Papaioannou (2013) declines considerably when a control for distance to the equator is added. Kelly highlights these papers not because they were done poorly (on the contrary, these are highly influential works published in some of the top journals of our discipline), but because they illustrate that the common pitfall of omitting important geographic variables can affect even exceptional work in this field.
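To make the mechanics of this sensitivity concrete, here is a minimal simulation sketch. It does not reproduce Kelly's analysis or any of the papers above: the data are simulated, and the variable names (geo, treatment, outcome) and all coefficient values are assumptions chosen purely for illustration. It shows how an estimated effect can shrink once a geographic variable that drives both the explanatory variable and the outcome is added as a control.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients for y on X (an intercept column is added)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 5000

# Hypothetical geographic confounder (e.g. a disease-suitability index).
geo = rng.normal(size=n)
# Historical "treatment" that is partly driven by geography.
treatment = 0.8 * geo + rng.normal(size=n)
# Assumed true treatment effect; geography also affects the outcome directly.
true_effect = -0.3
outcome = true_effect * treatment - 0.6 * geo + rng.normal(size=n)

# Coefficient on the treatment without and with the geographic control.
naive = ols(outcome, treatment)[1]
controlled = ols(outcome, np.column_stack([treatment, geo]))[1]

print(f"without geographic control: {naive:.3f}")
print(f"with geographic control:    {controlled:.3f}")
```

In this setup the uncontrolled estimate overstates the negative effect because it absorbs the direct influence of geography on the outcome. Real applications are harder, since the relevant geographic variable may be unmeasured or measured with error.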
Studying geography as a “cause” of later political or economic outcomes has its own set of issues. Here a challenge can be illustrating that the hypothesized channel linking geography and outcomes is the correct one. One of the things, for example, that makes the Bleakley and Lin (2012) paper on path dependence in urban development in the United States compelling is the clear reasoning that they give for why their geographic variable of interest—portage sites between navigable waterways—was instrumental in determining early patterns of population concentration but should have a negligible direct effect on economic behavior today due to changing transport technology. Ruling out alternative causal channels is especially important for research designs that use geographic factors as an exogenous source of variation to predict past or present outcomes. A common critique of the influential Acemoglu, Johnson, and Robinson (2001) article examining the link between institutions and economic development is that the geographic instrument they use to predict differences in institutions, colonial settler mortality, captures factors like average temperature that shape the disease environment and plausibly also have direct effects on institutions and development.
Another challenge when studying geography over a long period of time is that the way in which geographic factors shape political and economic outcomes can change over time. One of the things that Jennifer Alix-Garcia and I explore in a recent paper is how the effect of geography on urbanization in Mexico changed alongside shifts in technology and policy. A good example is the effect of distance to the U.S. border. Prior to the 20th century, being near the U.S. had no discernible effect on urbanization conditional on other geographic covariates. It would have been strange if we had found that proximity to the border shaped early urban development given that no international border existed in that location until the 19th century. However, as the 20th century progressed and Mexico-U.S. trade grew in economic importance, this factor became an important determinant of population concentration. Jen and I were interested in the substantive question of how the effect of geographic fundamentals changes over time, but the shifting effect of geography has methodological implications as well. For example, it highlights the importance of allowing the relationship between geography and outcomes to change flexibly over time.
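One simple way to let the relationship change over time is to interact a time-invariant geographic covariate with period indicators, so that each period gets its own slope. The sketch below uses simulated data only; it is not the specification from the paper mentioned above, and the periods, slopes, and the standardized "distance to the border" variable are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_places, periods = 2000, [1800, 1900, 2000]

# Hypothetical time-invariant covariate: standardized distance to the border.
dist = rng.normal(size=n_places)
# Assumed slopes for the simulation: no effect early, a negative effect later.
true_slopes = {1800: 0.0, 1900: -0.1, 2000: -0.5}

rows_y, rows_X = [], []
for j, year in enumerate(periods):
    y = true_slopes[year] * dist + rng.normal(size=n_places)
    # Columns: one dummy per period, then distance interacted with each dummy.
    X = np.zeros((n_places, 2 * len(periods)))
    X[:, j] = 1.0
    X[:, len(periods) + j] = dist
    rows_y.append(y)
    rows_X.append(X)

y_all = np.concatenate(rows_y)
X_all = np.vstack(rows_X)
beta, *_ = np.linalg.lstsq(X_all, y_all, rcond=None)

# Period-specific slopes on distance are the last len(periods) coefficients.
slopes = dict(zip(periods, beta[len(periods):]))
for year in periods:
    print(year, round(float(slopes[year]), 3))
```

A pooled regression with a single distance coefficient would average over these periods and could hide both the early null and the later effect.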
Measuring locational fundamentals in the past
Separate from any of these conceptual issues is the more fundamental issue of measurement. If it is important to “control for” or otherwise consider geographic features in empirical work, the problem for some HPE research becomes finding a way to measure these things over a long timescale. Most standard geographic measures that we have access to today—such as elevation, soil type, surface water access, or climate variability—capture current or recent conditions. This is not always a problem. For example, the location of the Rocky Mountains or the Sierra Madre range has not appreciably changed over the history of human settlement. Soil type typically changes slowly as well, though this depends on the component of soil quality under study.
The issue of timescale is more problematic for other natural features. Returning to the example of Mexico City, it is easy to download geographic data on surface water coverage (the presence of lakes, rivers, streams, etc.) from the government, but this would be a poor measure of where water resources were located in the 16th century. Plagued by chronic flooding in the early colonial period, colonial authorities undertook a massive public works project to drain the lakes at no small economic and human cost. This project, the desagüe, completely reshaped the hydrology of the Mexico City basin with important effects on water access to the present day.
Many geographic features fall somewhere in between the extreme examples of the location of the Rocky Mountains or water access in the defunct lake system around Mexico City. Consider the Food and Agriculture Organization’s Global Agro-Ecological Zones datasets. These data provide a measure of potential yields of different crops under various assumptions about technology, irrigation access, and land management strategy. Pictured above is the raster of the potential productivity of low-input, rainfed maize for North America, where the red side of the scale indicates higher potential yields and the green side lower. This is clearly not a measure of where maize is actually grown, but rather areas that are theoretically suitable for maize cultivation based on climate, soil quality, elevation, ecology, and other factors. Whether this is a good measure of past as well as present potential agricultural productivity depends on whether we believe that these climatic and geographic factors, as well as the technology for growing low-input rainfed maize, have changed over time. I think that this is probably a reasonable approximation, though there are certainly indications that climatic patterns and soil quality have changed in Mexico over the last century.
How exogenous is physical geography?
A final issue, related to the discussion of measurement above, is the assumption that geographic features are necessarily exogenous or determined independently of social or political variables. The example of Mexico City’s desagüe illustrates that humans can, and often do, reshape their environment in fundamental ways. This may be an extreme example, but the millions of smaller-scale regrading projects, irrigation projects, constructed dams, and manufactured canals or waterways have cumulatively changed the physical environment in major ways. The Central Valley of California, for example, is now an agricultural powerhouse but only became so after policymakers and engineers fundamentally changed the hydrology of the entire region. Human settlement can also affect natural geography in more subtle ways. The cultivation of maize, for example, can independently affect local humidity, precipitation, and temperature.
There is no perfect way to address all of these issues, but there are steps that researchers can take to mitigate potential problems. As always, there is value in thinking carefully about the determinants of our main explanatory and outcome variables and potential confounds. It can often be helpful to map and analyze key variables to examine spatial patterns, potential correlation, and potential mismeasurement. In addition to the classic geographic controls like elevation or climate, Kelly’s paper shows that it can be helpful to control for continuous geographic measures like latitude and longitude to capture north-south or east-west gradients in factors that we can’t measure directly. Most importantly, it is worth thinking about how fragile or robust our findings are to various assumptions about how geography might independently shape outcomes.
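As a rough illustration of the last two suggestions, the sketch below re-estimates a coefficient of interest under every combination of latitude and longitude controls. The data are simulated, and the coordinate ranges, the "unobserved" factor, and all coefficients are assumptions made for this example. The unobserved factor has a smooth north-south gradient, so latitude can absorb part of it but not all of it.

```python
import numpy as np
from itertools import combinations

def coef_on_first(y, cols):
    """OLS coefficient on the first regressor in cols (intercept added)."""
    X = np.column_stack([np.ones(len(y))] + cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

rng = np.random.default_rng(2)
n = 4000

# Hypothetical coordinates for simulated locations.
lat = rng.uniform(15, 32, size=n)
lon = rng.uniform(-115, -90, size=n)
# Unmeasured factor with a north-south gradient plus local noise.
unobserved = 0.2 * (lat - lat.mean()) + rng.normal(scale=0.5, size=n)
# Explanatory variable and outcome both depend on the unmeasured factor.
x = 0.5 * unobserved + rng.normal(size=n)
true_effect = 0.4
y = true_effect * x + 1.0 * unobserved + rng.normal(size=n)

# Estimate the coefficient on x under each set of coordinate controls.
controls = {"latitude": lat, "longitude": lon}
results = {}
for k in range(len(controls) + 1):
    for names in combinations(controls, k):
        cols = [x] + [controls[name] for name in names]
        results[names] = coef_on_first(y, cols)

for names, b in results.items():
    label = ", ".join(names) if names else "no controls"
    print(f"{label:>22}: {b:.3f}")
```

Here the estimate moves toward the assumed true effect once latitude is included but does not reach it, because the local component of the unobserved factor is not captured by a smooth gradient. A coefficient that moves a lot across such specifications is a signal to think harder about what geography is doing in the design.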