
The effect of air pollution on China's internal migration

Published online by Cambridge University Press:  06 January 2023

Wenbo Li*
Affiliation:
Ma Yinchu School of Economics, Tianjin University, Tianjin, China
*Corresponding author. E-mail: [email protected]

Abstract

Have people in China moved from more polluted cities to less polluted ones? We merge city-level air pollution data from 2003 to 2016 with migration data from a nationally representative sample. We estimate a linear model and a conditional logit model, and employ air pollution from distant sources carried by the wind as an instrument for local air pollution to address the potential concern that air pollution is endogenous to local economic activities. We make a distinction between out-migration that left some family members behind and whole-household out-migration, and discover that the former was more responsive to air pollution than the latter. The decline in net in-migration in response to an increase in air pollution was driven by both a decrease in gross in-migration and an increase in gross out-migration. We find suggestive evidence that out-migrants brought their children with them, but some aged parents were left behind.

Type
Research Article
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

1. Introduction

Air pollution in China is severe and harms health. Among the 20 cities with the worst air quality globally in 2016, four are located in China (World Health Organization, 2016). China's increased industrial activities and rising number of vehicles are the two primary emission sources to blame. A coal-rich country, China depends primarily on coal for its electricity production and winter heating, which aggravates its air quality problem. Air pollution is now recognized as a growing threat to the cardiopulmonary health of people living in China. The 2017 Global Burden of Disease Study suggests that air pollution is the fourth leading health risk factor contributing to deaths and disability-adjusted life-years for Chinese people (Zhou et al., 2019). The health impact of air pollution has also been documented in other countries, but at much lower levels of air pollution. For example, Chay and Greenstone (2003) show that a 1 per cent reduction in total suspended particulates caused a 0.35 per cent decline in the infant mortality rate in the United States between 1980 and 1982.Footnote 1

An important feature of air pollution shocks is that they vary spatially across cities due to their distinct industrial composition, environmental policies, and geographic and meteorological conditions. This spatial variation creates incentives for people to migrate in order to avoid the negative impacts of these adverse shocks. There are two reasons why ignoring the effect on migration may understate the true impact of air pollution. First, if people move from more polluted cities to less polluted ones, migration will have reduced the impact of air pollution shocks on people's health outcomes. It also implies that, if residents do not have the option to migrate, they may find alternative ways to cope with air pollution, including pressuring local authorities to curb it. Because the current level of pollution-induced migration is revealed to be preferred to no migration, the existing migration caused by air pollution is welfare-improving relative to no migration. Second, the forced displacement of people due to air pollution may distort the population distribution across regions in China. This distortion may also depend on age, education, income, etc.

In this paper, we address the question of whether people have moved from more polluted cities to less polluted cities in China. It is unclear, a priori, whether air pollution impacts migration. On the one hand, migration allows an opportunity for migrants to accrue additional health capital. On the other hand, it is possible that air pollution does not affect people's decision-making in a developing country such as China, as the income level there is not adequately high. Thus, the effect of air pollution on migration is an empirical question.

Ideally, to answer this question empirically, we would analyze both the effect of air pollution in the origin on out-migration and the effect of air pollution in the destination on in-migration using a common dataset that: (1) documents both the origin and the destination of each migration episode, (2) records both seasonal and long-term migration, and (3) covers a long timespan and a broad geographic area. As we do not have such a dataset, we utilize two pieces of information on migration from a nationally representative sample. Specifically, we conduct the first part of our analysis using household-level migration data that record both seasonal and long-term migration from 2014 to 2016. We perform the second part of our analysis using individual-level migration data that only record migration episodes that lasted for more than six months from 2003 to 2010. Since long-term migrants would expect to benefit from improvements in air quality for longer, we expect them to be more responsive to air pollution than seasonal migrants. Thus, the second part of our analysis, which focuses on migration episodes that lasted for more than six months, could reveal a larger effect.

Methodologically, we adopt remote-sensing satellite data on the annual average concentrations of particulate matter with a diameter of less than 2.5 micrometers ($PM_{2.5}$) in each city. We first estimate a linear model to explore the effect of the average $PM_{2.5}$ concentrations in the origin on out-migration. Then, we estimate a conditional logit model to study the effect of air pollution on location choice. The advantage of the conditional logit model is that it captures the relative characteristics of a place, and thus allows for a role of both the origin and the destination. In both models, we employ instrumental variables (IV) strategies to address the potential concern that air pollution is endogenous to local economic activities. Part of the air pollution in an area is carried from distant sources by the wind, so we use air pollution from distant sources carried by the wind as an instrument for local air pollution. To incorporate an IV framework in the conditional logit model, we estimate the conditional logit model through generalized method of moments (GMM), following Train (2009: 326).

We compare the type of out-migration that left some family members behind (hereafter ‘partial out-migration’) with whole-household out-migration. We find that the former type of out-migration was more responsive to air pollution than the latter. In particular, a one standard deviation increase in the average $PM_{2.5}$ concentrations increased the probability of having a migrant in the household by 20 percentage points, whereas air pollution either did not affect or slightly increased the probability that the household moved away. To explain why partial out-migration was more responsive to air pollution than whole-household out-migration, we examine whether sensitivity to air pollution differed across age groups, which includes studying the people left behind by the out-migrants. Notably, we find that a one standard deviation increase in the average $PM_{2.5}$ concentrations reduced the probability that a child was present in the origin household by 8 percentage points, but had no effect on the probability that an elderly person was present in the origin household. Moreover, we find that migrants were less likely to choose a more polluted city, and residents were more likely to leave such a city. Thus, the decline in net in-migration as a result of an increase in air pollution was driven by both a decline in gross in-migration and an increase in gross out-migration. Furthermore, the conditional logit model predicts that a hypothetical one standard deviation increase in the average $PM_{2.5}$ concentration in each person's current location would induce 24 million people in China to leave their current locations; migration would have extended each person's life expectancy by 3.5 weeks.

Our paper contributes to the extant literature in three ways. First, our paper contrasts partial out-migration with whole-household out-migration. Existing studies have investigated the effect of air pollution on migration (Banzhaf and Walsh, 2008; Bayer et al., 2009; Sullivan, 2016; Qin and Zhu, 2018; Chen et al., 2021, 2022; Khanna et al., 2021), and among them, Khanna et al. (2021) and Chen et al. (2022) have explored the long-term effect of air pollution on internal migration in China. Moreover, Li and Zhang (2019), Luo et al. (2019), Sun et al. (2019), and Wang et al. (2021) have extended this discussion in the Chinese-language literature. Nevertheless, on the one hand, Khanna et al. (2021) define out-migration as leaving one's Hukou (or Household Registration System) city, and Luo et al. (2019) similarly define out-migration as applying for a patent from a different city than one previously did; these definitions of out-migration cover both partial and whole-household out-migration. On the other hand, Wang et al. (2021) specify households of short-term out-migrants as households with minimal electricity consumption, a definition that covers only whole-household out-migration. Comparing partial out-migration and whole-household out-migration is critical, because family separation can impose psychological costs on the migrants and the family members left behind. In principle, partial out-migration could respond to air pollution more than whole-household out-migration. This is because the strong attachment to their origin would render partial out-migrants more likely to be circular migrants. To the extent that circular migrants had more information regarding a reduction in air pollution in their origins, the partial migrants in our sample could be more sensitive to the air pollution in their origins. Circular migrants constitute a substantial fraction of migrants in China (Hu et al., 2011). Hence, comparing partial out-migration and whole-household out-migration is both novel and empirically important.

Second, while Graves and Waldman (1991), Sun et al. (2019), and Chen et al. (2022) study heterogeneity by age groups, we expand the scope of this analysis to include children under the age of 15, an age group that is potentially more vulnerable to air pollution. This inclusion is critical, because the Hukou system in China makes it exceedingly costly for migrants to bring their children with them, but at the same time, parents may want to invest in their children's health capital. Thus, parents may place more weight on the air pollution in their origin when they decide whether or not to bring their children with them. In addition, unlike previous studies conducted at the county level (Graves and Waldman, 1991; Chen et al., 2022) or at the individual level (Sun et al., 2019), the corresponding part of our analysis is conducted at the household level. Thus, to the extent that migration decisions of family members were jointly made by a household, our approach accounts for the bias caused by the fact that some households had more than one child and the possibility that these households might respond more or less to air pollution than households with an only child. We find that air pollution in the origin reduced the probability of a child's presence in the origin household, but did not affect the probability of an elderly person's presence in the origin household. As a result, air pollution would have changed the spatial distribution of the population by age through migration. This change affects the long-term wellbeing of the origin cities and the provision of public goods in the destination.

Third, our paper differentiates whether air pollution caused out-migration or whether it ‘only’ determined a destination once an individual had already decided to migrate. This distinction reveals whether air pollution caused gross out-migration or whether it ‘only’ reduced gross in-migration. The extant literature posits that an increase in air pollution in a city caused a decline in net in-migration (Bayer et al., 2009; Khanna et al., 2021), an increase in net out-migration (Chen et al., 2022), and a decline in gross in-migration (Li and Zhang, 2019; Sun et al., 2019; Chen et al., 2022), but these net and gross population flows could be driven exclusively by a decline in gross in-migration with no accompanying increase in gross out-migration, if air pollution did not cause out-migration, but only determined a destination once an individual had already decided to migrate. This is possible if the preferences of individuals changed when they acquired more information once they had decided to leave a city. This is also possible if individuals only considered the levels of air pollution in their destination relative to those in their origin; in particular, individuals became accustomed to the levels of air pollution in their current city, but moving to a more polluted city required costly adaptation, so they would only move to an equally or less polluted city. Making the distinction between gross out-migration and gross in-migration is vital, because workers might accrue city-specific human capital over time, and only if air pollution caused gross out-migration do cities need to improve their air quality in order to retain talent.Footnote 2 In this paper, we make this distinction in two ways. First, we directly observe out-migration in the survey data we adopt, although in most existing studies, including Chen et al. (2022), this is not the case. Second, in the conditional logit model, our paper explores whether a person chose his/her current city again as an additional source of heterogeneity. We find in both cases that air pollution did cause gross out-migration, so the net and gross population flows in response to an increase in air pollution as observed in the existing literature were driven by both a decline in gross in-migration and an increase in gross out-migration.

The rest of the paper is organized as follows. Section 2 describes the data and section 3 illustrates the empirical strategy. Section 4 reports the results and section 5 concludes. We provide the background of the analysis, including discussions on the Hukou system, a comparison between the causes and consequences of partial out-migration and those of whole-household out-migration, the overall trend of air pollution in China, and the choice of $PM_{2.5}$ as our air pollution measure, in the online appendix.

2. Data

2.1. Air quality

Existing studies have documented the lack of reliability regarding the Air Pollution Index data from land-based monitoring stations officially published by the Ministry of Environmental Protection of China from 2000 to 2013 (Andrews, 2008; Chen et al., 2012; Ghanem and Zhang, 2014), and have instead demonstrated that remote-sensing satellite data can measure air pollution relatively well (Kumar et al., 2011). For this reason, we adopt the remote-sensing satellite data developed by Van Donkelaar et al. (2019) and Hammer et al. (2020) to measure $PM_{2.5}$ concentrations.

Remote-sensing satellite data depend primarily on aerosol optical depth (AOD), which measures surface reflectances. In the dataset we adopt, the AOD is retrieved from radiances measured by four satellite instruments: twin MODIS (MODerate Resolution Imaging Spectroradiometer) instruments, which are onboard the polar-orbiting Terra and Aqua satellites and have provided daily global coverage; the MISR (Multiangle Imaging Spectroradiometer) instrument, which is onboard the Terra satellite and has provided global coverage once a week; and the SeaWiFS (Sea-Viewing Wide Field-of-View Sensor) instrument, which was onboard the SeaStar satellite and offered daily global coverage. Van Donkelaar et al. (2019) and Hammer et al. (2020) apply a chemical transport model, whose simulation is driven by meteorological data, to determine surface $PM_{2.5}$ concentrations using AOD and other inputs such as land-based monitoring station readings. The calculated $PM_{2.5}$ concentrations are highly consistent with out-of-sample cross-validated land-based monitoring station readings with an $R^{2}$ of 0.90-0.92 and a slope of 0.90-0.97. $PM_{2.5}$ concentrations are aggregated to the city level by taking a simple average of $PM_{2.5}$ concentrations within the city boundary. Our adopted data are the annual averages of city-level $PM_{2.5}$ concentrations from 2003 to 2016 for all 370 cities in China. Over this period, average $PM_{2.5}$ concentrations ranged from 2 $\mu {\rm g}/{\rm m}^{3}$ to 101 $\mu {\rm g}/{\rm m}^{3}$, and had a mean of 39 $\mu {\rm g}/{\rm m}^{3}$ and a standard deviation of 20 $\mu {\rm g}/{\rm m}^{3}$.Footnote 3

2.2. CLDS

We use the information on migration from the China Labor-Force Dynamics Survey (CLDS).Footnote 4 The CLDS comprises a nationally representative sample of individuals, households, and districts/counties from 29 provinces and provincial-level municipalities. The survey adopts a multistage and stratified sampling strategy, with the 2282 districts/counties in the 29 provinces and provincial-level municipalities being the primary sampling units. The villages/neighborhoods in the randomly selected districts/counties are randomly divided into four groups, with each group being nationally representative. In each biennial wave, one of the four groups rotates out and another group rotates in.

We employ the 2014 and 2016 household-level CLDS sample to investigate the effect of air pollution in the origin on out-migration. We specify two out-migration types. In the first type, some family members were left behind. We derive the migration status of each family member from the survey question: ‘Is this person currently living at home?’ We describe people not living at home as migrants exclusively if they went away long-term for work. Among the people living away from home, only those living in a district/county different from their home are deemed migrants.Footnote 5 The essential information regarding the family members away from home, such as the migrants, was gathered from a family member living at home during the survey. Because this type of out-migration occurred and was recorded in the 2014 and 2016 sample households, we use the panel component of the 2014 and 2016 CLDS, i.e., 7744 households that were interviewed in both survey years. Fig. A3 in the online appendix depicts the locations of cities represented by this panel. As reported in table 1, households with migrants, on average, had more children and less income compared to households without migrants. On average, the household heads in the households with migrants were also older, less educated, and more likely to have rural Hukou.

Table 1. Summary statistics

Notes: Standard deviations are in parentheses; standard errors are in square brackets.

In merging the air quality data with the CLDS, we measure the air quality that each CLDS household experienced in each wave of the survey by the average $PM_{2.5}$ concentration of that calendar year of the city where the household resided. Nevertheless, no household could respond to the air quality after the interview dates in that calendar year. Since most CLDS surveys were conducted between July and August of each survey year, we collect air quality data for the month between May 13th and June 12th of each survey year, and let their average be an alternative measure of air quality before the interview dates in each survey year. In the online appendix, we discuss the robustness of our results by adopting this alternative measure of air quality.

Whole-household migration is the second type of out-migration. Since the CLDS only conducted face-to-face interviews at the location where a household was first selected, the households that moved away entirely did not appear in subsequent waves of the survey. Thus, the households that moved away entirely from 2014 to 2016 were only observed in the 2014 CLDS but not in the 2016 CLDS. In each survey year, a quarter of the CLDS households rotated out of the sample by design, and the rotation was performed at the village/neighborhood level.Footnote 6 Since the CLDS includes the village/neighborhood identifier, we can identify whether a household rotated out in 2016 by determining whether the village/neighborhood was unobserved in the 2016 CLDS. Using this method, we find that, from the 2014 CLDS, 24.8 per cent of the households rotated out in 2016. Another 20.3 per cent of households in the 2014 CLDS dropped out in 2016, because the CLDS could not track these households, and we consider these households as having experienced whole-household migration. Nevertheless, it is possible that elderly people who were alive in 2014 were more likely to die by 2016 given an increase in air pollution, and we might mistake households that consisted only of left-behind elderly people in 2014 and disappeared from the sample in 2016 due to the death of the elderly people for households that had whole-household out-migrated. In the online appendix, we discuss the potential bias resulting from this likelihood, and offer a bound for this bias.
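The classification rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name, the dictionary layout, and the status labels are hypothetical, and the sketch simply treats a 2014 household as rotated out when its village/neighborhood identifier is absent from the 2016 wave, and as a whole-household out-migrant when the village/neighborhood is present but the household is not.

```python
def classify_2014_households(hh_2014, hh_2016):
    """Classify each 2014 household by its status in the 2016 wave.

    hh_2014, hh_2016: dicts mapping household id -> village/neighborhood id
    (hypothetical layout chosen for this sketch).
    """
    villages_2016 = set(hh_2016.values())
    status = {}
    for hh_id, village in hh_2014.items():
        if village not in villages_2016:
            # The whole village/neighborhood is unobserved in 2016:
            # the household left the sample by design.
            status[hh_id] = "rotated_out"
        elif hh_id in hh_2016:
            # Re-interviewed, so it belongs to the 2014-2016 panel.
            status[hh_id] = "in_panel"
        else:
            # Village still sampled but household could not be tracked:
            # treated as whole-household out-migration.
            status[hh_id] = "whole_household_out_migration"
    return status


example_2014 = {"h1": "v1", "h2": "v1", "h3": "v2", "h4": "v3"}
example_2016 = {"h1": "v1", "h4": "v3", "h5": "v4"}
example_status = classify_2014_households(example_2014, example_2016)
```

In this toy input, village `v2` is missing from 2016, so `h3` is counted as rotated out, while `h2` disappears from a village that is still observed and is therefore counted as having moved away entirely.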

To study the effect of air pollution on location choice, we utilize the full sample of 23,594 individuals from the 2014 CLDS. In particular, the individual-level data of the 2014 CLDS comprise the complete migration history of each individual in the sample. The migration history includes the destination city, the year, and the primary reason for each migration episode. We employ this information to retrospectively construct a panel of individual location choices for each year from 2003 to 2010. Most people (98 per cent) never migrated from 2003 to 2010; 2 per cent of individuals migrated at least once, and 0.3 per cent migrated more than once. For individuals who moved more than once in a given year, we let their final location after the last move be their location in that year.
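A minimal sketch of how such a retrospective panel could be assembled from a migration history is given below. The function name, the `(year, destination_city)` tuple layout, and the `initial_city` argument (the location before the first recorded move) are assumptions made for illustration; the only rule taken from the text is that, when several moves fall in the same year, the location after the last move is used.

```python
def build_location_panel(initial_city, moves, years=range(2003, 2011)):
    """Return {year: city} for one individual.

    initial_city: city of residence before the first recorded move
                  (hypothetical input for this sketch).
    moves: list of (year, destination_city) tuples in the order they occurred.
    """
    # sorted() is stable, so moves within the same year keep their order
    # and the last one in that year determines the year's location.
    ordered = sorted(moves, key=lambda move: move[0])
    panel = {}
    current = initial_city
    idx = 0
    for year in years:
        # Apply every move that happened up to and including this year.
        while idx < len(ordered) and ordered[idx][0] <= year:
            current = ordered[idx][1]
            idx += 1
        panel[year] = current
    return panel


example_panel = build_location_panel(
    "A", [(2005, "B"), (2005, "C"), (2008, "D")]
)
```

Here the individual is recorded in city `A` until 2004, in `C` (the final location after two moves in 2005) from 2005 to 2007, and in `D` from 2008 onward.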

2.3. Control variables

Air pollution is endogenous to local economic activities. Besides the IV strategy to be discussed in the next section, we control for per capita GDP, gross industrial output, and the unemployment rate, three socio-economic variables derived from the China City Statistical Yearbook, in our linear model. Furthermore, since weather conditions may determine air pollution and independently affect location choice, as a robustness check in the online appendix, we also control for the annual averages of mean, maximum, and minimum temperature, dew point, precipitation, and wind speed, all collected from the National Oceanic and Atmospheric Administration.

3. Methods

3.1. Out-migration

In the first part of our analysis, we estimate a linear probability model to study the effect of air pollution in the origin on out-migration. For partial out-migration, we estimate the following equation on the panel component of the 2014 and 2016 CLDS households,

\begin{equation} SentMigrant_{ict}=\beta_{1}AveragePM2.5_{ct}+\beta_{2}X_{ct}+ \beta_{3}W_{ict}+\mu_{i}+\delta_{t}+\epsilon_{ict} \tag{1}\end{equation}

where $i$ stands for a household, $c$ stands for the city, and $t$ stands for the year. $SentMigrant_{ict}$ is an indicator for having a migrant in household $i$ in year $t$. $AveragePM2.5_{ct}$ is the average $PM_{2.5}$ concentration in year $t$ of city $c$ in which household $i$ was located. $X_{ct}$ is a vector of city-level socio-economic controls, including per capita GDP, gross industrial output, and the unemployment rate. $W_{ict}$ is a vector of household-level controls, including age, years of education, and Hukou of the household head, as well as total family income. $\mu _{i}$ is the household fixed effect; $\delta _{t}$ is the year fixed effect, with 2014 being the base year. $\epsilon _{ict}$ is the error term. Since the CLDS does not track the households that moved between 2014 and 2016, all households in the panel of the 2014 and 2016 CLDS households were stationary between these two survey years. Thus, the household fixed effect also removes the location-specific time-invariant characteristics. Standard errors are clustered at the household level to allow for serial correlation in the error term.
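To make the role of the household and year fixed effects in equation (1) concrete, the following sketch implements a plain within (fixed-effects) estimator with NumPy. It is an illustration only: the function name and the toy data are hypothetical, the controls $X_{ct}$ and $W_{ict}$ are omitted, and neither sampling weights nor clustered standard errors are handled. It assumes at least two survey years.

```python
import numpy as np


def within_estimator(y, x, unit_ids, time_ids):
    """Slope on x after removing unit fixed effects and year dummies."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    unit_ids = np.asarray(unit_ids)
    time_ids = np.asarray(time_ids)

    # Year dummies, omitting the first (base) year.
    times = np.unique(time_ids)
    dummies = [(time_ids == t).astype(float) for t in times[1:]]
    regressors = np.column_stack([x] + dummies)

    # Subtract unit-specific means (the within transformation).
    _, inverse = np.unique(unit_ids, return_inverse=True)
    counts = np.bincount(inverse)

    def demean(v):
        means = np.bincount(inverse, weights=v) / counts
        return v - means[inverse]

    y_d = demean(y)
    r_d = np.column_stack(
        [demean(regressors[:, k]) for k in range(regressors.shape[1])]
    )
    coef, *_ = np.linalg.lstsq(r_d, y_d, rcond=None)
    return coef[0]


# Noiseless toy panel: three households observed in 2014 and 2016.
units = np.array([0, 0, 1, 1, 2, 2])
years = np.array([2014, 2016, 2014, 2016, 2014, 2016])
pm = np.array([1.0, 2.0, 0.5, 0.1, 3.0, 3.5])
mu = np.array([0.1, 0.1, 0.3, 0.3, -0.2, -0.2])   # household effects
delta = np.where(years == 2016, 0.05, 0.0)        # year effect
outcome = 0.2 * pm + mu + delta
beta_hat = within_estimator(outcome, pm, units, years)
```

Because the toy outcome is generated without noise, `beta_hat` recovers the slope of 0.2 up to floating-point error even though the household effects differ. With only two waves, this within estimator coincides with a regression in first differences.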

For whole-household migration, we estimate the following equation on a cross-section of the 2014 CLDS households that were not due to rotate out in 2016,

\begin{equation} OutMigrate_{ict}=\beta_{0}+\beta_{1}\Delta PM2.5_{ct}+\beta_{2}\Delta X_{ct}+\epsilon_{ict} \tag{2}\end{equation}

where $OutMigrate_{ict}$ is an indicator taking the value of one, if household $i$ migrated out between 2014 and 2016; $\Delta PM2.5_{ct}$ is the change in average $PM_{2.5}$ concentrations of city $c$ between 2014 and 2016; and $\Delta X_{ct}$ is the change in city-level socio-economic controls between 2014 and 2016.

3.2. Identification strategy

We aim to address a potential source of endogeneity: air pollution is endogenous to local economic activities. Many unobserved economic factors may affect both air pollution and migration. For example, a city experiencing a negative economic shock and closing its polluting factories may have a decline in air pollution but an outflow of workers previously employed by the factories. This potential source of endogeneity can cause the ordinary least squares (OLS) coefficient estimates for the average $PM_{2.5}$ concentrations in equation (1) and the change in average $PM_{2.5}$ concentrations in equation (2) to be downward biased.
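The direction of this bias follows from the textbook omitted-variable formula. As a stylized illustration (the symbols $S_{ct}$ and $\gamma$ are introduced here for exposition and do not appear in the paper's equations), suppose out-migration depends on pollution and on an unobserved negative economic shock $S_{ct}$ with coefficient $\gamma$. Ignoring the other controls, the OLS estimate of $\beta_{1}$ then converges to

\begin{equation*} \text{plim}\,\hat{\beta}_{1}^{OLS}=\beta_{1}+\gamma\,\frac{Cov(AveragePM2.5_{ct},S_{ct})}{Var(AveragePM2.5_{ct})}. \end{equation*}

In the factory-closure example, the shock raises out-migration ($\gamma>0$) and lowers pollution ($Cov(AveragePM2.5_{ct},S_{ct})<0$), so the second term is negative and OLS understates $\beta_{1}$.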

To address this concern, we instrument for the average $PM_{2.5}$ concentrations using the air pollution from distant sources carried by the wind. Bayer et al. (2009) were the first to employ air pollution from distant sources as an instrument for local air pollution. They use a detailed source-receptor matrix developed for the United States Environmental Protection Agency that relates emissions from nearly 6000 sources to particulate matter with a diameter of less than 10 micrometers ($PM_{10}$) in each county in the U.S. to determine the marginal willingness to pay for clean air in the U.S. Using this matrix, they can calculate how much the pollution sources more than 80 km away from a county contributed to the $PM_{10}$ levels in that county. Zheng et al. (2014) and Barwick et al. (2018) later employ a similar IV strategy based on air pollution from distant sources, the former to study the long-term effect of air pollution on China's housing prices, and the latter to study the short-term effect of air pollution on healthcare expenditure in China. Since the same source-receptor matrix used in Bayer et al. (2009) does not exist in China, we adopt the formulation of the instrument from Zheng et al. (2014),

\begin{equation} NEIGHBOR_{it}=\sum_{j}{w_{ij}\cdot smoke\:emission_{jt}\cdot e^{-d_{ij}}} ,\;d_{ij}>80\,{\rm km}, \tag{3}\end{equation}

where $NEIGHBOR_{it}$ is air pollution from distant sources for receiving city $i$ in year $t$, and thus the instrument; $w_{ij}$ is a dummy variable taking the value of one if source city $j$ is located in the prevailing wind direction of receiving city $i$; $smoke\:emission_{jt}$ is city $j$'s emission level in year $t$; $d_{ij}$ is the distance between city $i$ and city $j$; and $e^{-d_{ij}}$ is a continuous, exponentially decreasing function of distance, so the weight declines as the distance between city $j$ and city $i$ increases.
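Equation (3) can be computed for all receiving cities in one year with a few array operations. The sketch below is illustrative rather than the authors' implementation: the function and argument names are hypothetical, and it assumes that $d_{ij}$ and the exclusion distance are expressed in the same unit (for example, both in degrees, with 80 km converted at roughly 110 km per degree).

```python
import numpy as np


def neighbor_instrument(emissions, dist, downwind, min_dist):
    """Distance-weighted emissions from distant, wind-aligned source cities.

    emissions: shape (J,), emission level of each source city in year t.
    dist:      shape (I, J), distance d_ij between receiver i and source j.
    downwind:  shape (I, J), 0/1 dummy w_ij for the prevailing wind direction.
    min_dist:  exclusion distance, in the same unit as dist (assumption).
    """
    emissions = np.asarray(emissions, dtype=float)
    dist = np.asarray(dist, dtype=float)
    w = np.asarray(downwind, dtype=float)

    far_enough = dist > min_dist            # drop sources inside the radius
    weights = w * far_enough * np.exp(-dist)
    return weights @ emissions              # one value per receiving city


toy_emissions = [10.0, 20.0, 30.0]
toy_dist = [[0.0, 0.5, 2.0],
            [0.5, 0.0, 1.0]]
toy_wind = [[0, 1, 1],
            [1, 0, 0]]
toy_instrument = neighbor_instrument(toy_emissions, toy_dist, toy_wind, 0.73)
```

In the toy input, the first receiving city only picks up the third source (aligned with the wind and beyond the exclusion distance), weighted by $e^{-2}$; the second receiving city has no qualifying source, so its instrument value is zero.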

In constructing this instrument, we conduct the following procedure. First, we obtain data on the prevailing wind direction, defined as the most frequent wind direction from 1981 to 2010, at each of all 1156 monitoring stations across China from the China Meteorological Data Service Center. Wind could take 16 different directions, with each direction spanning 22.5 degrees. When two or more wind monitoring stations exist in a receiving city, and the prevailing wind directions differ, we take the most common prevailing wind direction to be the prevailing wind direction of this receiving city. In the online appendix, we provide a falsification test by rotating this prevailing wind direction clockwise by 90 degrees. Second, we gather data on the soot (or dust) emission levels for 290 Chinese cities in each year from 2003 to 2010, 2014, and 2016 from the China City Statistical Yearbook. The amount of soot (or dust) emitted by a city in a given year represents the emission level of that city in that year. Third, we measure the distance $d_{ij}$ by the great circle distance given in degrees of longitude, and determine the direction of one city toward another city using the Haversine formula.
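The geometric ingredients of this procedure can be sketched with the standard library alone. The haversine function below returns the great-circle distance as a central angle in degrees. The initial-bearing formula and the mapping of a bearing to one of 16 sectors of 22.5 degrees are standard spherical-geometry formulas, but using them to decide whether a source city lies in the prevailing wind direction is an assumption of this sketch, as are all function names and the convention that sector 0 is centered on north.

```python
import math


def haversine_deg(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, as an angle in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Guard against rounding slightly above 1 before taking asin.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def initial_bearing(lat1, lon1, lat2, lon2):
    """Compass bearing (0 = north, 90 = east) from point 1 toward point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


def wind_sector(bearing):
    """Map a bearing to one of 16 sectors of 22.5 degrees (0 = north)."""
    return int(((bearing + 11.25) % 360.0) // 22.5)


dist_one_degree = haversine_deg(0.0, 0.0, 0.0, 1.0)
sector_east = wind_sector(initial_bearing(0.0, 0.0, 0.0, 1.0))
sector_north = wind_sector(initial_bearing(0.0, 0.0, 1.0, 0.0))
```

Under these conventions, a source city would be flagged with $w_{ij}=1$ when the sector of the bearing from receiving city $i$ to source city $j$ matches the sector of the prevailing wind at $i$; whether the wind direction is recorded as the direction the wind blows from is something to check against the data documentation.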

The remaining step is to choose the exclusion distance within which the emissions do not count toward air pollution from distant sources. An ideal distance would correlate the instrument with local $PM_{2.5}$ concentrations but not with local economic activities. Increasing this distance would weaken both correlations, and decreasing this distance would strengthen both. To select the exclusion distance that makes the instrument satisfy both relevance and the exclusion restriction, we summarize the correlation between air quality measures, including instruments using 50 km, 80 km, and 120 km as the exclusion distances, and observable local economic activities variables collected from the China City Statistical Yearbook in table A1 of the online appendix. One of the observable local economic activities variables, gross industrial output, is correlated with average $PM_{2.5}$ concentrations, but no observable local economic activities variable is correlated with air pollution from distant sources. We choose 80 km as the exclusion distance in our baseline estimates, because it is the safest and because the first stage is still strong. This choice of 80 km as the exclusion distance is consistent with that in Bayer et al. (2009). We show in the online appendix that our results are robust to choosing alternative exclusion distances.

The crucial assumption behind this identification strategy is that local economic activities do not affect air pollution emissions beyond 80 km, although they are allowed to affect air pollution emissions within 80 km. This implies that the economic activities of a potential destination within 80 km may be correlated with both local air pollution and out-migration, and thus pose a challenge for meeting the exclusion restriction. To address this concern, we calculate the migration distance of each of the 536 migration episodes in the individual-level retrospective panel from 2003 to 2010. The migration distance has a mean of 5.4 degrees (590 km), a median of 3.3 degrees (360 km), and a mode of 1 degree (110 km). Due to confidentiality concerns of the CLDS, we can only identify the migration distance from the origin city to the destination city, and not at the more granular county level. Nevertheless, we can still rule out migration to the nearest city as the dominant type of migration. As a result, there is little reason to suggest that an individual's migration motives were substantially driven by the economic activities of a potential destination within 80 km.

Table 2 reports the first stage estimated on the panel component of the 2014 and 2016 household-level CLDS. The first stage is strong, with an F-statistic of 844. The average $PM_{2.5}$ concentrations and air pollution from distant sources are both standardized to z-scores with a mean of zero and a standard deviation of one. As expected, average $PM_{2.5}$ concentrations were increasing in air pollution from distant sources. On average, a one standard deviation increase in air pollution from distant sources was associated with a 0.14 standard deviation increase in the average $PM_{2.5}$ concentrations. Thus, around 14 per cent of local air pollution came from distant sources.

Table 2. The effect of air pollution in the origin on partial out-migration

Notes: The IV regression uses air pollution from distant sources as the instrument for average $PM_{2.5}$ concentrations. The average $PM_{2.5}$ concentrations and air pollution from distant sources are normalized to z-scores. All regressions have per capita GDP, gross industrial output, and the unemployment rate as city-level controls and age, years of education, and Hukou of the household head, as well as total family income as household-level controls. Both column (2) and column (3) include household fixed effect and year fixed effect. All regressions apply sampling weights. Standard errors are clustered at the household level. Standard errors are in parentheses.

3.3. Location choice

For the second part of our analysis, we estimate a conditional logit model (McFadden, 1974) to explore the effect of air pollution on location choice. This part of our analysis allows us to exploit the long retrospective timespan of the $PM_{2.5}$ data. The advantage of the choice model is that it captures the relative characteristics of a place, thus allowing for a role of both the origin and the destination. With this model, the identification comes from the revealed preference of the individuals over locations with varied levels of air pollution. That is, an observation is at the individual level, and the levels of air pollution in the cities an individual chose or did not choose play a more prominent role than which city the individual finally chose.

A set of 124 cities was represented by the final locations of individuals in the 2014 CLDS sample, but a more extensive set of 214 cities was represented by the previous locations of these individuals, as these individuals might have come from cities outside of the former set. Whether an individual previously chose outside the former set of 124 cities is less important in the conditional logit model than it would be in a gravity-type model, where the migration effect of air pollution is identified off pairwise city-level migration flows. For this reason, we opt for the conditional logit model instead of a gravity-type model. The conditional logit model assumes that the error term follows an i.i.d. type-I extreme value distribution. Because the error terms are assumed to be independent, the model also assumes independence of irrelevant alternatives. In particular, the error terms for close-by locations are assumed to be uncorrelated with one another. We estimate the following equation:

\begin{equation} \begin{aligned} U_{ijt} & =\alpha LocationPM2.5_{ijt}+\beta Current_{ijt}+\gamma Current_{ijt}\\ & \quad\times LocationPM2.5_{ijt}+\phi Distance_{ijt}+\nu _{ijt}, \end{aligned} \tag{4} \end{equation}

where $U_{ijt}$ is individual $i$'s utility of choosing city $j$ in year $t$; $LocationPM2.5_{ijt}$ is the average $PM_{2.5}$ concentration, standardized to a z-score, of city $j$ in year $t$; $Current_{ijt}$ is a dummy for whether individual $i$ was located in city $j$ in year $t-1$; $Distance_{ijt}$ is the distance in degrees of longitude between city $j$ and the city where individual $i$ was located in year $t-1$; and $\nu _{ijt}$ is the error term.

A marginal effect is the product of a coefficient estimate, the probability of choosing the city in which a change occurred, and the probability of choosing the destination. When multiplied by the probability of choosing a destination where an individual was not located the year before, and one minus this probability, $\alpha$ corresponds to the effect of the average $PM_{2.5}$ concentration in the destination on the probability that the city was chosen that year, and thus the effect of air pollution in the destination on in-migration. Similarly, $\beta$ relates to (and has the same sign as) the probability that an individual stayed where he/she was the year before. $\alpha +\gamma$ relates to (and has the same sign as) the effect of the average $PM_{2.5}$ concentration in a city where an individual currently lived on the probability that the person stayed where he/she was. Since staying is the opposite of leaving, the product of $\alpha +\gamma$, the probability of choosing the origin, and one minus this probability, becomes the additive inverse of the effect of the average $PM_{2.5}$ concentrations in the origin on out-migration. We control for the distance between the potential destination and the city where an individual currently lived as a proxy for migration cost. We expect that $\alpha <0$, $\beta >0$, and $\phi <0$. It is possible that an individual knew more about the air quality in his/her current location, and the uncertainty regarding the air pollution in the destination implies that a migrant might not always end up in a less polluted city. Hence, an individual might be more responsive to the air pollution in his/her current location; this would imply $\gamma <0$.

Migration can be viewed as a two-step process: first, an individual decides to leave a city, and second, that individual chooses where to go. The factors influencing the individual's decision to leave a city may not be the same as those influencing the individual's decision on where to go. The existing literature has found that an increase in air pollution in a city causes a decline in net in-migration (Bayer et al., 2009; Khanna et al., 2021), an increase in net out-migration (Chen et al., 2022), and a decline in gross in-migration (Li and Zhang, 2019; Sun et al., 2019; Chen et al., 2022), but net population flows reflect both gross population inflows and gross population outflows. If an increase in air pollution led to a decline in gross in-migration, it would imply a decline in net in-migration, holding gross out-migration fixed, as observed in the literature.

Nevertheless, it does not necessarily imply that gross out-migration has increased in response to the increase in air pollution. This is because, even if air pollution determined a destination once an individual had already decided to migrate, air pollution might not have caused out-migration. This is possible if the preferences of an individual changed when he/she acquired more information after having decided to leave a city. This is also possible if an individual only considered the levels of air pollution in his/her destination relative to those in his/her origin; in particular, an individual became accustomed to the air pollution in his/her current city, but moving to a more polluted city demanded costly adaptation, so the individual would only move to an equally or less polluted city.

The inclusion of the interaction term between $Current_{ijt}$ and $LocationPM2.5_{ijt}$ enables us to determine whether a decline in net in-migration in response to an increase in air pollution was caused by both a decline in gross in-migration and an increase in gross out-migration. This distinction is our contribution to the existing literature. On the one hand, if $\alpha <0$ and $\alpha +\gamma <0$, the decline in net in-migration in response to an increase in air pollution would have been driven by both a decline in gross in-migration and an increase in gross out-migration. On the other hand, if $\alpha <0$ but $\alpha +\gamma =0$, the decline in net in-migration would have been driven exclusively by a decline in gross in-migration, and no increase in gross out-migration would occur. In this case, air pollution did not cause out-migration, but only determined a destination once an individual had already decided to migrate.

Since the average $PM_{2.5}$ concentrations are endogenous to local economic activities, $\hat {\alpha }$ may be upward-biased (in contrast to the downward bias in the linear model of out-migration). In addition, since an individual may be more familiar with the economic opportunities of his/her current location, $\hat {\gamma }$ may also be upward-biased. To instrument for $LocationPM2.5_{ijt}$ and $Current_{ijt}\times LocationPM2.5_{ijt}$, we estimate equation (4) via GMM. Following Train (2009: 326), we derive the following moment condition,

\[ \sum_{i,t}{\sum_{j}{(Y_{ijt}-\mathbb{P}(Y_{ijt}=1|X_{ijt}))Z_{ijt}=0,}} \]

where $Y_{ijt}$ is 1 if individual $i$ was located in city $j$ in year $t$ and 0 otherwise; $\mathbb{P}(Y_{ijt}=1|X_{ijt})$ is the conditional probability that individual $i$ was located in city $j$ in year $t$; $X_{ijt}$ are the regressors in equation (4) including $LocationPM2.5_{ijt}$, $Current_{ijt}$, $Current_{ijt}\times LocationPM2.5_{ijt}$, and $Distance_{ijt}$; and $Z_{ijt}$ is the instrument. The moment condition has an intuitive construct: the observed mean of the instrument (i.e., $\sum _{i,t}{\sum _{j}{ Y_{ijt}Z_{ijt}}}$) equals the mean predicted by the model (i.e., $\sum _{i,t}{ \sum _{j}{\mathbb{P}(Y_{ijt}=1|X_{ijt})Z_{ijt}}}$). Under the assumption that the error term follows an i.i.d. type-I extreme value distribution, $\mathbb{P}(Y_{ijt}=1|X_{ijt})$ has a closed-form solution (McFadden, 1974):

\[ \mathbb{P}(Y_{ijt}=1|X_{ijt})=\frac{e^{\pi ^{\prime }X_{ijt}}}{\sum_{k}{ e^{\pi ^{\prime }X_{ikt}}}}. \]

As in the analysis of out-migration, we instrument for $LocationPM2.5_{ijt}$ using air pollution from distant sources, and instrument for $Current_{ijt}\times LocationPM2.5_{ijt}$ using the interaction term between $Current_{ijt}$ and air pollution from distant sources, while $Current_{ijt}$ and $Distance_{ijt}$ serve as their own instruments. Thus, the model is exactly identified.
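The closed-form choice probability and the moment condition above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the estimation code used in the paper: the function names and the nested-list data layout are hypothetical, and no solver is included.

```python
import math


def choice_probs(X, pi):
    """Conditional logit probabilities for one person-year.

    X: list of J rows (one per candidate city), each a list of K regressors.
    pi: list of K coefficients.
    """
    v = [sum(p * x for p, x in zip(pi, row)) for row in X]
    m = max(v)  # subtracting the max leaves the probabilities unchanged
    e = [math.exp(u - m) for u in v]
    total = sum(e)
    return [u / total for u in e]


def gmm_moments(X_all, Z_all, Y_all, pi):
    """Sample moments: sum over person-years and cities of (Y - P) * Z.

    X_all, Z_all: lists of person-years, each a J-by-K nested list.
    Y_all: list of person-years, each a one-hot list of length J.
    """
    g = [0.0] * len(Z_all[0][0])
    for X, Z, y in zip(X_all, Z_all, Y_all):
        p = choice_probs(X, pi)
        for j, z_row in enumerate(Z):
            resid = y[j] - p[j]
            for k, z in enumerate(z_row):
                g[k] += resid * z
    return g
```

Because the model is exactly identified, estimation amounts to finding the coefficient vector at which every element of `gmm_moments` is zero. When `Z_all` is set equal to `X_all`, these moments are the gradient of the conditional logit log-likelihood, so the uninstrumented GMM and maximum likelihood coefficient estimates coincide.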

To compare the results from the conditional logit model with the previous results from the linear model in Sec. 3.1, we translate the coefficient estimates into marginal effects. The effect of the average $PM_{2.5}$ concentration in a city where an individual currently lived on the probability that he/she stayed there is conveniently given as:

\begin{equation} MarginalEffect=(\alpha +\gamma )\mathbb{P}(Y_{ijt}=1|X_{ijt})[1-\mathbb{P}(Y_{ijt}=1|X_{ijt})]. \tag{5} \end{equation}

This marginal effect depends on $\mathbb {P}(Y_{ijt}=1|X_{ijt})$, i.e., the probability that individual $i$ chose his/her current city $j$ again in year $t$. The additive inverse of this marginal effect is the effect of air pollution in the origin on out-migration, comparable to the effect in the linear model in Sec. 3.1.
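As a rough illustration of equation (5), the sketch below evaluates the marginal effect on staying and its additive inverse. The function names are hypothetical, and the coefficient values and staying probability are made up for illustration; they are not estimates from this paper.

```python
def stay_marginal_effect(alpha, gamma, p_stay):
    """Equation (5): effect of origin PM2.5 (in z-score units) on the probability of staying."""
    return (alpha + gamma) * p_stay * (1.0 - p_stay)


def out_migration_effect(alpha, gamma, p_stay):
    """Additive inverse of equation (5): effect of origin PM2.5 on out-migration."""
    return -stay_marginal_effect(alpha, gamma, p_stay)


# Hypothetical inputs for illustration only (not taken from Table 5).
effect = out_migration_effect(alpha=-0.1, gamma=-0.2, p_stay=0.95)
```

With these made-up inputs the effect is about 0.014, that is, roughly a 1.4 percentage point rise in the probability of leaving per one standard deviation increase in origin $PM_{2.5}$. Because the effect scales with $\mathbb{P}(1-\mathbb{P})$, it shrinks as the probability of staying approaches one.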

4. Results

4.1. Out-migration

Table 2 reports the results for the effect of the average $PM_{2.5}$ concentrations in the origin on the probability that a household had a migrant. The OLS result in column (1) suggests that $PM_{2.5}$ concentrations in the origin were not associated with the probability of having a migrant. A downward bias may be expected, because local economic activities were positively correlated with air pollution but negatively correlated with partial out-migration. For example, a city experiencing a negative economic shock and closing its polluting factories might experience decreased air pollution but an outflow of workers previously employed by the factories. Column (2) shows the result after adding household and year fixed effects. The fixed effects absorb all the household-specific time-invariant and time-specific household-invariant characteristics, and partially correct the potential downward bias of the coefficient estimate by absorbing the between-city variations in unobserved local economic activities.

Column (3) reports the IV result. The IV regression also includes household fixed effect and year fixed effect. Furthermore, it corrects this downward bias caused by the within-city and within-year changes in unobserved local economic activities that were correlated with both changes in local air pollution and changes in partial out-migration. The coefficient estimate implies that a one standard deviation increase in the average $PM_{2.5}$ concentrations increased the probability of having a migrant by 20 percentage points. Given that the mean probability of having a migrant is 28 per cent, the 20 percentage point increase, as suggested by the IV result, is sizable.

It is noteworthy, however, that this interpretation refers to one standard deviation of the levels in air pollution and not one standard deviation of the changes in air pollution, so a one standard deviation increase in air pollution is very large. In fact, given the two-way fixed effects setup, the change in air pollution of the 118 cities in our sample from 2014 to 2016 relative to the overall changes in air pollution across cities ranges from -0.62 to 0.78 standard deviation of air pollution levels. Only 11 per cent of the sample cities experienced a change in partial out-migration relative to the overall changes in partial out-migration greater than 10 percentage points due to changes in air pollution from 2014 to 2016. Thus, the magnitude of the effect we identify is not unreasonably large.

In China, a county is a smaller geographical unit than a city, with a city typically comprising several districts and counties. Given that the sampling of the CLDS is conducted at the district/county level, our coefficient of interest has the interpretation of the effect of air pollution in the origin city on partial out-migration. For confidentiality reasons, the CLDS does not provide any geographic identifier within a city, thus precluding an analysis at a more granular level. Since county-level air pollution may be measured with error by the air pollution in the city where the county is located, the classical errors-in-variables model predicts a potential attenuation bias. In particular, the effect we estimate would be biased toward zero.

Nevertheless, the magnitude of the effects we find is comparable to that in the existing literature. Given that the average household in the CLDS sample had 4.2 people, and since each household with out-migrants had 2.2 out-migrants on average, the 20 percentage point increase in the probability of having a migrant approximately means that the population declined through partial out-migration by 11 per cent in a city with a one standard deviation increase in $PM_{2.5}$ concentrations. Chen et al. (2022) find that a one standard deviation increase in $PM_{2.5}$ concentrations decreased the population in any given county through net out-migration by 15 per cent. Thus, the effect we find is similar in magnitude to that of Chen et al. (2022).

Table 3 reports the results for the effect of the average $PM_{2.5}$ concentrations in the origin on the probability that a household moved away entirely. We estimate these regressions on a cross section of the 2014 CLDS households not due to rotate out in 2016, and thus do not include any fixed effect. Column (1) shows the OLS results, which suggest no effect of changes in air pollution on whole-household out-migration. A downward bias is similarly expected, since local economic activities are correlated with both air pollution in the origin and whole-household out-migration. Column (2) reports the IV results with changes in air pollution from distant sources as the instrument. The coefficient estimate is positive but still statistically insignificant, so we find no evidence that whole-household out-migration responded to air pollution.

Table 3. The effect of air pollution in the origin on whole-household out-migration

Notes: Both regressions are estimated on a cross-section of the 2014 CLDS households not due to rotate out in 2016, and use changes in air pollution from distant sources from 2014 to 2016 as the instrument for changes in average $PM_{2.5}$ concentrations from 2014 to 2016. The average $PM_{2.5}$ concentrations and air pollution from distant sources are normalized to z-scores. Both regressions have the changes in per capita GDP, gross industrial output, and the unemployment rate as controls. Neither column includes any fixed effect. All regressions apply sampling weights. Standard errors are in parentheses.

Two reasons might explain why partial out-migration could be more responsive to air pollution than whole-household out-migration. First, the strong attachment to their origin would make partial out-migrants more likely to be circular migrants, and circular migrants might possess more information concerning a reduction in air pollution in their origins. Second, people of different ages benefit differently from moving away from air pollution. For example, due to the restrictions on migration under the Hukou system, China's migrants are known to leave their children behind. Nevertheless, children are more vulnerable to air pollution, and parents may want to invest in their children's health capital. Thus, migrating parents may place more weight on the air pollution in their origin when deciding whether or not to bring their children with them. At the same time, older people might be more used to the changes in air pollution in their current locations, and thus might stay behind despite an increase in air pollution.

Table 4 tests this latter hypothesis. Although we do not observe the location where each migrant lived, we follow the age of each member of the origin household. Thus, we estimate equation (1) with a child or an elderly person being present in the origin household as the dependent variables. We find that a one standard deviation increase in the average $PM_{2.5}$ concentrations reduced the probability that a child was present in the origin household by 8 percentage points. Against a mean of 43 per cent, this translates into a 19 per cent decrease ($8/43\approx 0.19$), which is very sizable. In contrast, air pollution in the origin did not change the probability that an elderly person was present in the origin household. Thus, at least some of the pollution-induced migrants brought their children with them, but some aging parents were left behind.

Table 4. The effect of air pollution in the origin - who were left behind

Notes: A child is defined as a person under 18 years of age, and an elderly person is defined as a person over 60 years of age. Both regressions use air pollution from distant sources as the instrument for average $PM_{2.5}$ concentrations. The average $PM_{2.5}$ concentrations and air pollution from distant sources are normalized to z-scores. Both regressions have per capita GDP, gross industrial output, and the unemployment rate as city-level controls and age, years of education, and Hukou of the household head, as well as total family income as household-level controls. All regressions include household fixed effect and year fixed effect. Standard errors are clustered at the household level. Standard errors are in parentheses.

While Graves and Waldman (1991), Sun et al. (2019), and Chen et al. (2022) study heterogeneity by age groups, we expand the scope of this analysis to include children under the age of 15, an age group that is potentially more vulnerable to air pollution. Our results align with the finding of Chen et al. (2022) that young working-age individuals were more responsive to air pollution, but differ from the finding of Sun et al. (2019) that the sensitivity to air pollution increased with age. This discrepancy may be because the analysis in Sun et al. (2019) was conducted at the individual level, while ours is at the household level. It is possible that air pollution only induced one elderly person who was more vulnerable to air pollution within a household to migrate, but his/her spouse stayed behind.

4.2. Location choice

Table 5 reports the results from estimating the conditional logit model in equation (4). Because the marginal effects depend on the probability of choosing the city where a change in a covariate occurred as well as the probability of choosing the destination, we report the more succinct coefficient estimates instead. The coefficient estimates are informative by themselves because, given a change in a covariate in a city, the signs of the coefficient estimates align with the signs of the marginal effects on choosing a destination.

Table 5. Coefficient estimates of conditional logit model

Notes: The sample is a retrospective panel from 2003 to 2010 constructed from the 2014 individual-level CLDS. Each year, the individuals chose among 214 cities, determined by the cities that all individuals in the sample chose across all sample years. Location $PM_{2.5}$ and the instrument, air pollution from distant sources, are standardized to z-scores. Standard errors are in parentheses.

Column (1) reports the results estimated via maximum likelihood estimation (MLE) without instrumenting for $LocationPM2.5_{ijt}$ and $Current_{ijt}\times LocationPM2.5_{ijt}$. We find that the coefficient estimate for $LocationPM2.5_{ijt}$ is negative and statistically significant, indicating that migrants not currently living in a city were less likely to choose the city if it had more air pollution. The coefficient estimate for $Current_{ijt}$ is positive and statistically significant, suggesting that most people stayed where they were the year before. The coefficient estimate of $Current_{ijt}\times LocationPM2.5_{ijt}$ is positive and significant. This means that air pollution had a weaker deterrent effect in the current city, which may be due to the endogeneity of $LocationPM2.5_{ijt}$; in particular, a person might be more familiar with the economic opportunities of his/her current location, making him/her appear less sensitive to the air pollution there. The coefficient estimate of $Distance_{ijt}$ is negative and statistically significant, indicating that people were less likely to choose a faraway city, which presumably involved higher migration costs.

We then estimate the conditional logit model via GMM. Column (2) reports the results without instrumenting for $LocationPM2.5_{ijt}$ and $Current_{ijt}\times LocationPM2.5_{ijt}$. The coefficient estimates we obtain via GMM are the same as those obtained via MLE. The standard errors are different because, as in most cases, the GMM estimator is not efficient. Column (3) reports the results after we instrument for $LocationPM2.5_{ijt}$ with air pollution from distant sources and $Current_{ijt}\times LocationPM2.5_{ijt}$ with the interaction term between $Current_{ijt}$ and air pollution from distant sources. The coefficient estimate for $LocationPM2.5_{ijt}$ is slightly more negative, suggesting that a slight upward bias would exist if we did not instrument for $LocationPM2.5_{ijt}$. The upward bias is expected, because local economic activities were positively correlated with both air pollution and in-migration. The negative coefficient estimate suggests that more air pollution in the destination did lead to less gross in-migration.

The coefficient estimate for the interaction term, $Current_{ijt}\times LocationPM2.5_{ijt}$, is also more negative, albeit statistically insignificant. It implies that, without instrumenting for $Current_{ijt}\times LocationPM2.5_{ijt}$, an upward bias may exist, because the person might be more familiar with the economic opportunities of his/her current location. Furthermore, if air pollution did not cause out-migration but ‘only’ determined a destination once an individual had already decided to migrate, we should only observe a negative effect of air pollution in the destination on gross in-migration (i.e., $\hat {\alpha }<0$), and we should not observe a positive effect of air pollution in the origin on gross out-migration (i.e., $\hat {\alpha }+\hat {\gamma }=0$). Nevertheless, our estimates suggest that $\hat {\alpha }+\hat {\gamma }<0$, and the p-value for testing the hypothesis that $\hat {\alpha }+\hat {\gamma }=0$ is 0.04. This implies that more air pollution in the origin resulted in more gross out-migration. Moreover, figure A4 in the online appendix compares the effect of air pollution in the origin on out-migration between the linear model and the conditional logit model, and finds that the magnitudes are similar.

In addition, the ratio between the coefficient estimate for $LocationPM2.5_{ijt}$ and that for $Distance_{ijt}$ is three. Since the ratio between parameters is independent of scale, we interpret this ratio as the distance that an individual was willing to sacrifice to avoid a one standard deviation increase in location $PM_{2.5}$ concentrations. Thus, to relocate to a city with one standard deviation less air pollution, an individual was willing to migrate 3 degrees farther away, which is roughly equivalent to 320 km, or around a quarter of the distance between Beijing and Shanghai.
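The reasoning behind this interpretation can be written out as a short worked step. It is a sketch that assumes a potential destination other than the current city ($Current_{ijt}=0$) and holds the utility in equation (4) constant:

```latex
\[
\Delta U_{ijt}=\alpha \,\Delta LocationPM2.5_{ijt}+\phi \,\Delta Distance_{ijt}=0
\quad \Longrightarrow \quad
\Delta Distance_{ijt}=-\frac{\alpha }{\phi }\,\Delta LocationPM2.5_{ijt}.
\]
```

With $\alpha /\phi =3$ and a one standard deviation reduction in air pollution ($\Delta LocationPM2.5_{ijt}=-1$), the offsetting change is $\Delta Distance_{ijt}=3$ degrees. Since both $\alpha$ and $\phi$ are negative, the ratio is positive, and any common scaling of the coefficients cancels.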

4.3. Counterfactual

To illustrate the role migration played in reducing the exposure to air pollution, we use the conditional logit model to predict the number of out-migrants when the average $PM_{2.5}$ concentration of everyone's current location hypothetically increased by one standard deviation (or 18.6 $\mu {\rm g}/{\rm m}^{3}$), a large shock. Notably, whether we increase a city's air pollution is person-specific. For example, if Person 1 was located in City A, for Person 1, we would only increase City A's air pollution and not City B's, even though City B was someone else's current location. To conduct the simulation, we perform the following procedure. We simulate choices using observable data, increase the average $PM_{2.5}$ concentration of everyone's current location by one standard deviation, and then simulate migration. After we scale the predicted number of migrants up by the population of China, we estimate that the increase in air pollution would induce 24 million migrants to move away from their current cities.
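The logic of this counterfactual can be sketched as follows. This is a simplified illustration, not the procedure used in the paper: it sums changes in predicted staying probabilities instead of simulating individual choices, and the function names, data layout, and the optional population scaling are assumptions.

```python
import math


def choice_probs(utilities):
    """Conditional logit probabilities from a list of city utilities."""
    m = max(utilities)
    expu = [math.exp(u - m) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]


def stay_prob(pm, current, distance, coef, shock=0.0):
    """Probability that one person stays in his/her current city.

    pm, current, distance: per-city lists matching the regressors of equation (4);
    current is 1 for last year's city and 0 otherwise.
    shock: z-score units added to PM2.5 in this person's current city only.
    """
    alpha, beta, gamma, phi = coef
    utilities = []
    for pm_j, cur_j, dist_j in zip(pm, current, distance):
        pm_j = pm_j + shock * cur_j  # person-specific shock to the current city
        utilities.append(alpha * pm_j + beta * cur_j + gamma * cur_j * pm_j + phi * dist_j)
    probs = choice_probs(utilities)
    return sum(p * c for p, c in zip(probs, current))


def extra_out_migrants(people, coef, shock=1.0, population=None):
    """Expected additional leavers in the sample, optionally scaled to a population."""
    extra = sum(
        stay_prob(pm, cur, dist, coef) - stay_prob(pm, cur, dist, coef, shock)
        for pm, cur, dist in people
    )
    if population is not None:
        extra *= population / len(people)
    return extra
```

Applying the shock only where `current` equals 1 reflects the person-specific nature of the exercise. If $\alpha +\gamma =0$, the shock leaves the utility of the current city unchanged and the function returns zero additional out-migrants.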

This simulation also predicts the change in each individual's exposure to air pollution. We calculate that the observed level of migration would have reduced each person's exposure to the increase in the average $PM_{2.5}$ concentrations by 0.06 standard deviation, or 6 per cent of the presumed increase in the average $PM_{2.5}$ concentration of the person's current location. Since a 10 $\mu {\rm g}/{\rm m}^{3}$ increase in $PM_{2.5}$ concentrations has been shown to decrease life expectancy by 0.61 year (Pope III et al., 2009), we calculate that migration would have extended each person's life expectancy by 0.07 year, or 3.5 weeks.
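The arithmetic behind these figures can be reproduced from the numbers given in the text, taking 52 weeks per year as an assumed conversion:

```latex
\[
0.06\times 18.6\ \mu {\rm g}/{\rm m}^{3}\approx 1.12\ \mu {\rm g}/{\rm m}^{3},
\qquad
1.12\times \frac{0.61\ \text{year}}{10\ \mu {\rm g}/{\rm m}^{3}}\approx 0.07\ \text{year},
\qquad
0.07\ \text{year}\times 52\approx 3.5\ \text{weeks}.
\]
```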

5. Conclusions

In this paper, we have shown that migration could mitigate the health impacts of air pollution. This implies that, if residents did not have the option to migrate, they would choose alternative ways to cope with air pollution, such as wearing face masks, purchasing air filters, or pressuring local authorities to curb air pollution. These alternatives might pose a higher cost to the residents and society. Our study is consistent with the existing literature in suggesting that improving its air quality can allow a city to attract more migrants, but our findings also reveal three other conclusions with policy implications.

First, we have contrasted partial out-migration with whole-household out-migration. We found that a one standard deviation increase in the average $PM_{2.5}$ concentrations increased the probability of having a migrant by 20 percentage points, and air pollution either did not affect or slightly increased the probability that the household moved away. Nevertheless, our results are suggestive at best. On the one hand, as shown in the online appendix, the coefficient estimate for partial out-migration is only statistically significant at the 10 per cent level, when standard errors are clustered at the city level. The rise in standard errors is presumably due to the lack of within-city-year variation in air pollution. On the other hand, as discussed in the online appendix, in constructing our air pollution measure, we assume that all households responded to the air pollution in the same time period, but the air pollution each household experienced may depend on the date on which the household was interviewed. This measurement error would bias the coefficient estimate for air pollution and the t-statistic towards zero, so the true migration response to air pollution could be larger than what we estimate.

Despite these shortcomings, our results for partial out-migration align with Chen et al. (2022), but we have also suggested the differential response to air pollution by age as a potential reason why partial out-migration was more responsive to air pollution than whole-household out-migration. In particular, we have found that a one standard deviation increase in the average $PM_{2.5}$ concentrations reduced the probability that a child was present in the origin household by 8 percentage points. That is, air pollution might have affected the household decision to bring the children with the migrants. In this way, our results suggest that air pollution can change the spatial distribution of the population by age through migration. Areas with more air pollution would experience an outflow not only of working-age adults but also of children, who in turn might have less attachment to their origin when they came of age. In contrast, areas with less air pollution would experience an influx of children and growing pressure to provide them with local public goods, including education and healthcare. Since air pollution did not affect the probability of an elderly person being present in the origin household, air pollution might have caused family separation through partial migration, which might have imposed psychological costs on the migrants and the family members left behind.

Second, we have found that air pollution not only determined a destination once an individual had already decided to migrate, but also caused out-migration. That is, the decline in net in-migration as a result of an increase in air pollution was driven by both a decline in gross in-migration and an increase in gross out-migration. This implies that cities wishing to retain talent can do so by improving their air quality.

Finally, even though we cannot predict whether people would respond more to air pollution if the government lowered the barriers to migration, our findings indicate that the existing migration caused by air pollution is welfare-improving compared to having no one move. Specifically, the conditional logit model predicts that a hypothetical one standard deviation increase in the average $PM_{2.5}$ concentration in each person's current location would induce 24 million people in China to move away from their current locations. Based on the model's prediction, the observed level of migration would reduce each person's exposure to this increase in air pollution by 6 per cent.
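To make the mechanics of such a prediction concrete, the following sketch computes conditional logit choice probabilities (McFadden, 1974) for one person choosing among three locations, before and after a one-unit increase in standardized $PM_{2.5}$ at the current location. It is an illustration only: the utility specification, the coefficient `beta_pm`, the `stay_bonus` term standing in for moving costs, and the pollution values are all hypothetical and are not the estimates reported in Table 5.

```python
import math


def choice_probabilities(utilities):
    """Conditional logit probabilities: exp(V_j) / sum_k exp(V_k)."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(utilities)
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def location_utilities(pm, current, beta_pm, stay_bonus):
    """Hypothetical utility: pollution disutility plus a bonus for staying."""
    return [
        beta_pm * level + (stay_bonus if j == current else 0.0)
        for j, level in enumerate(pm)
    ]


# Hypothetical inputs, chosen only for illustration.
pm = [1.0, 0.5, 0.0]   # standardized PM2.5 in three locations
current = 0            # index of the person's current location
beta_pm = -0.3         # assumed disutility per standard deviation of PM2.5
stay_bonus = 3.0       # assumed utility of not moving (proxy for moving costs)

before = choice_probabilities(
    location_utilities(pm, current, beta_pm, stay_bonus)
)

# Raise pollution in the current location by one standard deviation.
pm_after = list(pm)
pm_after[current] += 1.0
after = choice_probabilities(
    location_utilities(pm_after, current, beta_pm, stay_bonus)
)

# Increase in the probability of leaving the current location.
extra_leavers = before[current] - after[current]
print(f"P(stay) before: {before[current]:.3f}")
print(f"P(stay) after:  {after[current]:.3f}")
print(f"increase in P(move away): {extra_leavers:.3f}")
```

Summing the change in the probability of leaving across all individuals, each with their own current location and choice set, gives a predicted number of additional movers of the kind quoted above.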

To sum up, we conclude that the current level of migration due to air pollution was beneficial to residents compared to having no one move, but air pollution also caused some damage to families or regions through migration. This observation reinforces the importance of controlling air pollution in regions where it is the most severe. The extent to which a country accomplishes this goal will determine the role migration plays in distributing environmental amenities for individuals and promoting sustainable development for regions across the country.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/S1355770X22000377

Acknowledgments

The author thanks Abigail Wozniak, Daniel Hungerman and Christopher Cronin for their guidance and support; and seminar participants of the University of Notre Dame and Tianjin University, conference participants of the Australasian Meeting of the Econometric Society and the AASLE conference, and two anonymous referees, for useful comments.

Conflict of interest

The author declares none.

Footnotes

1 In 2003, the annual average $PM_{2.5}$ concentration in the U.S. was 12 $\mu {\rm g}/{\rm m}^{3}$ (US Environmental Protection Agency, 2004), compared to 36 $\mu {\rm g}/{\rm m}^{3}$ in China calculated using our sample data, which will be discussed in Sec. 2.

2 See, for example, Topel (1991) for the related accumulation of job-specific human capital.

3 Figures A1 and A2 in the online appendix demonstrate the spatial distribution of annual average $PM_{2.5}$ concentrations across cities in China in 2014 and 2016, respectively.

4 Data used in this paper are from the CLDS conducted by the Center for Social Science Survey at Sun Yat-sen University in Guangzhou, China. The opinions are the author's alone. Readers can refer to http://css.sysu.edu.cn for more information about the CLDS data.

5 Although the CLDS records whether an individual lived in the same district/county as their home, it does not provide any information on whether the individual lived in the same city as their home.

6 There are almost always 35 households in each village/neighborhood. For each attrited household not due to rotate out, the CLDS supplements the household with another household randomly selected from the same village/neighborhood.

References

Andrews, SQ (2008) Inconsistencies in air quality metrics: ‘blue sky’ days and PM10 concentrations in Beijing. Environmental Research Letters 3, 034009.
Banzhaf, HS and Walsh, RP (2008) Do people vote with their feet? An empirical test of Tiebout. American Economic Review 98, 843–863.
Barwick, PJ, Li, S, Rao, D and Zahur, NB (2018) The Healthcare Cost of Air Pollution: Evidence from the World's Largest Payment Network. Technical report, National Bureau of Economic Research.
Bayer, P, Keohane, N and Timmins, C (2009) Migration and hedonic valuation: the case of air quality. Journal of Environmental Economics and Management 58, 1–14.
Chay, KY and Greenstone, M (2003) The impact of air pollution on infant mortality: evidence from geographic variation in pollution shocks induced by a recession. The Quarterly Journal of Economics 118, 1121–1167.
Chen, S, Chen, Y, Lei, Z and Tan-Soo, J-S (2021) Chasing clean air: pollution-induced travels in China. Journal of the Association of Environmental and Resource Economists 8, 59–89.
Chen, S, Oliva, P and Zhang, P (2022) The effect of air pollution on migration: evidence from China. Journal of Development Economics 15, 102833.
Chen, Y, Jin, GZ, Kumar, N and Shi, G (2012) Gaming in air pollution data? Lessons from China. The BE Journal of Economic Analysis & Policy 13, 3227. doi:10.1515/1935-1682.3227
Ghanem, D and Zhang, J (2014) ‘Effortless perfection’: do Chinese cities manipulate air pollution data? Journal of Environmental Economics and Management 68, 203–225.
Graves, PE and Waldman, DM (1991) Multimarket amenity compensation and the behavior of the elderly. American Economic Review 81, 1374–1381.
Hammer, MS, van Donkelaar, A, Li, C, Lyapustin, A, Sayer, AM, Hsu, NC, Levy, RC, Garay, MJ, Kalashnikova, OV, Kahn, RA, Brauer, M, Apte, JS, Henze, DK, Zhang, L, Zhang, Q, Ford, B, Pierce, JR and Martin, RV (2020) Global estimates and long-term trends of fine particulate matter concentrations (1998–2018). Environmental Science & Technology 54, 7879–7890.
Hu, F, Xu, Z and Chen, Y (2011) Circular migration, or permanent stay? Evidence from China's rural–urban migration. China Economic Review 22, 64–74.
Khanna, G, Liang, W, Mobarak, AM and Song, R (2021) The Productivity Consequences of Pollution-Induced Migration in China. Technical report, National Bureau of Economic Research.
Kumar, N, Chu, AD, Foster, AD, Peters, T and Willis, R (2011) Satellite remote sensing for developing time and space resolved estimates of ambient particulate in Cleveland, OH. Aerosol Science and Technology 45, 1090–1108.
Li, M and Zhang, Y (2019) The migration effect of air pollution – a study based on the choice of university cities for international students in China. Economic Research Journal 54, 168–182.
Luo, Y, Yang, J and Chen, S (2019) Air pollution, human capital flow and innovative vitality – evidence from individual patent inventions. China Industrial Economics 10, 99–117.
McFadden, D (1974) Conditional logit analysis of qualitative choice behavior. In Zarembka P (ed.), Frontiers in Econometrics. New York: Academic Press, pp. 105–142.
Pope III, CA, Ezzati, M and Dockery, DW (2009) Fine-particulate air pollution and life expectancy in the United States. New England Journal of Medicine 360, 376–386.
Qin, Y and Zhu, H (2018) Run away? Air pollution and emigration interests in China. Journal of Population Economics 31, 235–266.
Sullivan, DM (2016) Residential sorting and the incidence of local public goods: theory and evidence from air pollution. Resources for the Future Working Paper.
Sun, W, Zhang, X and Zheng, S (2019) Air pollution and spatial mobility of labor force: study on the migrant's job location choice. Economic Research Journal 54, 102–117.
Topel, R (1991) Specific capital, mobility, and wages: wages rise with job seniority. Journal of Political Economy 99, 145–176.
Train, KE (2009) Discrete Choice Methods with Simulation. Cambridge, UK: Cambridge University Press.
US Environmental Protection Agency (2004) The particle pollution report. Current understanding of air quality and emissions through 2003. US Environmental Protection Agency. Research Triangle Park, NC.
Van Donkelaar, A, Martin, RV, Li, C and Burnett, RT (2019) Regional estimates of chemical composition of fine particulate matter using a combined geoscience-statistical method with information from satellites, models, and monitors. Environmental Science & Technology 53, 2595–2611.
Wang, Z, Ma, J, Zhang, B and Wang, B (2021) Air pollution and residential migration: empirical evidence from smart meter data. Management World 3, 19–33.
World Health Organization (2016) Ambient (Outdoor) Air Pollution Database. Geneva, Switzerland: WHO.
Zheng, S, Cao, J, Kahn, ME and Sun, C (2014) Real estate valuation and cross-boundary air pollution externalities: evidence from Chinese cities. The Journal of Real Estate Finance and Economics 48, 398–414.
Zhou, M, Wang, H, Zeng, X, Yin, P, Zhu, J, Chen, W, Li, X, Wang, L, Wang, L, Liu, Y, Liu, J, Zhang, M, Qi, J, Yu, S, Afshin, A, Gakidou, E, Glenn, S, Krish, VS, Miller-Petrie, MK, Mountjoy-Venning, WC, Mullany, EC, Redford, SB, Liu, H, Naghavi, M, Hay, SI, Wang, L, Murray, CJL and Liang, X (2019) Mortality, morbidity, and risk factors in China and its provinces, 1990–2017: a systematic analysis for the global burden of disease study 2017. The Lancet 394, 1145–1158.
Table 1. Summary statistics

Table 2. The effect of air pollution in the origin on partial out-migration

Table 3. The effect of air pollution in the origin on whole-household out-migration

Table 4. The effect of air pollution in the origin - who were left behind

Table 5. Coefficient estimates of conditional logit model
