Emulating causal dose-response relations between air pollutants and mortality in the Medicare population

Abstract

Background

Fine particulate matter (PM2.5), ozone (O3), and nitrogen dioxide (NO2) are major air pollutants that pose considerable threats to human health. However, what has been mostly missing in air pollution epidemiology is causal dose-response (D-R) relations between those exposures and mortality. Such causal D-R relations can provide profound implications in predicting health impact at a target level of air pollution concentration.

Methods

Using the national Medicare cohort during 2000–2016, we simultaneously emulated causal D-R relations between chronic exposures to fine particulate matter (PM2.5), ozone (O3), and nitrogen dioxide (NO2) and all-cause mortality. To relax the contentious assumptions of inverse probability weighting for continuous exposures, including the distributional form of the exposure and homoscedasticity, we proposed a decile binning approach that divided each exposure into ten equal-sized groups by deciles, treated the lowest decile group as the reference, and estimated the effects for the other groups. Binning continuous exposures also makes the inverse probability weights robust against outliers.

Results

Assuming the causal framework was valid, we found that higher levels of PM2.5, O3, and NO2 were causally associated with greater risk of mortality and that PM2.5 posed the greatest risk. For PM2.5, the relative risk (RR) of mortality monotonically increased from the 2nd (RR, 1.022; 95% confidence interval [CI], 1.018–1.025) to the 10th decile group (RR, 1.207; 95% CI, 1.203–1.210); for O3, the RR increased from the 2nd (RR, 1.050; 95% CI, 1.047–1.053) to the 9th decile group (RR, 1.107; 95% CI, 1.104–1.110); for NO2, the D-R curve wiggled at low levels and started rising from the 6th (RR, 1.005; 95% CI, 1.002–1.018) till the highest decile group (RR, 1.024; 95% CI, 1.021–1.027).

Conclusions

This study provided more robust evidence of the causal relations between air pollution exposures and mortality. The emulated causal D-R relations provided significant implications for reviewing the national air quality standards, as they inferred the number of potential early deaths prevented if air pollutants were reduced to specific levels; for example, lowering each air pollutant concentration from the 70th to 60th percentiles would prevent 65,935 early deaths per year.

Introduction

Fine particulate matter (PM2.5), ozone (O3), and nitrogen dioxide (NO2) are major air pollutants that pose considerable threats to human health [1, 2]. Starting in the 1990s, a large body of epidemiological research has reported associations between chronic air pollution exposures and mortality, with PM2.5 and O3 being the most extensively studied components [3,4,5,6,7,8,9]. Chronic exposure to NO2 has also been associated with mortality, although the evidence is relatively scarce [10, 11]. These findings provide important implications for understanding the health burden attributable to poor air quality. In the United States, it is estimated that each 1 μg·m−3 increase in PM2.5 concentration is associated with over 30,000 deaths each year, equivalent to a loss of 0.13–0.15 years in national life expectancy [12].

The primary objective of epidemiology is to identify a causal connection between exposure and health outcome, thereby informing decisions on policy interventions [13]. For example, the United States Environmental Protection Agency (US EPA) reviews the National Ambient Air Quality Standards (NAAQS) periodically based on the cause-effect relationship that can be inferred from the best available science [14]. However, as observational studies, many air pollution epidemiological investigations, by nature, have been associational rather than causal [15]. Although a growing literature has examined the long-term effect of PM2.5 on mortality using the formal causal modeling techniques, there is so far little evidence for O3 and NO2 [16,17,18]. Indeed, O3 and NO2 have received less attention than they deserve; so far there is no standard for long-term O3 concentrations (only daily) and the standard for annual NO2 concentrations has remained the same for decades [19].

What has been mostly missing in air pollution epidemiology is the specific shapes of causal dose-response (D-R) relations between air pollution exposures and risk of mortality. Such causal D-R relations can provide profound implications in predicting the health impact at a target level of air pollution concentration [20]. Recently, a study of PM10 that explicitly used a formal causal modeling approach to estimate the D-R relationship found a higher mortality risk at low to moderate air pollution levels [21]. However, to date no such studies have been done for PM2.5, O3 or NO2. Specifying the causal D-R relationship, especially at very low levels, is critically important in measuring the risk of mortality induced directly by the change of air pollution level, thus supporting the potential revision of NAAQS in the US and, globally, the World Health Organization air quality guidelines [19, 22].

The present study analyzed 74 million Medicare beneficiaries in the contiguous US with 637 million person-years of follow-up from 2000 to 2016, which covers more than 95% of elders aged 65 years and older in the US who are considered to be most susceptible to air pollution [23]. The Medicare population also accounts for two-thirds of total mortality, allowing us to analyze most deaths induced by air pollution [1, 6]. By linking the annual averages of ambient PM2.5 and NO2 concentrations as well as warm-season (April–September) average of ambient O3 to the ZIP Codes of beneficiaries’ residence, we were able to have proxy measures of chronic exposures for each individual [24]. We proposed a decile binning approach which divided each exposure by deciles and predicted the inverse probability of being assigned to the observed group for each observation, adjusting for the other two concurrent exposures, personal characteristics, meteorological, socioeconomic, behavioral, and medical access variables, and long-term time trend. If propensity score models were correctly specified, we had constructed a valid counterfactual framework and thus estimated the causal D-R relations between chronic exposures to PM2.5, O3, and NO2 and the risk of all-cause mortality.

Methods

Mortality data

We obtained Medicare enrollment records for beneficiaries aged 65 years and above residing in the contiguous US between 2000 and 2016 from the Centers for Medicare and Medicaid Services, with all-cause mortality as the study outcome. For each beneficiary, we extracted their demographic information (sex, race, age at initial enrollment), Medicaid eligibility, ZIP Code of residence, year of initial enrollment, and year of death if it occurred during the study period. We constructed an open cohort with person-years of follow-up in which each beneficiary was followed each year from the study entry until the end of study, drop out of the cohort, or death, whichever occurred earliest. Note that the same data format has been used to fit time-varying Cox proportional hazard models [6].
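The person-year layout described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `expand_person_years`, the field names, and the representation of drop-out as a last observed year are all assumptions made for illustration.

```python
def expand_person_years(entry_year, last_year, death_year=None, study_end=2016):
    """Return one record per year of follow-up for a single beneficiary.

    Follow-up stops at the earliest of the study end, the last observed
    year (drop-out), or the year of death. All names are illustrative.
    """
    stop = min(study_end, last_year)
    if death_year is not None:
        stop = min(stop, death_year)
    return [
        {"year": year, "died": int(death_year is not None and year == death_year)}
        for year in range(entry_year, stop + 1)
    ]

# Hypothetical beneficiary: enters in 2012 and dies in 2014.
records = expand_person_years(entry_year=2012, last_year=2016, death_year=2014)
```

Each record can then carry the exposures and covariates for that ZIP Code and calendar year, which is what makes the exposures time-varying.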

Exposure assessment

The daily concentrations of ambient PM2.5, O3, and NO2 at 1 km × 1 km grid cells across the contiguous US were predicted and validated using hybrid models that ensembled predictions from random forest, gradient boosting, and neural network models. Multiple predictor variables were incorporated in the predictions, including ground monitoring data, satellite data, meteorological conditions, land-use variables, and chemical transport model simulations, with details published elsewhere [25,26,27].

These high-resolution predictions at 1 km × 1 km grid cells allowed us to estimate ZIP Code-level exposure levels with a high degree of accuracy, with annual R2 on held-out monitors of 0.89 for PM2.5, 0.86 for O3, and 0.84 for NO2. There are two major types of ZIP Codes in the US: standard ZIP Code and PO Box. Because a standard ZIP Code represents a delivery area, we used the polygon layer generated by Environmental Systems Research Institute (Esri) [28], and estimated the ZIP Code’s daily concentrations by averaging the predictions at grid cells whose centroid points were inside the polygon of that ZIP Code. For a PO Box, because it is used only for a given facility and therefore can be represented by a single point, we estimated its daily concentrations by linking it to the nearest grid cell.
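The two linkage rules (polygon averaging for standard ZIP Codes, nearest grid cell for PO Boxes) can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: coordinates are treated as planar, the point-in-polygon test is a basic ray-casting routine rather than a GIS library call, and all function names and toy values are hypothetical.

```python
import math

def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count the edges crossed by a horizontal ray starting at (x, y).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def standard_zip_mean(polygon, cells):
    """Average the predictions of grid cells whose centroids fall inside the polygon.

    cells: list of (x, y, value), with (x, y) the cell centroid.
    """
    values = [v for x, y, v in cells if point_in_polygon(x, y, polygon)]
    return sum(values) / len(values) if values else None

def po_box_value(px, py, cells):
    """Prediction at the grid cell whose centroid is nearest to the PO Box point."""
    return min(cells, key=lambda c: math.hypot(c[0] - px, c[1] - py))[2]

# Toy data in arbitrary planar units (hypothetical values).
cells = [(0.5, 0.5, 8.0), (1.5, 0.5, 10.0), (2.5, 0.5, 12.0)]
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
zip_mean = standard_zip_mean(square, cells)   # averages the first two cells
po_box = po_box_value(2.4, 0.6, cells)        # picks the third cell
```

In practice the same logic would be applied per day, and a projected coordinate system would be needed for the distance computation.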

The exposures of interest were assessed based on the ZIP Code-level estimates. For PM2.5 and NO2, we defined their chronic exposures as annual average concentrations. For O3, following previous literature [5, 6], we defined the chronic exposure as the average concentration during warm season (April–September) of the year. We assigned the chronic exposures to PM2.5, O3, and NO2 to each person-year based on that person’s ZIP Code of residence and calendar year.
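As a small illustration of these exposure definitions, the sketch below computes an annual mean and a warm-season (April–September) mean from a daily series. The function names and the synthetic daily values are assumptions for the example only.

```python
from datetime import date, timedelta

def annual_mean(daily):
    """Mean of all daily values; daily maps datetime.date -> concentration."""
    return sum(daily.values()) / len(daily)

def warm_season_mean(daily):
    """Mean over April through September (months 4 to 9) only."""
    warm = [v for d, v in daily.items() if 4 <= d.month <= 9]
    return sum(warm) / len(warm)

# Hypothetical daily O3 series for one ZIP Code in 2016:
# 50 ppb on warm-season days and 30 ppb on all other days.
start = date(2016, 1, 1)
daily_o3 = {}
for offset in range(366):  # 2016 is a leap year
    day = start + timedelta(days=offset)
    daily_o3[day] = 50.0 if 4 <= day.month <= 9 else 30.0
```

With this series, `warm_season_mean(daily_o3)` uses only the 183 warm-season days, whereas `annual_mean(daily_o3)` uses all 366 days.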

Covariate information

Meteorological variables including daily air temperature and humidity at 2 m above the ground were extracted from Phase 2 of the North American Land Data Assimilation System, with 12 km × 12 km resolution across the continental US [29]. The average temperatures during warm (April–September) and cold seasons (January–March plus October–December) of each year were calculated from the daily data because both exceedingly low and high temperatures were physically stressful and were also associated with air pollution levels [30, 31]. Annual average humidity was calculated from the daily data on a yearly basis. ZIP Code Tabulation Areas (ZCTA)-level socioeconomic variables, including the percentage of Blacks, percentage of Hispanics, median household income, median value of owner occupied housing, percentage of Americans aged 65 and older living below the poverty threshold, percentage of Americans with less than high school education, percentage of owner occupied housing units, and population density, etc., were obtained from 2000 and 2010 US Census and the American Community Survey [32]. These variables were linearly extrapolated by year to account for the time varying nature of socioeconomic status. County-level behavioral variables, including body mass index (BMI) and percentage of ever smokers for each year, were obtained from the Behavioral Risk Factor Surveillance System [33]. From the Dartmouth Atlas of Health Care [34], we obtained percentage of Medicare participants who had a hemoglobin A1c test, a low-density lipoprotein cholesterol (LDLC) test, a mammogram, and an eye exam to a primary care physician for each year in each hospital catchment area in the US and assigned it to all ZCTAs in that area. We also computed the distance from each ZIP Code centroid to the nearest hospital. These variables were linked to each person-year by ZIP Code of residence and calendar year. Summary statistics of the covariates are provided in Section 5 of Supplementary Information.
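The yearly values of the census-based variables can be pictured as points on a straight line through the available survey years. The sketch below is an assumption about how such a linear fill-in might look with two anchor years; the function name and the income figures are hypothetical, and the exact procedure used with the American Community Survey data is not specified here.

```python
def linear_by_year(year, year0, value0, year1, value1):
    """Value on the straight line through (year0, value0) and (year1, value1).

    Years between the anchors are interpolated; years outside are extrapolated.
    """
    slope = (value1 - value0) / (year1 - year0)
    return value0 + slope * (year - year0)

# Hypothetical ZCTA median household income at the two census years.
income_2005 = linear_by_year(2005, 2000, 40000.0, 2010, 50000.0)
income_2016 = linear_by_year(2016, 2000, 40000.0, 2010, 50000.0)
```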

A decile binning approach to emulate causal D-R relations

To emulate the causal D-R relationship, we need a counterfactual framework. For a binary exposure, the causal estimate in a population of interest comes from the difference between the counterfactual outcome under which all the members of the population had been exposed versus the counterfactual outcome had they not been exposed, thus no confounding occurs [35]. In randomized experiments, counterfactuals are constructed by randomly assigning individuals to treatment groups to ensure that exposure is independent of all potential confounders. In observational studies, however, the exposure assignment is not random but instead is considered to be influenced by subject characteristics, and causal methods seek ways to approximate counterfactuals with reference to the observed population [36]. Inverse probability weighting (IPW), for example, is a formal causal modeling technique and is increasingly being used in observational studies [37, 38]. For a binary exposure, it uses quasi-experimental design to construct a “pseudo-population” by weighting the population by the inverse probability of the observed exposure given all measured confounders. The “pseudo-population” is then used to estimate the exposure effect. If the systematic difference of characteristics among the exposed and unexposed is adequately adjusted so that the two groups are comparable with respect to any confounders, a causal conclusion is warranted [35, 39].

But air pollution exposure is continuous in nature. Estimating the inverse probability weights in the continuous setting is challenging as it needs to 1) correctly specify the distributional form of exposure, 2) deal with non-constant variance (heteroscedasticity), and 3) avoid excessively large or small weights for outliers that are more likely to occur [40, 41]. For these reasons, in this section we proposed a decile binning approach to emulate the causal D-R relations between chronic air pollution exposures and mortality by dividing each exposure into ten equal-sized groups by deciles, treating the lowest decile group (i.e., 10% of the study population with the lowest exposure levels) as the reference, and estimating the effects for the other groups compared to the reference. This relaxes the strong assumptions of distribution form and homoscedasticity for continuous exposure by relying solely on deciles. In addition, binning data makes the inverse probability weights robust against outliers [42].
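A minimal sketch of the binning step, assuming a simple rank-based split of unweighted observations, is given below. The function name `decile_groups` and the toy PM2.5 values are hypothetical; whether the actual deciles were computed on strata or on person-years is not addressed by this example.

```python
def decile_groups(values, n_bins=10):
    """Assign each value to one of n_bins equal-sized groups by rank.

    Returns group labels 1..n_bins, where 1 is the lowest-exposure group
    (the reference). Ties are broken by original order in this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, idx in enumerate(order):
        groups[idx] = rank * n_bins // len(values) + 1
    return groups

# Hypothetical PM2.5 values with one extreme outlier (250.0).
pm25 = [5.1, 6.0, 7.2, 7.9, 8.4, 9.0, 9.9, 10.5, 11.8, 12.6,
        13.0, 13.4, 14.1, 14.9, 15.5, 16.2, 17.0, 18.3, 19.9, 250.0]
groups = decile_groups(pm25)
```

The outlier simply lands in group 10 alongside the next-highest value, so its magnitude has no further influence on the weights. This is the sense in which binning is robust to outliers.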

We had a dataset with person-year representations of follow-up which allowed for time-varying exposures and covariates. To reduce the computational burden, first we aggregated the person-years with the same sex, race, age, Medicaid eligibility, living in the same ZIP Code of residence and in the same year. We treated them as a single record because those person-years had identical values for all exposures and covariates and thus could be treated interchangeably in the analysis. As a result, we retained all the information yet significantly reduced the size of the data in which each observation represented a stratum of combination of sex, race, age, and Medicaid eligibility per ZIP Code of residence per year. Numbers of deaths and person-years were cumulated for each stratum.
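The aggregation described above amounts to a group-by on the stratum key with summed deaths and person-years. The following sketch shows the idea on three toy records; the function name, field names, and record values are assumptions for illustration.

```python
from collections import defaultdict

def aggregate_strata(person_years):
    """Collapse person-year records that share the same stratum key.

    Each input record is a dict with keys sex, race, age, medicaid, zip,
    year, and died (0 or 1). Returns {key: {"deaths", "person_years"}}.
    """
    totals = defaultdict(lambda: {"deaths": 0, "person_years": 0})
    for rec in person_years:
        key = (rec["sex"], rec["race"], rec["age"],
               rec["medicaid"], rec["zip"], rec["year"])
        totals[key]["deaths"] += rec["died"]
        totals[key]["person_years"] += 1
    return dict(totals)

# Hypothetical person-year records.
rows = [
    {"sex": "F", "race": "white", "age": 70, "medicaid": 0, "zip": "02115", "year": 2010, "died": 0},
    {"sex": "F", "race": "white", "age": 70, "medicaid": 0, "zip": "02115", "year": 2010, "died": 1},
    {"sex": "M", "race": "white", "age": 70, "medicaid": 0, "zip": "02115", "year": 2010, "died": 0},
]
strata = aggregate_strata(rows)
```

Because exposures and area-level covariates are determined by ZIP Code and year, every record within a stratum shares the same values, so no information is lost.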

For each exposure, the analysis under the counterfactual framework was composed of two stages: a design stage where a randomized “pseudo-population” was constructed by weighting the observed population by the inverse probability of exposure given all measured confounders, and an analysis stage where the treatment effect was estimated among the constructed “pseudo-population” [43]. In the first stage, we binned the exposure into ten equal-sized categories based on deciles. The stabilized inverse probability weight (swij) for stratum j in exposure category i was defined as:

$$ {sw}_{ij}=\frac{P\left(X\in i\right)}{expit\left(g\left({X}_{ij};{n}_{ij}\ |\ \boldsymbol{C}\right)\right)} $$

where P(X ∈ i) denotes the probability of any observed exposure X being in group i, which equals 0.1; expit(∙) denotes the inverse logistic link function, where \( \operatorname{expit}(x)=\frac{\exp(x)}{1+\exp(x)} \); and g(∙) denotes the gradient boosting machine (GBM) model with a logistic loss function for predicting the probability of the observed categorized exposure Xij given the set of confounders C, weighted by nij, the number of person-years aggregated in the stratum. The use of GBM for estimating the probability of observed exposure has demonstrated better predictive accuracy than conventional logistic regression, as it captures nonlinearity and interactions of confounders and is unaffected by the potential autocorrelation [44, 45]. The confounder set C includes the other two concurrent exposures, calendar year, the individual-level variables (sex, race, 5-year age group, and Medicaid eligibility), and the area-level meteorological, socioeconomic, behavioral, and medical access variables as detailed in the previous section. The numerator, P(X ∈ i), is used to stabilize the variability of weights to avoid excessively upweighting or downweighting observations [40].
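To make the weight formula concrete, the sketch below evaluates it for a single stratum, taking the model output on the log-odds scale as given. It does not fit a GBM; the function names and the example log-odds values are assumptions for illustration.

```python
import math

def expit(x):
    """Inverse logistic link: exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + math.exp(-x))

def stabilized_weight(log_odds_observed, n_groups=10):
    """Stabilized inverse probability weight for one stratum.

    log_odds_observed: model output g(.) on the log-odds scale for the
    exposure group in which the stratum was actually observed (assumed
    to come from a separately fitted propensity model).
    The numerator is the marginal probability of any group, 1 / n_groups.
    """
    return (1.0 / n_groups) / expit(log_odds_observed)

# A stratum whose predicted probability of its observed decile is 0.1
# receives weight 1; a stratum with predicted probability 0.05 receives 2.
w_typical = stabilized_weight(math.log(0.1 / 0.9))
w_unlikely = stabilized_weight(math.log(0.05 / 0.95))
```

Strata observed in a decile that their confounders make unlikely are upweighted, while the numerator keeps the weights centered near 1.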

In the second stage of the analysis, for each exposure, we fitted a log-linear regression relating the number of deaths to the categorical exposure group, weighted by the stabilized inverse probability weights estimated in the first stage. We used a quasi-Poisson model with a log link to account for overdispersion in the number of deaths, and included an offset for the number of person-years to account for the different population size at risk in each stratum. As a result, we obtained the marginal effect of each decile group on mortality. If the model for estimating the stabilized inverse probability weight is correctly specified, we have achieved an unbiased estimator of the causal effect for each group [41].
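When the only covariate in a weighted log-linear Poisson model is the categorical exposure group and person-years enter as a log offset, the point estimates have a closed form: the weighted deaths divided by the weighted person-years in each group. The sketch below uses that shortcut instead of fitting a regression, so it reproduces only the relative risk point estimates, not the quasi-Poisson dispersion or confidence intervals. The function name and toy strata are hypothetical.

```python
def weighted_relative_risks(strata, reference=1):
    """Weighted mortality rate ratios by exposure group.

    strata: iterable of (group, deaths, person_years, weight).
    Each group's rate is its weighted deaths over weighted person-years;
    the relative risk is that rate divided by the reference group's rate.
    """
    deaths, pyears = {}, {}
    for group, d, py, w in strata:
        deaths[group] = deaths.get(group, 0.0) + w * d
        pyears[group] = pyears.get(group, 0.0) + w * py
    rates = {g: deaths[g] / pyears[g] for g in deaths}
    return {g: rates[g] / rates[reference] for g in rates}

# Hypothetical strata: (decile group, deaths, person-years, stabilized weight).
toy = [
    (1, 40, 1000, 1.0), (1, 10, 250, 2.0),
    (2, 45, 1000, 1.0), (2, 30, 500, 0.5),
]
rr = weighted_relative_risks(toy)
```

Here the weighted rate is 60/1500 in group 1 and 60/1250 in group 2, so the relative risk for group 2 against the reference is 1.2.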

The results are expressed as the relative risks (RR) of mortality for higher decile groups against the lowest-decile group (reference). The number of early deaths avoided by lowering the air pollutant concentration from a higher to a lower decile group can be calculated as \( N{\alpha}_0\left(\frac{RR_{high}-1}{RR_{high}}-\frac{RR_{low}-1}{RR_{low}}\right) \), where N is the annual average number of person-years, α0 is the baseline annual mortality rate, and RRhigh and RRlow are the relative risks of mortality for the higher decile group and the lower decile group, respectively. More details are provided in Section 4 of Supplementary Information.
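The formula above translates directly into a short function. In the sketch below, the two RR values are the PM2.5 estimates for the 10th and 2nd decile groups quoted in the abstract, while N and the baseline mortality rate are hypothetical round numbers chosen only to exercise the calculation; the function name is likewise an assumption.

```python
def early_deaths_avoided(n_person_years, baseline_rate, rr_high, rr_low):
    """N * alpha0 * ((RR_high - 1) / RR_high - (RR_low - 1) / RR_low)."""
    fraction_high = (rr_high - 1.0) / rr_high
    fraction_low = (rr_low - 1.0) / rr_low
    return n_person_years * baseline_rate * (fraction_high - fraction_low)

# N = 1,000,000 person-years and alpha0 = 0.05 are hypothetical inputs.
avoided = early_deaths_avoided(1_000_000, 0.05, rr_high=1.207, rr_low=1.022)
```

Note that the bracketed term simplifies algebraically to 1/RRlow − 1/RRhigh, so the result is zero when the two groups have the same relative risk.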

We assessed the robustness of the causal dose-response relations between the chronic air pollution exposures and mortality risk by conducting sensitivity analysis of splitting each exposure into 14 bins.

Results

The demographic characteristics of the national Medicare cohort during 2000–2016 are summarized in Table 1. The cohort included 74,537,533 Medicare beneficiaries with a total of 637,207,589 person-years of follow-up. The average follow-up time was 8.5 years. Among them, 30,209,831 deaths occurred, accounting for 40.5% of the population. The cohort comprised more females (55.4%), mostly whites (84.0%), and mostly aged 65–74 years when entering the cohort (78.4%). Over 13 million beneficiaries were ever enrolled in the Medicaid program, accounting for 18.5% of the population.
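The average follow-up time and the share of deaths follow directly from the cohort totals reported above, as the short check below shows. The variable names are illustrative.

```python
beneficiaries = 74_537_533
person_years = 637_207_589
deaths = 30_209_831

# Average follow-up is total person-years divided by the number of beneficiaries.
mean_follow_up_years = person_years / beneficiaries

# Share of beneficiaries who died during follow-up, in percent.
death_share_percent = 100 * deaths / beneficiaries
```

Rounded to one decimal place, these give 8.5 years and 40.5%, matching the figures in the text.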

Table 1 Demographic characteristics of Medicare cohort, 2000–2016

Maps of the contiguous US with annual PM2.5, warm-season (April–September) O3, and annual NO2 concentrations at ZIP Codes of the Medicare beneficiaries’ residence in 2016 are presented in Fig. 1. The PM2.5 concentration was higher in most central and eastern states and the Central Valley of California, and was lower in the northeast US and mountainous region. The warm-season O3 concentration was highest in the mountainous region and California. The NO2 concentration was higher in populous cities and along major highways. Over the years 2000–2016, the annual PM2.5 concentration at ZIP Codes averaged 9.85 μg·m−3, the warm-season O3 averaged 39.34 ppb, and the annual NO2 averaged 17.30 ppb (Table 2).

Fig. 1 Maps of the contiguous US with annual PM2.5, warm-season O3, and annual NO2 concentrations at ZIP Code level in 2016

Table 2 Summary statistics for annual PM2.5, warm-season O3, and annual NO2 concentrations, 2000–2016

Figure 2 presents the estimated causal D-R relations between chronic exposures to PM2.5, O3, NO2 and the RR of mortality. The exposure concentration corresponding to each effect estimate represents the average concentration within the decile group. The dose-response relationship between chronic exposure to PM2.5 and mortality was monotonic and approximately linear, with higher concentration levels associated with greater risk of mortality. Specifically, the RRs of mortality associated with chronic exposure to PM2.5 ranged from 1.022 [95% confidence interval (CI), 1.018–1.025] at 6.60 μg·m−3 (the 2nd decile group) to 1.207 (95% CI, 1.203–1.210) at 15.47 μg·m−3 (the 10th decile group). For O3, the risk of mortality monotonically increased from the 2nd (RR, 1.050; 95% CI, 1.047–1.053) to the 9th decile group (RR, 1.107; 95% CI, 1.104–1.110), and dropped at the highest decile group (RR, 1.044; 95% CI, 1.041–1.048). For NO2, the dose-response curve wiggled at low levels and started rising from the 6th decile group (RR, 1.005; 95% CI, 1.002–1.018) till the highest decile group (RR, 1.024; 95% CI, 1.021–1.027). Importantly, the risk of mortality associated with chronic PM2.5 exposure was substantially larger than those with O3 and NO2; the highest RR for PM2.5 was greater than those for O3 and NO2. The entire dose-response relationship for NO2 occurred at concentrations below the national standard of 53 ppb, and most of the PM2.5 relationship was also below the standard of 12 μg·m−3 [19]. There is no long-term standard for O3. All the numerical results are provided in Section 1 of Supplementary Information.

Fig. 2 Causal dose-response relations between chronic exposures to PM2.5, O3, NO2 and the relative risk of mortality. The relative risks of mortality and 95% CIs are shown for higher decile groups against the lowest decile group (reference). The exposure concentration corresponding to each effect estimate represents the average concentration within the decile group. Vertical lines represent current national air quality standards for annual PM2.5 (12 μg·m−3) and annual NO2 (53 ppb). There is no long-term standard for O3.

The dose-response relations remained robust after splitting each exposure into 14 bins, with details provided in Section 2 and Section 3 of Supplementary Information.

Discussion

We proposed a decile binning approach to simultaneously emulate the D-R relations between chronic exposures to major air pollutants and mortality in a general and susceptible older population. Assuming that the IPW models were correctly specified and the counterfactual framework was valid, the D-R curves revealed that in general, higher levels of PM2.5, O3, and NO2 were causally associated with a greater risk of mortality. Compared with previous associational D-R curves [3, 5,6,7], the causal D-R curves essentially infer the number of potential lives saved if air pollution concentrations were reduced to target levels. For example, lowering each air pollutant concentration from the 70th to 60th percentiles would prevent 65,935 early deaths among elders per year (Section 4 of Supplementary Information), and this is a substantial public health benefit.

A major advance of the present study is that we simultaneously evaluated PM2.5, O3, and NO2, which allowed us to mutually adjust their confounding and also to directly compare their health impacts. We found that PM2.5 had a substantially larger effect on mortality than O3 and NO2. The finding confirmed previously published results suggesting that PM2.5 is the most deadly air pollutant and that chronic exposure to PM2.5 is of greater public health concern [18]. The increasing patterns of the D-R relations for PM2.5 and NO2 at levels below the current NAAQS suggest the necessity of more stringent national air quality standards for the protection of public health. Currently the NAAQS lack regulation for long-term O3, and clearly the daily standard has not reduced the warm-season average to concentrations with no mortality association [46]. Our results support the argument for establishing a warm-season O3 standard. The lower risk of mortality for the highest decile group for O3 may suggest that the O3 effect was represented by traffic exhausts such as nitrogen oxides and volatile organic compounds, as they play important roles in O3 actions and are highly reactive at extreme levels. However, further investigations are needed to address this question.

The causal conclusions of this study depend on the key assumption of correct IPW model specification. The validity of this assumption is not testable and relies on outside information. To minimize confounding bias, we adjusted for any known possible confounders such as concurrent air pollutants, Medicaid eligibility (proxy for individual’s low socioeconomic status), and seasonal temperatures and humidity (important physical stressors and determinants of air pollution [30]), etc. We also adjusted for area-level confounders of socioeconomic status, ethnicity, smoking status, obesity, population density, access to medical care, and calendar year (to capture unmeasured confounders that had a temporal scale of variation). In predicting the inverse probability of being assigned to the observed exposure decile given the set of confounders, GBM adaptively captured any nonlinearity and interactions and was unaffected by the potential autocorrelation [44]. Although residual confounding can never be ruled out, the consistent dose-response relationships for PM2.5 and O3 obtained across different study designs and populations provide some reassurance that our causal estimates are not substantially biased [3, 5,6,7].

As we have noted, the proposed decile binning approach relaxed assumptions on data distribution and homoscedasticity when constructing inverse probability weights for continuous exposures. Ambient air pollution concentrations usually follow a heteroscedastic distribution possibly with long tails, which results in excessively upweighted observations. To fix this issue, Naimi et al. proposed a quantile binning approach in which they estimated weights for the binned exposure and then treated those bins as continuous and linear, and found that it outperformed other IPW estimators with various parametric forms of the exposure distribution [42]. Adopting their idea, our approach further relaxed the linear assumption by categorizing bins and comparing the effect of each bin to the reference group. If the assumption of correct IPW model specification holds, the estimated effect of each bin is an asymptotically unbiased estimator of the true causal effect [41]. Further, the marginal effect estimates, our estimand of interest, do not depend upon the distributions of confounders and have arguably greater public health relevance because many confounders might not be measurable at decision time. The marginal effect estimates are also more useful when depicting the dose-response relationship for the purpose of understanding the total effect [20].

Assigning ambient air pollution concentrations at ZIP Codes as a marker of individual exposure levels may result in measurement error. Although measuring more personal exposures can overcome this limitation, it also introduces confounders that are difficult to control, such as personal behaviors, which may affect personal exposure measurement directly but do not affect ambient air pollution estimation. In addition, personal exposure measurements can be compromised by the study outcome and thus are also more vulnerable to reverse causation [24]. For example, patients who die from chronic obstructive pulmonary disease (COPD) generally spend less time outdoors [47]; because ambient air pollutants are filtered by the building envelope and deposit on indoor surfaces, there are lower concentrations of those ambient pollutants indoors [48]. Hence, those patients have lower levels of personal exposures. By contrast, under the null assumption, COPD mortality is not associated with ambient concentration predictions, which are more of a proxy exposure measure than personal measurements. In epidemiological studies, ideally the measure of exposure should be as accurate as possible. In practice, however, this is usually not possible and the issue is to choose an appropriate exposure metric that balances the biological relevance, interpretability, and implications for public health policy. While using a proxy measure for air pollution exposure increases measurement error, it also brings important advantages for causal inference.

Some limitations must be acknowledged. First, we were not able to examine cause-specific mortality, which is not available in the Medicare data. Further studies investigating which specific causes are driving the deaths would provide a valuable addition. Second, spatial confounding inherent to proximity-based air pollution measurements could still be present given that ZIP Code was the finest geographical unit we could use to link air pollution levels with each beneficiary. Third, restricted by available data sources, we could not adjust for individual behavior and medical history because such information was not available in the Medicare enrollment data, which may contribute to residual confounding. Fourth, although air pollution levels were estimated from models with excellent out-of-sample prediction ability, they are not perfect and therefore may attenuate effect estimates [24].

Conclusions

In summary, this study simultaneously emulated D-R curves between chronic exposures to PM2.5, O3, NO2 and all-cause mortality among the national Medicare cohort during 2000–2016. We proposed a decile binning approach to relax the contentious assumptions of conventional IPW estimators, which yielded more robust causal evidence on adverse effects of air pollution exposure on mortality. Assuming that the IPW models were correctly specified, the estimated D-R curves reveal that in general, higher levels of chronic PM2.5, O3, and NO2 exposures were causally associated with a greater risk of mortality, even at levels below the national standards. Among the three pollutants, PM2.5 posed the greatest public health concern. The estimated D-R relations provide particularly significant implications for US EPA reviewing NAAQS, as the causal D-R curves essentially infer the number of potential lives saved if air pollution concentrations were reduced to specific levels. For example, lowering the air pollutant concentration from the 70th to 60th percentiles would prevent 65,935 early deaths among elders per year.

Availability of data and materials

The exposure data are available from the corresponding author on reasonable request. The Medicare data are available upon request to the Centers for Medicare and Medicaid Services. The other data are publicly available, with sources described in the manuscript.

Abbreviations

PM2.5: Ambient fine particulate matter

O3: Ozone

NO2: Nitrogen dioxide

US EPA: United States Environmental Protection Agency

NAAQS: National Ambient Air Quality Standards

D-R: Dose-response

IPW: Inverse probability weighting

ZCTA: ZIP Code Tabulation Areas

GBM: Gradient boosting machine

ppb: Parts per billion

COPD: Chronic obstructive pulmonary disease

Acknowledgements

Not applicable.

Funding

This publication was made possible by the United States Environmental Protection Agency (US EPA) grants RD-8358720 and RD-83587201-0. Its contents are solely the responsibility of the grantee and do not necessarily represent the official views of the US EPA. Further, the US EPA does not endorse the purchase of any commercial products or services mentioned in the publication. This publication was also made possible by National Institutes of Health (NIH) grants ES-000002, R01 ES024332–01, R01 MD012769, R01 ES028033, R21 ES024012, 1R01AG060232-01A1, 1R01ES030616, and 1R01AG066793-01R01, by Health Effects Institute (HEI) grant 4953-RFA14-3/16-4, by Alfred P. Sloan Foundation grant G-2020-13946, and by the Harvard University Climate Change Solutions Fund.

Author information

Contributions

Y.W. and J.S. designed the research and performed the analysis; M.D.Y., Q.D., W.J.R., F.D., and A.Z. prepared the data; and Y.W. and J.S. wrote the paper. All authors helped interpret the results and provided comments. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yaguang Wei.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the institutional review board at the Harvard T.H. Chan School of Public Health and was exempt from informed consent requirements as a study of previously collected administrative data.

Consent for publication

Not applicable.

Competing interests

Dr. Joel Schwartz serves as an expert witness for the United States Department of Justice in a case involving a Clean Air Act violation.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Wei, Y., Yazdi, M.D., Di, Q. et al. Emulating causal dose-response relations between air pollutants and mortality in the Medicare population. Environ Health 20, 53 (2021). https://doi.org/10.1186/s12940-021-00742-x

Keywords

  • Air pollution
  • Chronic exposures
  • Mortality
  • Causal modeling
  • Dose-response relations