Mortality data
We obtained Medicare enrollment records for beneficiaries aged 65 years and above residing in the contiguous US between 2000 and 2016 from the Centers for Medicare and Medicaid Services, with all-cause mortality as the study outcome. For each beneficiary, we extracted their demographic information (sex, race, age at initial enrollment), Medicaid eligibility, ZIP Code of residence, year of initial enrollment, and year of death if it occurred during the study period. We constructed an open cohort with person-years of follow-up in which each beneficiary was followed each year from the study entry until the end of study, drop out of the cohort, or death, whichever occurred earliest. Note that the same data format has been used to fit time-varying Cox proportional hazard models [6].
Exposure assessment
The daily concentrations of ambient PM2.5, O3, and NO2 at 1 km × 1 km grid cells across the contiguous US were predicted and validated using hybrid models that ensembled predictions from random forest, gradient boosting, and neural network. Multiple predictor variables were incorporated in the predictions, including ground monitoring data, satellite data, meteorological conditions, land-use variables, and chemical transport model simulations, etc., with details published elsewhere [25,26,27].
These high-resolution predictions at 1 km × 1 km grid cells allow us to estimated ZIP Code-level exposure levels with a high degree of accuracy, with annual R2 on held out monitors of 0.89 for PM2.5, 0.86 for O3, and 0.84 for NO2. There are two major types of ZIP Codes in the US: standard ZIP Code and PO Box. Because a standard ZIP Code represents a delivery area, we used the polygon layer generated by Environmental Systems Research Institute (Esri) [28], and estimated the ZIP Code’s daily concentrations by averaging the predictions at grid cells whose centroid points were inside the polygon of that ZIP Code. For PO Box, because it is used only for a given facility and therefore can be represented by a single point, we estimated its daily concentrations by linking it to the nearest grid cell.
The exposures of interest were assessed based on the ZIP Code-level estimates. For PM2.5 and NO2, we defined their chronic exposures as annual average concentrations. For O3, following previous literature [5, 6], we defined the chronic exposure as the average concentration during warm season (April–September) of the year. We assigned the chronic exposures to PM2.5, O3, and NO2 to each person-year based on that person’s ZIP Code of residence and calendar year.
Covariate information
Meteorological variables including daily air temperature and humidity at 2 m above the ground were extracted from Phase 2 of the North American Land Data Assimilation System, with 12 km × 12 km resolution across the continental US [29]. The average temperatures during warm (April–September) and cold seasons (January–March plus October–December) of each year were calculated from the daily data because both exceedingly low and high temperatures were physically stressful and were also associated with air pollution levels [30, 31]. Annual average humidity was calculated from the daily data on a yearly basis. ZIP Code Tabulation Areas (ZCTA)-level socioeconomic variables, including the percentage of Blacks, percentage of Hispanics, median household income, median value of owner occupied housing, percentage of Americans aged 65 and older living below the poverty threshold, percentage of Americans with less than high school education, percentage of owner occupied housing units, and population density, etc., were obtained from 2000 and 2010 US Census and the American Community Survey [32]. These variables were linearly extrapolated by year to account for the time varying nature of socioeconomic status. County-level behavioral variables, including body mass index (BMI) and percentage of ever smokers for each year, were obtained from the Behavioral Risk Factor Surveillance System [33]. From the Dartmouth Atlas of Health Care [34], we obtained percentage of Medicare participants who had a hemoglobin A1c test, a low-density lipoprotein cholesterol (LDLC) test, a mammogram, and an eye exam to a primary care physician for each year in each hospital catchment area in the US and assigned it to all ZCTAs in that area. We also computed the distance from each ZIP Code centroid to the nearest hospital. These variables were linked to each person-year by ZIP Code of residence and calendar year. Summary statistics of the covariates are provided in Section 5 of Supplementary Information.
A decile binning approach to emulate causal D-R relations
To emulate the causal D-R relationship, we need a counterfactual framework. For a binary exposure, the causal estimate in a population of interest comes from the difference between the counterfactual outcome under which all the members of the population had been exposed versus the counterfactual outcome had they not been exposed, thus no confounding occurs [35]. In randomized experiments, counterfactuals are constructed by randomly assigning individuals to treatment groups to ensure that exposure is independent of all potential confounders. In observational studies, however, the exposure assignment is not random but instead is considered to be influenced by subject characteristics, and causal methods seek ways to approximate counterfactuals with reference to the observed population [36]. Inverse probability weighting (IPW), for example, is a formal causal modeling technique and is increasingly being used in observational studies [37, 38]. For a binary exposure, it uses quasi-experimental design to construct a “pseudo-population” by weighting the population by the inverse probability of the observed exposure given all measured confounders. The “pseudo-population” is then used to estimate the exposure effect. If the systematic difference of characteristics among the exposed and unexposed is adequately adjusted so that the two groups are comparable with respect to any confounders, a causal conclusion is warranted [35, 39].
But air pollution exposure is continuous in nature. Estimating the inverse probability weights in the continuous setting is challenging as it needs to 1) correctly specify the distributional form of exposure, 2) deal with non-constant variance (heteroscedasticity), and 3) avoid excessively large or small weights for outliers that are more likely to occur [40, 41]. For these reasons, in this section we proposed a decile binning approach to emulate the causal D-R relations between chronic air pollution exposures and mortality by dividing each exposure into ten equal-sized groups by deciles, treating the lowest decile group (i.e., 10% of the study population with the lowest exposure levels) as the reference, and estimating the effects for the other groups compared to the reference. This relaxes the strong assumptions of distribution form and homoscedasticity for continuous exposure by relying solely on deciles. In addition, binning data makes the inverse probability weights robust against outliers [42].
We had a dataset with person-year representations of follow-up which allowed for time-varying exposures and covariates. To reduce the computational burden, first we aggregated the person-years with the same sex, race, age, Medicaid eligibility, living in the same ZIP Code of residence and in the same year. We treated them as a single record because those person-years had identical values for all exposures and covariates and thus could be treated interchangeably in the analysis. As a result, we retained all the information yet significantly reduced the size of the data in which each observation represented a stratum of combination of sex, race, age, and Medicaid eligibility per ZIP Code of residence per year. Numbers of deaths and person-years were cumulated for each stratum.
For each exposure, the analysis under the counterfactual framework was composed of two stages: a design stage where a randomized “pseudo-population” was constructed by weighting the observed population by the inverse probability of exposure given all measured confounders, and an analysis stage where the treatment effect was estimated among the constructed “pseudo-population” [43]. In the first stage, we binned the exposure into ten equal-sized categories based on deciles. The stabilized inverse probability weight (swij) for stratum j in exposure category i was defined as:
$$ {sw}_{ij}=\frac{P\left(X\in i\right)}{expit\left(g\left({X}_{ij};{n}_{ij}\ |\ \boldsymbol{C}\right)\right)} $$
where P(X ∈ i) denotes the probability of any observed exposure X being in group i, which equals to 0.1; expit(∙) denotes the inverse logistic link function where \( expit(x)=\frac{\mathit{\exp}(x)}{1+\mathit{\exp}(x)} \); and g(∙) the gradient boosting machine (GBM) model with logistic loss function for predicting the probability of the observed categorized exposure Xij given the set of confounders C, weighted by nij, the number of person-years aggregated in the stratum. The use of GBM for estimating the probability of observed exposure has demonstrated a better predictive accuracy compared to the conventional logistic regression, as it captures nonlinearity and interactions of confounders and is unaffected by the potential autocorrelation [44, 45]. The confounder set C includes the other two concurrent exposures, calendar year, the individual-level variables (sex, race, 5-year age group, and Medicaid eligibility), and the area-level meteorological, socioeconomic, behavioral, and medical access variables as detailed in the previous section. The numerator, P(X ∈ i), is used to stabilize the variability of weights to avoid excessively upweighting or downweighting observations [40].
In the second stage of the analysis, for each exposure, we fitted a log linear regression relating the number of deaths and factored exposure category, weighted by the stabilized inverse probabilities estimated from the first stage. We used quasi-Poisson link function to account for overdispersion in the number of death, and included an offset of the number of person-years to account for the different population size at risk in each stratum. As a result, we obtained the marginal effect of each decile group on mortality. If the model for estimating the stabilized inverse probability weight is correctly specified, we have achieved an unbiased estimator of the causal effect for each group [41].
The results are expressed as the relative risks (RR) of mortality for higher decile groups against the lowest-decile group (reference). The number of early deaths avoided by lowering air pollutant concentration of a higher to lower decile group can be calculated as \( N{\alpha}_0\left(\frac{RR_{high}-1}{RR_{high}}-\frac{RR_{low}-1}{RR_{low}}\right) \), where N is the annual averaged number of person-years, α0 is the baseline annual mortality rate, and RRhigh and RRlow are the relative risks of mortality for the higher decile group and the lower decile group, respectively. More details are provided in Section 4 of Supplementary Information.
We assessed the robustness of the causal dose-response relations between the chronic air pollution exposures and mortality risk by conducting sensitivity analysis of splitting each exposure into 14 bins.