Influence of exposure differences on city-to-city heterogeneity in PM2.5-mortality associations in US cities
© The Author(s). 2017
Received: 6 July 2016
Accepted: 23 December 2016
Published: 4 January 2017
Multi-city population-based epidemiological studies have observed heterogeneity between city-specific fine particulate matter (PM2.5)-mortality effect estimates. These studies typically use ambient monitoring data as a surrogate for exposure, leading to potential exposure misclassification. The level of exposure misclassification can differ by city, affecting the observed health effect estimate.
The objective of this analysis is to evaluate whether previously developed residential infiltration-based city clusters can explain city-to-city heterogeneity in PM2.5 mortality risk estimates. In a prior paper, 94 cities were clustered based on residential infiltration factors (e.g., home age/size, prevalence of air conditioning (AC)), resulting in 5 clusters. For this analysis, the association between PM2.5 and all-cause mortality was first determined in 77 cities across the United States for 2001–2005. Next, a second stage analysis was conducted to evaluate the influence of cluster assignment on heterogeneity in the risk estimates.
Associations between a 2-day (lag 0–1 days) moving average of PM2.5 concentrations and non-accidental mortality were determined for each city. Estimated effects ranged from −3.2 to 5.1%, with a pooled estimate of a 0.33% (95% CI: 0.13, 0.53) increase in mortality per 10 μg/m3 increase in PM2.5. The second stage analysis determined that cluster assignment was marginally significant in explaining the city-to-city heterogeneity. The health effect estimates in cities with older, smaller homes with less AC (Cluster 1) and in cities with newer, smaller homes with a large prevalence of AC (Cluster 3) were significantly lower than those in the cluster consisting of cities with older, larger homes with a small percentage of AC (Cluster 4).
This is the first study that attempted to examine whether multiple exposure factors could explain the heterogeneity in PM2.5-mortality associations. Cluster assignment explained a small portion (6%) of this heterogeneity.
Keywords: Particulate matter, Epidemiology, Exposure, Meta-regression, Cluster analysis
Multi-city population-based epidemiological studies of short-term PM2.5 exposures and mortality have provided evidence of heterogeneity in risk estimates between communities and cities [1, 2]. The inability to explain the city-to-city heterogeneity, both nationally and within a region, in PM2.5 mortality risk estimates observed in multi-city studies remains a key uncertainty in the examination of the relationship between short-term PM2.5 exposures and mortality. One potential reason for these differences is the use of ambient monitors, such as those reported in the United States Environmental Protection Agency’s Air Quality System, as a surrogate for exposure. These fixed-site monitors are often sited for regulatory and attainment purposes, so they are not optimal for obtaining representative exposures for individuals or groups of health concern. This may introduce bias into the observed risk estimates if the relationship between ambient monitor measurements and personal exposure estimates varies by city.
Exposures often vary in space and time due to an individual’s activities. For example, an individual’s exposure will vary depending on the location of their home, work or school, as well as their commuting patterns. People also spend the majority of their time indoors, where ambient PM2.5 can readily penetrate [5, 6]. This results in a substantial portion of an individual’s exposure to ambient PM2.5 occurring while indoors. Although individual activities vary within a day, ambient monitors are used to assign exposure in epidemiologic studies, and this has been demonstrated to be an appropriate exposure surrogate because ambient concentrations are well correlated with the ambient component of PM2.5 personal exposure. However, the composition of PM2.5 measured at an ambient monitor may differ from that in other near-source environments, such as inside vehicles. The observed city-to-city heterogeneity has not been clearly linked to any one PM2.5 component or source, nor has there been evidence that the city-specific relationship between ambient concentrations of PM2.5 components and gaseous pollutants with PM2.5 mass explains any city-to-city heterogeneity. This has led to the hypothesis that exposure patterns (i.e., indoor and outdoor) may explain some of the heterogeneity in risk estimates observed in PM2.5-mortality studies.
Factors that differ between communities could be significant effect modifiers of the PM2.5-mortality association and may explain some of the heterogeneity across cities. In a previous analysis, cities with similar exposure distributions were clustered based on residential infiltration and in-vehicle commuting characteristics. These characteristics included percent of homes with central air conditioning (AC), year home was built, home size, and in-vehicle commuting times and distances. The objective of this analysis is to determine whether these previously developed clusters can help explain city-to-city heterogeneity in PM2.5 mortality risk estimates. In this study, the association between PM2.5 and daily non-accidental mortality in 77 Core-Based Statistical Areas (CBSAs) is examined across the continental United States between 2001 and 2005. A CBSA is a geographic area that consists of one or more counties anchored by an urban center of at least 10,000 people plus adjacent counties that are socioeconomically tied to the urban center by commuting. For the remainder of this paper the term “city” will be used in place of CBSA. Effect modification by cluster will be examined to determine whether these city-varying characteristics result in differential mortality associations with PM2.5.
The association between daily PM2.5 concentrations and non-accidental mortality was determined for 77 cities across the continental United States for the years 2001–2005 using Poisson time-series models. The definitions for the 77 cities in this paper are based on the CBSAs of the White House Office of Management and Budget. Health events were sorted into multi-county metropolitan regions centered on existing air quality monitors according to either county of event or county of residence. Once city-specific risk estimates were obtained, cities were grouped into clusters based on the approach outlined in Baxter et al. Meta-regression was applied to examine the cluster assignment and the individual characteristics (percent of homes with central AC, mean year home was built, and mean home size) used to develop the clusters.
Individual-level mortality data for the entire U.S. from 2001 through 2005 were obtained through the National Center for Health Statistics from administrative systems for vital event records maintained by State and local health departments (http://www.cdc.gov/nchs/about.htm). All mortality data used in this study provide non-confidential information on individuals including state of death, county of death, age, gender, date of death and primary cause of death. For this study, we examined only those individuals who died of non-accidental causes (10th revision ICD codes (ICD10) U01-Y98 were excluded). Three age groups were defined: 0 to 64 years, 65 to 74 years and 75 years and older.
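The case selection and age grouping described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the record layout (date, age in years, ICD-10 code) and the function names are assumptions introduced here, and only the exclusion range (U01-Y98) and the three age groups come from the text.

```python
from collections import Counter


def is_non_accidental(icd10_code: str) -> bool:
    """Return True unless the code falls in the excluded range U01-Y98."""
    # Keep only the three-character category (e.g. "V43.5" -> "V43").
    category = icd10_code.strip().upper().replace(".", "")[:3]
    # Categories are one letter plus two digits, so plain string
    # comparison orders them the same way the ICD-10 chapters do.
    return not ("U01" <= category <= "Y98")


def age_group(age_years: int) -> str:
    """Map age at death to the three groups used in the analysis."""
    if age_years <= 64:
        return "0-64"
    if age_years <= 74:
        return "65-74"
    return "75+"


def daily_counts(records):
    """Count non-accidental deaths by (date, age group).

    records: iterable of (date, age_years, icd10_code) tuples
    (a hypothetical layout chosen for this sketch).
    """
    counts = Counter()
    for date, age, code in records:
        if is_non_accidental(code):
            counts[(date, age_group(age))] += 1
    return counts
```

The resulting counts by date and age group correspond to the aggregated outcome that is later matched to PM2.5 and meteorological data.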
Air pollution and meteorological data
The air pollution data were retrieved from the EPA’s Air Quality System Technology Transfer Network, which provides daily and hourly PM2.5 concentrations from the EPA’s National and State Local Ambient Monitoring Stations. There are typically multiple monitors located within a city, with some monitors providing integrated daily measurements and others providing continuous hourly measurements of PM2.5. We focused on integrated daily measurements since the mortality data were only available at a daily time resolution.
When more than one PM2.5 monitor was available for a city, it was necessary to determine which monitors represented the general population’s exposure over the desired area. First, all values from a given monitor that operated for less than 6 months or had fewer than 30 observations were deleted. Next, correlation coefficients were computed for each pair of monitors within the county. All values from monitors deemed uncorrelated with their neighboring monitors were deleted, since these monitors most likely measured a local pollution source that would not represent the general population exposure over the entire city. A monitor was considered uncorrelated if it had a correlation of <0.8 with the majority of the monitors within that city.
Where N is the total number of measured values within a given city for the entire time period, n is the number of monitors within a city, T is the total number of days a given monitor recorded measured values, and x_{t,i} is the daily PM2.5 value on day t at location i.
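The monitor screening rule described above (a minimum record length, followed by a pairwise correlation check) can be sketched as follows. This is an illustrative reading of the text, not the authors' implementation. Several details are assumptions made for the sketch: the input layout (a dictionary of day index to concentration per monitor), approximating "operated less than 6 months" by a span of fewer than 183 days between first and last observation, computing correlations on days common to both monitors, and reading "majority" as more than half of the other monitors. The city-average formula itself is not reproduced here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / sqrt(sxx * syy)


def screen_monitors(series, min_obs=30, min_span_days=183, min_corr=0.8):
    """Return the sorted ids of monitors retained for a city.

    series: {monitor_id: {day_index: pm25}} (hypothetical layout).
    """
    # Step 1: drop monitors with short or sparse records.
    kept = {
        m: obs
        for m, obs in series.items()
        if len(obs) >= min_obs and max(obs) - min(obs) + 1 >= min_span_days
    }
    # Step 2: drop monitors poorly correlated with most other monitors.
    dropped = set()
    for m, obs in kept.items():
        low = total = 0
        for other, other_obs in kept.items():
            if other == m:
                continue
            common = sorted(set(obs) & set(other_obs))
            if len(common) < 3:
                continue  # too little overlap to judge this pair
            r = pearson([obs[d] for d in common],
                        [other_obs[d] for d in common])
            total += 1
            if r < min_corr:
                low += 1
        if total and low > total / 2:
            dropped.add(m)
    return sorted(m for m in kept if m not in dropped)
```

Under this reading, a monitor dominated by a local source is removed because it falls below the 0.8 threshold against most of its neighbors, while the remaining monitors are unaffected by its presence.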
The methods used to identify clusters and the results of this analysis have been discussed in a previous paper. Briefly, in Baxter et al., clusters represented cities with similar exposure distributions based on residential infiltration and in-vehicle commuting characteristics. Factors related to residential infiltration and commuting were developed from the American Housing Survey from 2001 to 2005 for 94 cities. These cities all had populations over 100,000. Two separate cluster analyses using a k-means clustering algorithm were conducted to cluster cities based on these factors. The first analysis included only residential infiltration factors (i.e., percent of homes with central AC, mean year home was built, and mean home size), while the second incorporated both infiltration and commuting (i.e., mean in-vehicle commuting time and mean in-vehicle commuting distance) factors. The focus of this analysis is only on the clusters based on residential infiltration factors, as the combination of infiltration and commuting factors resulted in too many clusters with a small number of cities in each.
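For readers unfamiliar with k-means, the clustering step can be illustrated with a small standard-library sketch. It is not the procedure used in Baxter et al.: standardizing each factor to a z-score, the random initialization, and the function names are all assumptions made here for illustration, and the number of clusters is passed in rather than chosen by any criterion.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Convert each column to z-scores so no factor dominates by scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sds))
            for row in rows]


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct points as starting centroids
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(mean(c) for c in zip(*members))
    return labels
```

Each input row would hold one city's factors in the order (percent of homes with central AC, mean year built, mean home size); any values used to try the sketch are hypothetical rather than American Housing Survey data.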
Characteristics of residential infiltration factors by cluster (source: Baxter et al.)
Columns: Cluster 1 (N = 24), Cluster 2 (N = 5), Cluster 3 (N = 40), Cluster 4 (N = 18), Cluster 5 (N = 7), where N is the number of cities in the cluster
Rows: % of homes with central air conditioning; mean year home was built; mean size of home (sq ft)
Cluster 1 consisted of cities with older, smaller homes with less central AC, while homes in Cluster 2 cities were newer, larger, and more likely to have central AC. For the remaining clusters, Cluster 3 represents cities with a high prevalence of central AC and newer, smaller homes; Cluster 4 represents cities with a moderate prevalence of central AC and older, larger homes; and Cluster 5 represents cities with a low prevalence of central AC and older, larger homes. Additional file 1: Table S1 lists the cities by cluster.
Of the 94 cities examined in Baxter et al., only those cities with more than 500 observation days for PM2.5 were included in the current analysis, resulting in a total of 77 cities. Single-city PM2.5-mortality risk estimates for each of the 77 cities were determined using Poisson regression analysis. Mortality counts were aggregated by date of event and age group within each city, and matched with PM2.5 and meteorological data by date and city. These data were then analyzed to estimate the association between daily PM2.5 concentrations and daily mortality events while adjusting for time-varying confounders. Specifically, the single-city time-series models were fit using the glm function in R, assuming quasi-Poisson-distributed responses and the log link function. The linear predictor included a separate intercept for each age group; a linear effect of 2-day moving average (lag 0–1 days) PM2.5 concentration, chosen based on previous studies; a nonlinear effect of time represented by a natural spline with 6 degrees of freedom per year to account for seasonal and longer-term trends in mortality; and nonlinear effects of same-day temperature, 1-day lagged temperature, and dew-point temperature, represented in each case by a natural spline with 3 degrees of freedom.
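Written out, the single-city model described above takes roughly the following form. The symbols are notation introduced here for illustration and do not appear in the original text: Y_{t,a} is the death count on day t in age group a, ns(.) denotes a natural spline, and φ is the quasi-Poisson dispersion parameter. Treating dew-point temperature as a same-day term is an assumption, since the text does not state its lag. The last line is the usual conversion from a log-linear coefficient to a percent change per 10 μg/m3.

```latex
\begin{aligned}
\log \mathrm{E}[Y_{t,a}] &= \alpha_a + \beta\,\overline{\mathrm{PM}}_{t}
  + \mathrm{ns}(t;\ 6\ \text{df/year})
  + \mathrm{ns}(\mathrm{Temp}_{t};\ 3\ \text{df})
  + \mathrm{ns}(\mathrm{Temp}_{t-1};\ 3\ \text{df})
  + \mathrm{ns}(\mathrm{Dew}_{t};\ 3\ \text{df}) \\
\overline{\mathrm{PM}}_{t} &= \tfrac{1}{2}\left(\mathrm{PM}_{t} + \mathrm{PM}_{t-1}\right),
  \qquad \mathrm{Var}(Y_{t,a}) = \phi\,\mathrm{E}[Y_{t,a}] \\
\%\ \text{increase per } 10\ \mu\mathrm{g/m^3} &= 100\left[\exp(10\,\beta) - 1\right]
\end{aligned}
```

With five years of data, the time spline at 6 degrees of freedom per year corresponds to 30 degrees of freedom in total.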
The second stage of this analysis involved obtaining a pooled estimate using a meta-analysis of the city-specific estimates and their standard errors. The degree of heterogeneity in the pooled effect estimate over the cities was determined using a random effects meta-analysis to obtain the heterogeneity variance component. This component represents the true heterogeneity in response, above what would be expected from the stochastic variability in the estimates. To test for significant heterogeneity between city-specific risk estimates, which would indicate whether random effects were necessary in the meta-regression, a Q-statistic was computed. It was hypothesized that the aforementioned clusters could be significant effect modifiers of the PM2.5-mortality associations and may explain some of the city-to-city heterogeneity. Thus, meta-regression was applied using the clusters. In addition to using the clusters as the independent variable, the individual city characteristics (i.e., home age, home size, and presence of AC) used to develop the clusters were also examined. All data were managed using R and SAS 9.4.
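The pooling step can be sketched as below. The text does not say which estimator was used for the heterogeneity variance, so the DerSimonian-Laird moment estimator shown here is an assumption, as are the function name and the returned field names. The sketch returns the Q-statistic and its degrees of freedom but leaves the chi-square p-value to a statistics package.

```python
def random_effects_meta(betas, ses):
    """Pool city-specific estimates with a random effects model.

    betas: city-specific coefficients; ses: their standard errors.
    Uses the DerSimonian-Laird estimator of the heterogeneity
    variance (an assumption made for this sketch).
    """
    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / s ** 2 for s in ses]
    sum_w = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    df = len(betas) - 1

    # Heterogeneity variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each city's sampling variance.
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * b for wi, b in zip(w_re, betas)) / sum(w_re)
    se = (1.0 / sum(w_re)) ** 0.5
    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "df": df,
        "tau2": tau2,
    }
```

When Q does not exceed its degrees of freedom, the estimated heterogeneity variance is zero and the random effects result reduces to the fixed-effect pooled estimate.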
Summary statistics of ambient PM2.5 concentrations (μg/m3) by cluster
Percent increase in non-accidental mortality for a 10 μg/m3 increase in 24-h average PM2.5 concentrations, lag 0-1
A second stage analysis was conducted in which the betas for each individual city were weighted by their corresponding standard errors and regressed against the categorical cluster variable, resulting in a p-value of 0.07 and an R2 of 6%. This indicates that the cluster assignment was marginally significant in terms of explaining the city-to-city heterogeneity in PM2.5-mortality associations. Clusters were then contrasted pairwise to determine which differed significantly from one another. The health effect estimates in Cluster 1 and Cluster 3 were significantly lower than those in Cluster 4. In addition to examining cluster assignment, second stage analyses on the individual factors (home age, home size, and presence of AC) were also conducted. Home size was the only significant modifier, with larger homes resulting in larger risk coefficients.
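A simplified version of the cluster meta-regression is sketched below. It assumes inverse-variance weights and a fixed-effect weighted least squares fit, which may differ from the weighting and model actually used. With a single categorical predictor, the fitted value for each cluster is the weighted mean of its city estimates, so no matrix algebra is needed. The function name and return layout are hypothetical, and significance testing of the contrasts is omitted.

```python
from collections import defaultdict


def cluster_meta_regression(betas, ses, clusters):
    """Weighted regression of city estimates on cluster membership.

    Returns (weighted mean estimate per cluster, weighted R^2).
    Inverse-variance weighting is an assumption made for this sketch.
    """
    w = [1.0 / s ** 2 for s in ses]
    overall = sum(wi * b for wi, b in zip(w, betas)) / sum(w)

    # Accumulate weighted sums per cluster: [sum of w*beta, sum of w].
    sums = defaultdict(lambda: [0.0, 0.0])
    for b, wi, c in zip(betas, w, clusters):
        sums[c][0] += wi * b
        sums[c][1] += wi
    means = {c: swb / sw for c, (swb, sw) in sums.items()}

    # Weighted total and residual sums of squares.
    sst = sum(wi * (b - overall) ** 2 for wi, b in zip(w, betas))
    sse = sum(wi * (b - means[c]) ** 2
              for wi, b, c in zip(w, betas, clusters))
    r2 = 1.0 - sse / sst if sst > 0 else 0.0
    return means, r2
```

The difference between two entries of the returned means is the contrast between those clusters, and the R2 is the share of weighted between-city variation accounted for by cluster membership.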
There remains uncertainty as to the cause of the observed heterogeneity in city-to-city PM2.5-mortality associations in U.S.-based multi-city studies. Often this heterogeneity has been attributed to the regional and national variation in components of PM2.5; however, a clear difference in the air pollution mixtures has yet to be identified. A previous study examining city-to-city differences in ambient concentrations and the relationship between PM2.5 components and gaseous pollutants with PM2.5 mass did not provide any evidence of clear differences in ambient concentrations or sources between cities. Additionally, the evidence from epidemiologic and toxicological studies has not demonstrated that any one component or source is more strongly related to specific health outcomes [8, 18, 19].
Infiltration, defined as the fraction of the outdoor concentration that penetrates indoors and remains suspended, varies between cities, between homes, and over time within homes. Allen et al. developed models to predict infiltration based on behavioral factors, such as air conditioning use and window opening, that can vary seasonally. The wide variation observed in residential infiltration rates supports the hypothesis that city-to-city differences in the personal exposure-ambient monitor relationship may also be contributing to the observed heterogeneity in PM2.5 mortality risk estimates across cities. This has led to epidemiologic studies examining factors that may influence individual exposures, and ultimately associations, in studies of short-term air pollution exposures and various health outcomes (e.g., mortality), such as air exchange rate, infiltration rates, and air conditioning use. Each of these studies has provided initial information on the importance of accounting for factors that may influence individual exposure, but the overall air pollution exposures people encounter are dictated by a variety of factors, not just one at a time. To further expand upon some of these initial studies, Baxter et al. used information on individual exposure factors representative of infiltration and commuting to create clusters of cities with similar exposure profiles, so that it could be examined whether different exposure profiles help explain the city-to-city heterogeneity in PM2.5-mortality risk estimates.
Building off the work detailed in Baxter et al., the objective of this analysis was to determine whether the previously developed clusters help explain the city-to-city heterogeneity in PM2.5-mortality associations. Overall, we reported a 0.33% increase in non-accidental mortality for a 10 μg/m3 increase in previous 2-day moving average (lag 0–1) PM2.5 concentrations for 77 U.S. cities for the years 2001–2005. Cluster assignment was found to be marginally significant in explaining the heterogeneity in PM2.5-mortality associations. When comparing results between clusters, we observed evidence of significant differences in mortality associations only between Clusters 1 and 4 and between Clusters 3 and 4; the PM2.5-mortality associations in Clusters 1 and 3 were significantly smaller in magnitude than those in Cluster 4. As a sensitivity analysis, we performed a meta-regression on the individual factors that comprise the clusters and found that only home size appeared to explain the heterogeneity in the PM2.5-mortality associations, with larger associations in larger homes. However, in Baxter et al., home size was not well correlated with the other exposure factors included in the infiltration factor evaluations.
Upon closer examination, there are clear differences in the housing characteristics between clusters that appear to contribute to explaining the heterogeneity in PM2.5 mortality associations. Air exchange rates have been found to be higher in larger and older homes and in homes with less central AC due to more opening of windows, resulting in higher exposures to outdoor PM and associations larger in magnitude. Homes in Cluster 1 were on average 426 ft2 smaller, had 27% less central AC, and were of similar age compared with those in Cluster 4. Larger health effects were observed in Cluster 4, suggesting higher exposures in those homes. This is in agreement with the significant findings on home size in the sensitivity analyses, with larger home size associated with larger health effect estimates. Similarly, the underlying exposure profiles of Clusters 3 and 4 may help explain the difference in associations between the two clusters. Cluster 3 has a larger percentage of homes with central AC, as well as homes that are newer and smaller than those in Cluster 4. This difference between Clusters 3 and 4 would indicate that air exchange rates may be smaller in Cluster 3, resulting in lower exposures, which would subsequently result in associations smaller in magnitude for Cluster 3.
It is important to recognize that this study is subject to inherent limitations. One of the main limitations in the epidemiological analysis was the potential for exposure error from using an adjusted average of PM2.5 concentrations from a few monitors to characterize population exposure in each of the cities. However, PM2.5 is relatively spatially homogeneous, and studies of personal exposures have shown that temporal variability in outdoor PM2.5 concentrations is a good surrogate for temporal variability in personal PM2.5 exposures [26, 27]. There is also potential for exposure error because the exposure factors used to generate the clusters are surrogates rather than direct measurements of residential infiltration. Furthermore, while reductions in residential infiltration will reduce exposures to PM2.5 of ambient origin, they will also increase exposures to PM2.5 generated from indoor sources. This indoor PM2.5 may be independently associated with adverse health effects. Finally, a more thorough evaluation of potential differences between the five clusters examined in this study was limited by the small number of cities that comprised Clusters 2 and 5.
Overall, this is the first study that attempted to examine whether multiple exposure factors could explain the heterogeneity in PM2.5-mortality associations. Not surprisingly, the results of this study explain only some of this heterogeneity, which is most likely due to a variety of factors. In addition to the aforementioned differences in composition of the PM, variation in the city-specific PM2.5 mortality risk estimates could be due to differences in individual- or population-level characteristics between cities, such as the age distribution of the population and the proportion of the population within a given socioeconomic status [2, 28]. In conclusion, this study demonstrates that multiple exposure factors should be considered in future endeavors to elucidate the underlying cause of the observed heterogeneity in PM2.5 mortality associations.
CBSA: Core-based statistical area
PM2.5: Fine particulate matter
The authors would like to thank Lucas Neas of the U.S. EPA’s National Health and Environmental Effects Research Laboratory. The authors would also like to thank Ana Rappold of the U.S. EPA’s National Health and Environmental Effects Research Laboratory and Tom Luben of the U.S. EPA’s National Center for Environmental Assessment for their review of this paper.
Availability of data and materials
The data will be available on the United States Environmental Protection Agency’s ScienceHub (http://sciencehub.epa.gov/sciencehub).
LB developed the design of the study, performed the meta-regression analyses, and drafted the manuscript; JC determined the health effect estimates and revised the manuscript critically for important intellectual content; JS assisted in the interpretation of the data and revised the manuscript critically for important intellectual content. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
The views expressed in this article are those of the authors and do not necessarily reflect the views or policies of the U.S. Environmental Protection Agency.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
1. Dominici F, Peng RD, Bell ML, Pham L, McDermott A, Zeger SL, Samet JM. Fine particulate air pollution and hospital admission for cardiovascular and respiratory diseases. JAMA. 2006;295(10):1127–34.
2. Franklin M, Zeka A, Schwartz J. Association between PM2.5 and all-cause and specific-cause mortality in 27 US communities. J Expo Sci Environ Epidemiol. 2007;17(3):279–87.
3. Baxter LK, Sacks JD. Clustering cities with similar fine particulate matter exposure characteristics based on residential infiltration and in-vehicle commuting factors. Sci Total Environ. 2014;470–471:631–8.
4. Ozkaynak H, Baxter LK, Dionisio KL, Burke J. Air pollution exposure prediction approaches used in air pollution epidemiology studies. J Expo Sci Environ Epidemiol. 2013;23(6):566–72.
5. Allen R, Larson T, Sheppard L, Wallace L, Liu LJS. Use of real-time light scattering data to estimate the contribution of infiltrated and indoor-generated particles to indoor air. Environ Sci Tech. 2003;37(16):3484–92.
6. Sarnat JA, Long CM, Koutrakis P, Coull BA, Schwartz J, Suh HH. Using sulfur as a tracer of outdoor fine particulate matter. Environ Sci Tech. 2002;36(24):5305–14.
7. United States Environmental Protection Agency. Integrated Science Assessment for Particulate Matter. Research Triangle Park: Office of Research and Development, National Center for Environmental Assessment-RTP Division; 2009.
8. Stanek LW, Sacks JD, Dutton SJ, Dubois J-JB. Attributing health effects to apportioned components and sources of particulate matter: An evaluation of collective results. Atmos Environ. 2011;45(32):5655–63.
9. Baxter LK, Duvall RM, Sacks JD. Examining the effects of air pollution composition on within region differences in PM2.5 mortality risk estimates. J Expo Sci Environ Epidemiol. 2013;23(5):457–65.
10. Schwartz J. Assessing confounding, effect modification, and thresholds in the association between ambient particles and daily deaths. Environ Health Perspect. 2000;108(6):563–8.
11. Geographic Terms and Concepts - Core Based Statistical Areas and Related Statistical Areas [https://www.census.gov/geo/reference/gtc/gtc_cbsa.html]. Accessed 24 Oct 2016.
12. White House Office of Management and Budget. Revised Delineations of Metropolitan Statistical Areas, Micropolitan Statistical Areas, and Combined Statistical Areas, and Guidance on Uses of the Delineations of These Areas. 2013.
13. United States Environmental Protection Agency, Air Quality System Data Mart [http://www.epa.gov/ttn/airs/aqsdatamart]. Accessed Nov 2012-Feb 2014.
14. National Oceanic and Atmospheric Administration, National Climatic Data Center [http://www.ncdc.noaa.gov/oa/ncdc.html]. Accessed Jan 2014.
15. Franklin M, Koutrakis P, Schwartz J. The role of particle composition and the association between PM2.5 and mortality. Epidemiology. 2008;19(5):680–9.
16. R Development Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2015.
17. SAS Institute Inc. SAS 9.4 Help and Documentation. Cary: SAS Institute Inc.; 2002–2013.
18. Vedal S, Campen MJ, McDonald JD, Larson TV, Sampson PD, Sheppard L, Simpson CD, Szpiro AA. National Particle Component Toxicity (NPACT) initiative report on cardiovascular effects. Boston: Health Effects Institute; 2013.
19. Lippmann M, Chen L, Gordon T, Ito K, Thurston GD. National Particle Component Toxicity (NPACT) initiative: integrated epidemiologic and toxicological studies of the health effects of particulate matter components. Boston: Health Effects Institute; 2013.
20. Chen C, Zhao B. Review of relationship between indoor and outdoor particles: I/O ratio, infiltration factor and penetration factor. Atmos Environ. 2011;45(2):275–88.
21. Allen RW, Adar SD, Avol E, Cohen M, Curl CL, Larson T, Liu L-JS, Sheppard L, Kaufman JD. Modeling the residential infiltration of outdoor PM2.5 in the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air). Environ Health Perspect. 2012;120(6):824–30.
22. Sarnat JA, Sarnat SE, Flanders WD, Chang HH, Mulholland J, Baxter L, Isakov V, Ozkaynak H. Spatiotemporally resolved air exchange rate as a modifier of acute air pollution-related morbidity in Atlanta. J Expo Sci Environ Epidemiol. 2013;23(6):606–15.
23. Dai L, Zanobetti A, Koutrakis P, Schwartz JD. Associations of fine particulate matter species with mortality in the United States: a multicity time-series analysis. Environ Health Perspect. 2014;122(8):837–42.
24. Breen MS, Schultz BD, Sohn MD, Long T, Langstaff J, Williams R, Isaacs K, Meng QY, Stallings C, Smith L. A review of air exchange rate models for air pollution exposure assessments. J Expo Sci Environ Epidemiol. 2014;24(6):555–63.
25. Johnson T, Long T. Determining the frequency of open windows in residences: a pilot study in Durham, North Carolina during varying temperature conditions. J Exp Anal Environ Epidemiol. 2005;15(4):329–49.
26. Sarnat JA, Koutrakis P, Suh HH. Assessing the relationship between personal particulate and gaseous exposures of senior citizens living in Baltimore, MD. J Air Waste Manage Assoc. 2000;50(7):1184–98.
27. Sarnat JA, Brown KW, Schwartz J, Coull BA, Koutrakis P. Ambient gas concentrations and personal particulate matter exposure: implications for studying the health effects of particles. Epidemiology. 2005;16(3):385–95.
28. Ostro BD, Feng W-Y, Broadwin R, Malig BJ, Green RS, Lipsett MJ. The impact of components of fine particulate matter on cardiovascular mortality in susceptible subpopulations. Occup Environ Med. 2008;65(11):750–6.