Spatio-temporal modeling of particulate air pollution in the conterminous United States using geographic and meteorological predictors

Background Exposure to atmospheric particulate matter (PM) remains an important public health concern, although it remains difficult to quantify accurately across large geographic areas with sufficiently high spatial resolution. Recent epidemiologic analyses have demonstrated the importance of spatially- and temporally-resolved exposure estimates, which show larger PM-mediated health effects as compared to nearest monitor or county-specific ambient concentrations. Methods We developed generalized additive mixed models that describe regional and small-scale spatial and temporal gradients (and corresponding uncertainties) in monthly mass concentrations of fine (PM2.5), inhalable (PM10), and coarse mode particle mass (PM2.5–10) for the conterminous United States (U.S.). These models expand our previously developed models for the Northeastern and Midwestern U.S. by virtue of their larger spatial domain, their inclusion of an additional 5 years of PM data to develop predictions through 2007, and their use of refined geographic covariates for population density and point-source PM emissions. Covariate selection and model validation were performed using 10-fold cross-validation (CV). Results The PM2.5 models had high predictive accuracy (CV R2=0.77 for both 1988–1998 and 1999–2007). While model performance remained strong, the predictive ability of models for PM10 (CV R2=0.58 for both 1988–1998 and 1999–2007) and PM2.5–10 (CV R2=0.46 and 0.52 for 1988–1998 and 1999–2007, respectively) was somewhat lower. Regional variation was found in the effects of geographic and meteorological covariates. Models generally performed well in both urban and rural areas and across seasons, though predictive performance varied somewhat by region (CV R2=0.81, 0.81, 0.83, 0.72, 0.69, 0.50, and 0.60 for the Northeast, Midwest, Southeast, Southcentral, Southwest, Northwest, and Central Plains regions, respectively, for PM2.5 from 1999–2007). Conclusions Our models provide estimates of monthly-average outdoor concentrations of PM2.5, PM10, and PM2.5–10 with high spatial resolution and low bias. Thus, these models are suitable for estimating chronic exposures of populations living in the conterminous U.S. from 1988 to 2007.


Background
Understanding the health impacts resulting from exposure to atmospheric particulate matter (PM) air pollution remains a priority for environmental public health. The physical and chemical characteristics of PM affect its relevance to human health, as demonstrated by the observed differences in behavior, composition, and health impacts for fine (PM<2.5 μm in aerodynamic diameter: PM 2.5 ) and coarse (2.5<=PM<10 μm in aerodynamic diameter: PM 2.5-10 ) particles [1][2][3]. These differences make it important to examine the health effects of PM using exposure assessment methods able to capture variation in the levels of each PM size fraction across the spatial and temporal scales relevant to health outcomes, especially when studies are conducted over large geographic areas. Traditionally, however, epidemiologic studies of the chronic health effects of PM air pollution have used crude methods to assess particulate exposures, estimating subject's chronic exposure either by imputing ambient concentrations from the nearest monitor or by using area-wide averages [2], thus ignoring within-city spatial gradients in air pollutant levels and restricting these studies to areas with nearby monitoring data.
In our previous work, we developed and validated spatiotemporal generalized additive mixed models (GAMMs) of outdoor PM 2.5 and PM 10 levels for the Northeastern and Midwestern U.S. that included geographic information system (GIS)-based time-invariant spatial covariates and time-varying covariates such as meteorological data [11][12][13]. We showed that PM 2.5 , PM 10 , and PM 2.5-10 levels were estimated with a high degree of accuracy (predicted values did not display bias, on average, in comparison with measured values) and precision (predicted values were strongly correlated with observed values) and that the models were able to account for both within-and betweencity variation in PM concentrations. When our models were used to assess chronic PM exposures in epidemiological studies, we found higher health risks than when simpler exposure assessment approaches were used [11,25,26], likely due to the models' ability to reduce exposure error by estimation of within-city variability in PM levels, specifically in traffic-related PM.
Our previous GIS-based spatio-temporal exposure models used for the Nurses' Health Study [11][12][13] were developed only for the Northeastern and Midwestern U.S. and for 1988-2002. In the present analysis, we expand the modeling domain to the conterminous U.S. and include PM 2.5 and PM 10 monitoring data through 2007. We demonstrate the predictive accuracy of a computationally efficient but flexible spatio-temporal modeling approach, applied to the conterminous U.S., which combines spatial smoothing and regionally-varying non-linear smooth functions of time-varying and time-invariant geographic and meteorological covariate effects. Also, we evaluate the potential for improved model prediction resulting from the use of geographic covariates with higher spatial resolution than those used previously for traffic density, population density, and point-source emissions density.

Methods
We developed three separate GIS-based spatio-temporal models of PM levels: 1) PM 2.5 from 1999-2007, 2) PM 2.5 from 1988-1998, and 3) PM 10 from 1988-2007. As with our previous models for the Northeastern and Midwestern U.S. [11][12][13], these models used measured PM concentrations, monitoring site locations, GIS-based locationspecific characteristics and location-and month-specific meteorological data, and spatial smoothing of monthly and long-term average levels to describe large-and smallscale spatial variability and temporal variability in PM 2.5 and PM 10 levels over time.
Air pollution, geographic, and meteorological data Air pollution data Monthly mean PM 2.5 and PM 10 values were calculated from available monitoring data using the same methods as for our previous models [12,13] [27], from three Harvard-based research studies: the "Five Cities" study [28], the "24 Cities" study [29], and the "Six Cities" study [30], and from the Southern Aerosol Research and Characterization Study (SEARCH) network [31]; summary statistics on the monitoring data can be found in Additional file 1: Table S1. Monthly means were calculated by first averaging 24-hr mean values at each monitoring site, and then averaging the daily (with the exception of CASTNet which provided 2-week means) site means within the calendar month, provided that greater than approximately 70% of the nominal days had valid PM 2.5 or PM 10 values. The AQS contributed the bulk of the monthly means and sites (94 and 91%, respectively, for the 1999-2007 PM 2.5 model; 89 and 86%, respectively, for the 1988-1998 PM 2.5 model; and 93 and 89%, respectively, for the PM 10 model).

Geographic data
Characteristics of the PM monitoring sites were quantified using a GIS (ArcMap 10.1, Environmental Systems Research Institute (ESRI), Redlands, CA). We considered only geographic data available (i.e., non-missing) over the conterminous U.S. to facilitate generating model predictions at any location within this domain. The Albers Equal Area Conic U.S. Geological Survey (USGS) projection was used for all geographic data.
We estimated traffic density using data from the U.S. Bureau of Transportation Statistics 2005 National Highway Planning Network (NHPN) [32] using a kernel density function (ESRI Spatial Analyst) evaluated on a 30 m cell size raster. The kernel density approach involves deriving locally varying values by applying weights from a quadratic kernel within a specified neighborhood [33]. The neighborhood for this function was specified at 100 m based on data from previous studies of near-road pollutant decay [34,35]. Distance to nearest road values were also generated for each monitoring site for U.S. Census Feature Class Code (CFCC) road classes A1 (primary roads, typically interstates, with limited access), A2 (primary major, non-interstate roads), A3 (smaller, secondary roads, usually with more than one lane in either direction), and A4 (roads used for local traffic usually with one lane in either direction) roads using ESRI StreetMap Pro 2007 road network data. Distance to road values were truncated at 500 m; as a result this term represented only micro-to middle-scale local variability in PM levels near roadways.
The proportion of residential (low-intensity and highintensity) and urban (low-intensity and high-intensity residential, and industrial/commercial/transportation) land use was calculated for each location using neighborhoods of 1 and 4 km, using data from the U.S. Geological Survey (USGS) 1992 National Land Cover Dataset [36]. Tract-level population density data derived from the 1990 U.S. Census were obtained [37] and converted to a 500 m cell-size raster, based on the location of the center of each cell. Density values at each cell were averaged with the values at four adjacent cells, one in each cardinal direction, to reduce spatial discontinuities across cells. County-level population density data from the 1990 U.S. Census were obtained from ESRI Data & Maps and were spatially smoothed from county geographic centroids to prediction locations using a generalized additive model (GAM) with spatial bivariate thin-plate penalized splines [38].
We estimated the density of point-source emissions of PM 2.5 and PM 10 using kernel density functions (ESRI Spatial Analyst) with neighborhoods of 3, 7.5, and 15 km and data from the U.S. EPA's 2002 National Emissions Inventory [39]. In our earlier work, 1 and 10 km buffers were used [11][12][13]. Larger neighborhoods were chosen for this analysis to reflect more distant sources; however, values at greater distances were down-weighted due to use of the kernel density function. Also, neighborhoods with<=50% overlap were chosen, to minimize collinearity.
Elevation data were obtained in raster format from the USGS's National Elevation Dataset [40] (with a native resolution of~30 m) and averaged using a moving window with a radius of 300 m.
The traffic density within 100 m, distance to nearest road, tract-and county-level population density, and point-source emissions density covariates were natural-log transformed, after the addition of a constant, to obtain a more uniform distribution and thereby improve stability in the estimation of the penalized spline smooth functions, using the formula: where Z i. t is the transformed covariate, Z* i,t is the covariate on the native scale, and the constant 10 was chosen to reduce the leverage of values near zero. Similarly, the elevation covariate was transformed using a square root transformation after adding a constant to ensure a minimum value of one.

Meteorological data
Monthly average wind speed, temperature, and total precipitation measurements were obtained from the National Climatic Data Center (NCDC) and spatially smoothed using separate GAMs, as specified below, for each month and for each of seven regions of the conterminous U.S. (Figure 1), with region boundaries based loosely on the U.S. Census Regions [41]. Monthly predictions of the meteorological parameters at the prediction locations (monitoring sites, grid points, or geocoded subject residences) were then made using the fitted models. The form of these models was: where y i,t represents the measured values for a given meteorological parameter at i = 1… I r sites in each of seven geographical regions indexed by r (Northeast, Midwest, Southeast, Southcentral, Southwest, Northwest, and Central Plains; Figure 1) and t = 1…T monthly time periods (T =240 for 1988-2007), and s i is the projected spatial coordinate pair for the ith location. g t,r (s i ) accounts for residual monthly spatial variability within the region, specified as spatial bivariate thin-plate penalized spline terms with basis dimension k t,r = I t,r * 0.9. The value of 0.9 was chosen such that the basis dimension was as large as practicable which allowed the data to determine the complexity of the fitted functions, but was essentially arbitrary.
To reduce the potential for over-fitting, a multiplier of 1.4 (using the gamma argument to gam()), as recommended by Wood [39], p. 195, was used. Additionally, data on the percentage of stagnant air days per month from the NCDC's Air Stagnation Index [42] were obtained, naturallog transformed, and spatially smoothed using GAMs (Equation 2) for each month and region.

model
The generic form of the 1999-2007 PM 2.5 model was: where y i,t is the natural-log transformed monthly average PM 2.5 for i = 1…I r sites in each of the seven geographic regions indexed by r ( Figure 1) and I sites in total and t = 1…T monthly time periods (T=108 for PM 2. 5 from 1999-2007), and s i is the projected spatial coordinate pair for the ith location. X i,q are GIS-based timeinvariant spatial covariates for q = 1…Q, Z i,t,p are timevarying covariates for p = 1…P, and α t,r is a monthly intercept that represents the mean across all sites within the region. d q and f p,r are one-dimensional penalized spline smooth functions for Q GIS-based time-invariant spatial covariates and P time-varying meteorological covariates, respectively, each with a basis dimension of 10. g t,r (s i ) accounts for residual monthly spatial variability within the region, and g(s i ) for time-invariant spatial variability across the conterminous U.S., with both terms specified as spatial bivariate thin-plate penalized splines with basis dimension values: k t,r = I t,r * 0.9 and k = (I − Q) * 0.9, respectively. The site-specific random effect b i represents unexplained site-specific variability; thus our characterization of the model as a GAMM. We used a two-stage modeling approach to fit the above model (Equation 3). In the first stage (Equation 4), we estimated site-specific intercepts (u i ) adjusting for timevarying covariates and residual monthly spatial variability separately for each of the seven geographic regions. This allowed the effects of time-varying covariates to vary among the regions, and assumed that the residual monthly spatial terms were stationary and isotropic only within the region rather than across the entire conterminous U.S. Fitting the first stage regionally, rather than for the entire conterminous U.S. at once, also reduced the computational burden of model fitting, necessary due to the large number of monthly observations (120,618 for PM 2.5 from 1999-2007). Data from areas of adjacent states within about 400 km of each region were included in the regional first-stage models to minimize potential boundary effects. The first stage model equation was: and was fit iteratively for each region in a back-fitting arrangement [43,[11][12][13] with u i þ α t;r þ X p f p;r Z i;t;p À Á estimated jointly and g t,r (s i ) estimated separately by month, such that variability in the measured concentrations is parsed between the covariates and the residual spatial terms. For the spatial models in the first stage, a multiplier of 1.4 (using the gamma argument to gam()) was used to avoid over-fitting [39], p. 195, except in the Northwest region, where, due to limited data, a value of 1.8 was used. All individual fits within the back-fitting were done using the gam() function in the mgcv package [44] of R [45].
In the second stage, we fit a spatial model to the û i vector of values obtained from the regional first-stage models using GIS-based time-invariant spatial covariates and residual time-invariant spatial variability. To do this, we combined the regional data sets from the first stage, after eliminating the overlapping data from the 400 km regional buffers. Thus the second stage (Equation 5) was fit to data from the entire conterminous U.S. (i.e., all seven regions), and was: where uˆi is an estimated site-specific intercept that represents the adjusted long-term mean at each location; the other terms are as above. The second stage was also fit using the gam() function in the mgcv package of R. Because we included data from the entire conterminous U.S. in the second stage, we investigated the extent to which the time-invariant covariate effects varied by region of the country. We did this by including interaction terms by region, adding: ately to the model for each covariate q, where α r is a categorical variable for the main effect of region, and M i is a zero/one indicator for whether location i is in a given region or not. We also explored regional interactions of covariate effects that varied smoothly in space using tensor products of penalized smoothing spline bases [39].
The 1988-1998 PM 2.5 model As in our previous work [13], the generic form of the 1988-1998 PM 2.5 model was: where the terms are as above except that the response variable y i,t is the natural-log transformed ratio of monthly average PM 2.5 to model predicted PM 10 for 1988-1998. We calculated PM 2.5-10 levels at unmeasured locations and months by subtracting predicted PM 2.5 from predicted PM 10 (PM 2:5−10 i;t ¼ PM 10 i;t −PM 2:5 i;t , notation as above).
We generated estimates of uncertainty in model predictions (i.e., standard errors) from the 1999-2007 PM 2.5 and 1988-2007 PM 10 models on the natural-log scale using methods described previously [11,12]. For these models, 95% prediction intervals on the natural-log scale were calculated and exponentiated to assess prediction interval coverage. To generate standard errors for the 1988-1998 PM 2.5 model, we propagated errors in the predicted PM 2.5 :PM 10 ratio and predicted PM 10 levels on the native scale (see Additional file 1 for details). We also propagated errors in the PM 2.5-10 predictions on the native scale using standard methods, assuming independence among the PM 2:5 and PM 10 errors. For 1988-1988 PM 2.5 model predictions as well as for PM 2.5-10 predictions, prediction interval coverage was assessed using 95% prediction intervals based on these nativescale standard errors.

Model validation
We used 10-fold out-of-sample cross-validation (CV) to evaluate model predictive accuracy and thereby inform covariate selection. For the 1999-2007 PM 2.5 and 1988-2007 PM 10 models, monitoring sites were selected at random and assigned exclusively to one of 10 sets. Because few PM 2.5 data were available prior to 1999, we used data from the year 2000 for CV of the 1988-1998 PM 2.5 model. To do this, we first identified a subset of sites that reported at least 10 monthly PM 2.5 values in 2000 and at least 70 monthly PM 2.5 values across 1988-2007. We then randomly selected from among these sites data not to be used for CV (ensuring reasonable spatial coverage within each region by manipulating the random seed), with the goal of making the data for 2000 similar to that in years prior to 1999 for the purpose of model fitting. We subsequently divided the remaining monitoring sites that reported data in 2000 at random and assigned each site exclusively to one of 10 sets. Since the covariate selection process involved fitting multiple candidate models to the same data, set 10 was reserved (i.e., not used for model fitting) to assess whether the covariate selection process contributed to over-fitting. Data from sets one to nine (each set contains approximately 10% of the data for the 1999-2007 PM 2.5 and 1988-2007 PM 10 models) were removed from the data set sequentially, with the model fit to the remaining data and model predictions generated at the locations and months of the left-out observations.
The predictive accuracy of each PM model was determined from the squared Pearson correlation between the monthly left-out observations and model predictions (CV R 2 ), with both on the native rather than the natural-log scale. Spatial CV R 2 values were calculated similarly but on the long-term means (i.e., one mean per site) of the monthly values. Prediction errors were calculated by subtracting left-out observations from the model predictions. Bias in model predictions was determined using the normalized mean bias factor (NMBF) [Shaocai Yu, personal communication] and the slope from major-axis linear regression [46] of the natural-log transformed left-out observations against the natural-log transformed model predictions. The precision of model predictions was obtained by taking the mean of the absolute value of the prediction errors (CVMAE) and using the normalized mean error factor (NMEF) [Shaocai Yu, personal communication]. Formulas for the NMBF and NMEF are provided in Additional file 1. Bias and precision values from CV were evaluated overall, and by region of the country, urban land use, season, monitoring network, and monitoring objective.

Model development and covariate selection
For each model, we first fit a 'base' model using the following covariates based on our earlier work [11][12][13]: distance to nearest road for U.S. CFCC road classes A1-A3, smoothed county-level population density, urban land use within 1 km, elevation, point-source emissions density within 7.5 km (of PM 2.5 emissions for the PM 2.5 models and PM 10 emissions for the PM 10 model), smoothed monthly average wind speed, temperature, total precipitation, and air stagnation. The 1988-2007 PM 10 model also included tract-level population density. To ensure a parsimonious model specification, we then removed each timevarying term to evaluate its contribution and kept in the model only those that improved predictive accuracy (using the 'base' set of covariates in the second-stage model). Using the remaining time-varying covariates, we then added or substituted GIS-based time-invariant spatial covariates into the second stage of the model, selecting the model with the highest spatial CV R 2 , after removing those not statistically significant (p>0.05) per the result of Wald tests [38]. As in prior work, only those covariates expected a priori to have a positive or negative physical influence on PM levels were considered for inclusion. For example, increasing wind speed (a proxy for the amount of vertical mixing in the atmosphere) was expected to result in decreased PM 2.5 concentrations due to dilution of pollutant emissions. Non-linearity in covariate effects was accounted for using penalized spline terms, with the 'sp' argument to gam() used to limit each penalized spline terms to at most six degrees of freedom (df). Similarly, we used the 'sp' argument to gam() to evaluate reducing the df of the spatial term in the second stage of the 1999-2007 PM 2.5 model.

Results
Spatial patterns in model predicted PM 2.5, PM 10 Table S1 overall, by region, and by network for the 1999-2007 and 1988-1998 time periods. Generally, PM 2.5 levels were highest in southern California, and were elevated across the eastern as compared to western U.S. PM 10 and PM 2.5-10 levels were also highest across the Southwest and Central Plains regions (presumably due to greater contributions from windblown dust than in other areas), and were generally more spatially variable than PM 2.5 . Areas of higher elevation had generally lower predicted PM 2.5 and PM 10 levels. Increases in model predicted PM levels in areas with higher urban land use are also evident, especially for PM 2.5 .
At the spatial resolution (6 km) shown in Figures 2, 3 and 4, it is not possible to discern the micro-and middlescale impacts of the distance to road covariates, though they are evident in Figures 5, 6 and 7, which display model predicted PM 2.5 , PM 10 , and PM 2.5-10 concentrations on a 30 m grid in a selected area of New York City, New York for August 2006. Of note, sharp gradients in tract-level population density in this area together with the decreasing smooth function for this covariate result in several somewhat abrupt changes in predicted PM 10 and therefore also predicted PM 2.5-10 .
Maps of the mean standard errors of monthly PM 2.5 , PM 10 , and PM 2.5-10 model predictions for the conterminous U.S. are shown in Additional file 1: Figures S4-S6. Though the spatial patterns in the mean of the standard errors (with higher values corresponding to greater average uncertainty in monthly model predictions) for each PM size fraction are similar to the corresponding spatial pattern in mean model predictions, standard errors from the 1999-2007 PM 2.5 model are comparatively higher than model predictions in the Central Plains region (in eastern Kansas, for example), and in northwestern Nevada. Also of note, the magnitude of the standard errors from the 1988-1998 PM 2.5 model is generally greater than that from the 1999-2007 PM 2.5 model, reflecting uncertainty related to the estimation of the PM 2.5 :PM 10 ratio and, separately, of PM 10 levels. A map of the mean predicted PM 2.5 :PM 10 ratio across 1988-1998 is presented in Figure 8. The estimated ratio is generally higher in the eastern as compared to the western U.S., though areas of the Northwest region are also higher as compared to the rest of the western U.S.

CV results
Results from CV for 1999-2007 for PM 2.5 , PM 10 , and PM 2.5-10 are presented in Table 1; for 1988-1998 they  Table 2. Overall and when stratified by region, spatial CV R 2 values were higher than those for corresponding monthly values for PM 2.5 , PM 10 , and PM 2.5-10 across 1999-2007 and 1988-1998 (Tables 1  and 2, respectively). CV statistics by season, tertiles of urban land use, monitoring network, and monitoring objective are presented for each of the two time periods above in Additional file 1: Table S2 for PM 2.5 , PM 10 , and PM 2.5-10 . For both time periods, predictive accuracy was generally consistent across tertiles of urban land use, monitoring network, and monitoring objective for PM 2.5 , PM 10 , and PM 2.5-10 . Also model predictive performance was consistent across seasons, though generally slightly lower in the winter season as compared to other seasons. Density scatter plots of monthly measured vs. model predicted PM 2.5 , PM 10 , and PM 2.5-10 levels from CV are shown in Additional file 1: Figure S2.
CV results for 1999-2007 PM 2.5 Across the conterminous U.S., predictive accuracy of the 1999-2007 PM 2. 5 model was high (CV R 2 =0.77) at the monthly average level, though lower in the Northwest at 0.50. Across regions, model predictions exhibited low bias and high precision (NMBF of −1.6% and NMEF of 14.3%, respectively), but were less precise in the west (Southwest, Northwest, and Central Plains regions). Standard errors in monthly PM 2.5 model predictions were reasonably well-scaled (prediction interval coverage of 0.98). The model predicted long-term spatial trends very well (spatial CV R 2 =0.89).
PM 10 Across the conterminous U.S., predictive accuracy for PM 10 monthly model predictions was moderate (CV R 2 =0.58) for 1999-2007, though lower in the Southcentral, Northwest, and Central Plains regions (>0.45). Across regions, we found low bias in model predictions but only moderate precision (NMBF of −5.1% and NMEF of 24.4% across regions, respectively). Standard errors were reasonably well-scaled for the PM 10 model (prediction interval coverage (across 1988-2007) of 0.97). The model predicted long-term spatial trends well (spatial CV R 2 =0.69). PM 2.5-10 Across the conterminous U.S., predictive accuracy for PM 2.5-10 was moderate (CV R 2 =0.52). Across regions, we found low bias but poorer precision than for PM 2.5 or PM 10 (NMBF of −3.2% and NMEF of 38.9%, respectively). In the Southcentral region, bias in PM 2.5-10     low bias but only moderate precision (NMBF of −3.3% and NMEF of 21.8% across regions, respectively).

PM 2.5-10
Predictive accuracy was moderate for PM 2.5-10 (CV R 2 =0.46 across regions). Across regions, model predictions again exhibited low bias but precision was poorer than for PM 2.5 or PM 10 during the same time period (NMBF of −4.5% and NMEF of 42.5%, respectively). In the Northeast region, bias in PM 2.5-10 monthly values was slightly larger and negative (NMBF of −14.6%), whereas in the Northwest region it was also larger but positive (NMBF of 27.0%). Predictive accuracy was also substantially lower in the Southeast region (CV R 2 =0.12). Interestingly, this decrease in predictive accuracy does not appear to be related to the lower levels of measured PM 2.5-10 in the Southeast region; by contrast the levels in the Northwest region are comparable (Additional file 1: Table S1) but predictive accuracy in this region was not markedly reduced (CV R 2 =0.54).

Model covariate effects 1999-2007 PM 2.5 model covariates
Several GIS-based time-invariant spatial covariates were found to be important predictors in the 1999-2007 PM 2.5 model, including: elevation, urbanized land use within 1 km, county-level population density, distance to nearest A1, A2, and A3 roads, and point-source emissions density within 7.5 km. We found significant interactions by region in the effects of two GIS-based time-invariant spatial covariates: urban land use within 1 km and elevation.
For urban land use within 1 km, regional effects in the Midwest, Southeast, Northwest, and Central Plains regions were significantly different from the remaining regions. The estimated smooth functions for this covariate, from the 1999-2007 PM 2.5 model, showed that it was generally associated with increasing PM 2.5 (after adjusting for other model covariates), with the pattern varying slightly by region (Additional file 1: Figure S1 panel A5).
For elevation, regional effects in the Southwest, Northwest, and Central Plains regions were significantly different from the remaining regions. Increasing elevation was generally associated with decreasing PM 2.5 , with the effects varying substantially by region, especially in the Northwest region (Additional file 1: Figure  S1 panel A2). Though not visible in Figures 2, 3, 4, 5 and 6, regional covariate effects resulted in small spatial discontinuities at regional boundaries in monthly prediction surfaces.
Surprisingly, traffic density within 100 m performed slightly worse than distance to road covariates (A1-A3). This may have resulted from poorer spatial accuracy of the network of roads used by the NHPN as compared to the ESRI StreetMap Pro 2007 road network. Distance to the nearest A4 road did not increase predictive accuracy and was removed from the 1999-2007 PM 2.5 model.
Increasing county-level population density was positively associated with measured PM 2.5 levels, as was increasing point-source emissions density within 7.5 km (Additional file 1: Figure S1 panels A6 and A7, respectively).
As expected due to dilution and wet deposition processes, respectively, increasing levels of wind speed and total precipitation had consistent negative effects on PM 2.5 levels in each of the seven regions (with the exception of wind speed in the Midwest). The effect of temperature on PM 2.5 levels differed slightly by region (Additional file 1: Figure S1 panel A1), although PM 2.5 levels generally decreased with increasing temperature. We hypothesize that this counterintuitive result may be due to cold temperatures acting as a proxy for local wood smoke emissions and less mixing in the atmosphere. In contrast, during warm seasons, higher PM 2.5 levels due to increased photochemical production of secondary aerosol result in a less spatial variability in PM 2.5 which is better captured by the monthly intercept and monthly spatial smooth terms in non-winter seasons as compared to in winter. We also note that the moderate correlation between temperature and air stagnation (Pearson's r = 0.69) may interfere with direct interpretation of the effect of temperature alone. Air stagnation was found to improve predictive accuracy in only the Midwest and Southeast regions, with increasing stagnation associated with increasing PM 2.5 levels (Additional file 1: Figure S1 panel A11), though in other regions, especially the southwest, it was inversely associated with PM 2.5 levels.
The second-stage spatial term g(s i ) exhibited substantial complexity in the 1999-2007 PM 2.5 model, using 501.6 df. In contrast, the monthly spatial terms g t,r (s i ) used fewer df (median across region and months of 22.7).

1988-1998 PM 2.5 model covariates
For the 1988-1998 PM 2.5 model, only predicted PM 10 and elevation remained in the model as spatial covariates. However, the same four meteorologic covariates as for the 1999-2007 PM 2.5 model were included in this model. Their effects were similar, except for that of total precipitation where the ratio increases and then decreases, reflecting the complexity of differential wet deposition processes for fine and coarse mode particles. We found a significant interaction by region in the effect of elevation, with the effect in the Northwest region significantly different from that in the remaining regions (Additional file 1: Figure S1 panel B2).
The second stage spatial term g(s i ) exhibited substantial complexity, using 503.3 df; the seasonal spatial terms g Seas,r (s i ) used fewer df (median of 174.1 across regions and seasons), indicating greater residual spatial variability in the seasonal (natural-logged) PM 2.5 PM 10 ratio than in the monthly spatial terms from the 1999-2007 PM 2.5 or 1988-2007 PM 10 models. For the 1988-2007 PM 10 model, we found significant regional interactions only for urban land use within 1 km, with effects in the Northeast, Northwest, and Central Plains regions different from that in the remaining regions. The estimated smooth functions for this covariate showed that it was generally associated with increasing PM 10 (after adjusting for other model covariates), with the pattern varying slightly by region (Additional file 1: Figure S1 panel C4). Tract-level population density was negatively associated with measured PM 10 levels (Additional file 1: Figure S1 panel C8).

1988-2007 PM 10 model covariates
The second-stage spatial term g(s i ) exhibited substantial complexity in the 1988-2007 PM 10 model, using 882.6 df. In contrast, the monthly spatial terms g t,r (s i ) used fewer df (median across regions and months of 20.1).

Modeling assumptions
Our modeling approach assumes stationary and isotropic spatial variation, that covariate effects are additive, and that model residuals are independent and normally distributed, with mean zero and constant variance. We evaluated the assumption of stationarity in the second stage spatial term in alternative second stage models that allowed the smoothing parameter to vary across the domain (adaptive bases), including those that allowed stationarity to vary by urbanness, but these did not substantially change model fit nor increase predictive accuracy. We also evaluated whether the effects of the GIS-based timeinvariant spatial covariates (other than urban land use) varied with urbanness by stratifying by tertiles of urban land use within 1 km; we found no evidence of differential covariate effects by urbanness. Finally, we evaluated temporal autocorrrelation in model residuals; the resulting plots are provided in Additional file 1: Figure S3. Though the plot for the 1988-1998 PM 2.5 model residuals shows only limited evidence of autocorrelation, plots for the other two models show evidence of modest autocorrelation at a lag of one month and, to a lesser extent, seasonal dependence (at lags~12 months) not accounted for in the modeling.

Discussion
Our modeling approach provides predictions of monthly outdoor PM 2.5 , PM 10 , and PM 2.5-10 levels at any location within the conterminous U.S. with high spatial and temporal (i.e., monthly) resolution over a 20-year period (1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007) This approach gives our models the ability to account for micro (<100 m) , middle (100-500 m), neighborhood (500 m-4 km), and urban (4-50 km)-scale spatial gradients as well as larger-scale regional effects that vary over time. Further, this approach has the added benefit of straightforward interpretation of covariate effects on predicted PM levels, albeit where not obscured by collinearity or concurvity. Since model predictions can be made at a subject's residence or other relevant point location, rather than interpolated from a pre-defined grid, our models offer high spatial resolution which may reduce exposure error when estimating chronic exposures in epidemiologic studies, as has been shown in previous analyses [11,25,26]. The models have been used to provide PM 2.5 , PM 10 , and PM 2.5-10 monthly exposure estimates at subject residences in recent epidemiologic analyses [47,48]. Of the covariates evaluated for inclusion in the three models, several were found to be important predictors in each of the three models: wind speed, air temperature, total precipitation, air stagnation, and elevation. Also, the 1999-2007 PM 2.5 and 1988-2007 PM 10 models both included county population density, point-source emissions density (for the corresponding PM size fraction), distance to nearest road for road classes A1-A3, and urban land use within 1 km. Also, in the 1999-2007 PM 2.5 and 1988-1998 PM 2.5 models, we found regional variation in the effects of elevation, and, in the 1999-2007 PM 2. 5 and 1988-2007 PM 10 models, of urban land use within 1 km. The robustness of our findings may be due to our covariate selection procedures which were performed using the fully specified spatio-temporal model, allowing for residual spatial trends and changing covariate effects, including potential nonlinearity in those effects, to compete with each candidate covariate, in contrast to approaches where covariate selection is based on multiple linear regression before spatial modeling is performed. Su et al. [49] used a more complicated variable selection approach, but one that may lead to over-fitting and that is not practical for models with large geographic and temporal scopes such as ours, with approximately 125,000-250,000 observations and run times for one model fit of between 24 and 96 hours. Kloog et al. [22] and Sampson et al. [16] have described attractive alternative approaches, which allow for the inclusion of large numbers of covariates while shrinking their effects, but these approaches also increase model complexity and may thus not be practical for models applied to the entire conterminous U.S. that span many years of monthly data.
Spatial trends in long-term (1999-2007) mean PM 2.5 levels from our modeling approach, presented in Figure 2, are broadly similar to those in a recent spatial analysis of annual-average PM 2.5 levels in the year 2000 [16] and to those in our earlier work in the Northeastern and Midwestern US [11][12][13]. It is possible that with additional covariates, such as satellite-derived AOD measures [19][20][21][22][23][24], model predictive accuracy (i.e., CV R 2 ) may improve, especially in areas far from monitors [24]. Although models have been developed that incorporate satellitederived measures, to date there have been limited comparisons to GIS-based spatio-temporal models. For example, Lee et al. [24] used satellite-derived AOD data in combination with a low spatial resolution (2°× 2.5°) global 3-D chemical transport model (GEOS-Chem) to estimate PM 2.5 levels in the conterminous U.S., but compared it to a kriging model without geographic or meteorological covariates that could explain small-scale spatial variability.
Paciorek et al. [18] compared hierarchical spatio-temporal models that included geographic and meteorological covariates with satellite-derived AOD vs. those without, but only in mid-Atlantic region of the U.S., at the monthly time scale, and over one year: 2004. These models had high predictive ability, but inclusion of AOD did not improve predictive accuracy (monthly CV R 2 =0.827 without AOD and 0.825 with calibrated Moderate Resolution Imaging Spectroradiometer or Geostationary Operational Environmental Satellite AOD). More recent studies demonstrate the utility of daily as opposed to monthly satellite-derived AOD measures in New England and the mid-Atlantic states [21,22], reporting yearly CV R 2 values of 0.83 and 0.81, respectively. However, these models cannot be used to predict PM levels before the year 2000, given that they require satellite-derived AOD data that are not available before that time period. Given the air quality monitoring, meteorological, geographic, and other data available from 1988-2007, our modeling approach provides a reasonable balance of computational feasibility (using standard software) and complexity while representing the small-and large-scale spatial, temporal, and spatio-temporal features of the data.