We used GIS-based indicators of local pollution sources and topography to systematically allocate 36 air monitoring sites across the metropolitan area during two seasons – June-August 2011 (summer) and January-March 2012 (winter). The same sites were sampled in each season, and within each season the monitors were distributed across six 5-day weekday sessions. We collected integrated samples of criteria pollutants, and derived seasonal averages using two reference monitors – one urban and one regional background. We tested the hypothesis that lower-elevation areas may experience higher pollution concentrations under inversion conditions[20, 32], and that these effects may vary by season.
Sampling instrumentation and laboratory analyses
We collected integrated samples of nitrogen dioxide (NO2), ozone (O3), sulfur dioxide (SO2), fine particulate matter (PM2.5), black carbon (BC), and constituents using portable ambient air sampling units originally designed for the New York City Community Air Survey[7]. Particle sampling instruments included a Dual Stage PM2.5 Harvard Impactor (Air Diagnostics and Engineering Inc.), with particulate matter collected onto 37 mm Teflon filters (PTFE membrane, 2 μm pores, Pall Life Sciences), and a HOBO data logger for relative humidity, temperature, and barometric pressure readings (Onset Computer Corporation). Battery-operated vacuum pumps (SKC, Inc.) moved ambient air through particle filters at a constant rate of 4 liters per minute, and pre- and post-sampling flow rates were recorded for data quality assurance. Passive gaseous samplers (Ogawa & Co. USA) were placed into weathertight shelters on the exterior of sampling units. Sampling instruments were housed in weathertight boxes, and mounted 3-4 meters above ground on utility poles, near the breathing zone.
PM2.5 and BC were measured solely during weekday morning rush hours and potential inversion hours, using a chrontroller (ChronTrol Corporation) to program the sampling units to simultaneously sample all locations (including reference sites) each weekday (Monday-Friday) from 6:00 AM to 11:00 AM. Deployment and retrieval schedules were aimed at minimizing differences in exposed time for passive badges between monitors and across sessions.
Teflon filters were pre- and post-weighed at the University of Pittsburgh, Department of Environmental & Occupational Health, in a temperature and relative humidity-controlled glove box (PlasLabs Model 890 THC) using an ultra-microbalance (Mettler Toledo Model XP2U) for total PM2.5 mass, and reflectometry for BC absorbance was performed using the EEL43M Smokestain Reflectometer (Diffusion Systems). Ogawa passive badges were analyzed at the University of Pittsburgh, Department of Geology & Planetary Sciences using water-based extraction and spectrophotometry (Thermo Scientific Evolution 60S UV-Visible Spectrophotometer) for NO2 ppb concentration. SO2 and O3 sample analyses are ongoing, and we do not report their results here.
Quality assurance and controls
To account for possible contamination, we used one laboratory blank and multiple field blanks each session for gases and particles, and co-located paired distributed monitors at four randomly-selected sites during one sampling session each season. PM2.5 pump flow rates were calibrated to 4.0 liters per minute (LPM) (temperature-adjusted based on weather forecasts) prior to deployment, and compared to post-collection rates. We verified program completion for each sampler run using the sampling unit program log.
Summer sampling was performed from July 25 to September 9, 2011 (the week of August 29 was skipped for logistical reasons), and winter sampling from January 16 to February 24, 2012. Across seasons, all PM2.5 samples met the acceptance criterion for pre- and post-collection flow rates (within 5% of 4.0 LPM). Instrumentation failure occurred at only one site, which was re-sampled during a later session. Co-located measures of PM2.5 and NO2 were highly correlated (rho = 0.93 and 0.97, respectively) across four monitoring locations. Field blanks for PM2.5 and NO2 ranged from 0.07-1.50 μg/m3 and 0.01-0.05 ppb, respectively, and were similar across seasons. Pollutant concentrations were field blank-corrected. Data completeness was 100% for PM2.5, NO2, and BC, with no statistical outliers (outside of mean +/- 3 standard deviations).
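The quality-assurance checks described above can be illustrated with a minimal sketch. The function names are hypothetical, and subtracting the mean field blank is an assumed correction rule, since the text does not specify how blank correction was computed.

```python
from statistics import mean, stdev

TARGET_LPM = 4.0   # target pump flow rate stated in the text
TOLERANCE = 0.05   # acceptance window: within 5% of the target

def flow_ok(pre_lpm, post_lpm):
    """True if both pre- and post-collection flow rates are within 5% of 4.0 LPM."""
    limit = TOLERANCE * TARGET_LPM
    return all(abs(f - TARGET_LPM) <= limit for f in (pre_lpm, post_lpm))

def blank_correct(concs, field_blanks):
    """Subtract the mean field blank from each concentration (assumed rule)."""
    blank = mean(field_blanks)
    return [c - blank for c in concs]

def outliers(values):
    """Return values lying outside mean +/- 3 sample standard deviations."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) > 3 * s]
```

Under these assumptions, a sample with flow rates of 4.1 and 3.9 LPM passes the flow check, while one with a pre-collection rate of 4.3 LPM does not.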
Study domain selection and characterization
We aimed to capture large industrial point sources, major roadways, and river valleys across an urban-to-suburban gradient of Allegheny County, within a feasible coverage area, extending at least 10 km Northeast of industrial point sources, with respect to the prevailing wind direction (West/Southwest). In a GIS, we fit a polygon to meet coverage and distance criteria, and selected intersecting contiguous census tracts, to enable subsequent merging of population indicators. Our domain stretched northwest of downtown Pittsburgh along the Ohio River, and southeast along the Monongahela River, covering approximately 500 km2, including 258 contiguous census tracts within Allegheny County, PA (Figure 1), and captured wide variability in population density: from 272 to 55,343 residents per km2[24]. Large industrial point sources within our domain include two coke smelting works (Neville Island and Clairton) and a steel mill (Braddock).
For purposes of sampling site selection, we explored spatial variability across a range of local source indicators, and potential modifiers of source-concentration relationships. Based on recent source apportionment of PM2.5 measurements collected at Allegheny County Health Department (ACHD) regulatory monitors, which attributed the majority of measured fine particles to local industrial and mobile sources[19], we developed GIS-based indicators of local industrial emissions and on-road vehicle traffic. Because traffic-related pollution varies within 50-200 m from roadways[33, 34], and because of steep elevation gradients in the Pittsburgh area, we used relatively small regular 100 m2 lattice grid cells to characterize the study domain according to three key local pollution indicators: (a) traffic density, (b) emission-weighted proximity to industrial point sources, and (c) topography. GIS-based analysis and mapping were implemented in ArcInfo, v10 (ESRI, Redlands, CA).
We evaluated multiple indicators of traffic emissions (e.g., proximity to roadways, heavy truck traffic), and decided on the most inclusive indicator – total on-road traffic density – to prevent biasing our study design toward one class of vehicle emissions. First, we created road-segment counts by summing total vehicles on major road segments plus an estimated 500-vehicle count on minor road segments (based on major road count distribution), using Pennsylvania Department of Transportation Annualized Average Daily Traffic (AADT) counts (2011)[35]. Using ArcInfo’s Spatial Analyst toolbox, we derived a continuous kernel traffic density surface by applying a Gaussian decay function to traffic counts on all road segments within our domain. From this traffic density surface, we calculated mean traffic density within each 100 m2 grid cell.
We created a multi-pollutant indicator of industrial emissions to prevent biasing our sampling design toward one pollutant or industry type. Using emissions data from the U.S. Environmental Protection Agency’s National Emissions Inventory[36], we first summed emissions mass, in tons, of multiple pollutants – PM2.5 (filterable and condensable), nitrogen oxides (NOX), sulfur dioxide (SO2), and volatile organic compounds (VOCs) – from reporting facilities in Allegheny County, PA. We then used inverse-distance interpolation to calculate an emission-weighted proximity to industry indicator for each 100 m2 grid cell centroid, drawing emissions information from facilities within an 80 km radial buffer threshold. Inverse-distance interpolation weights emissions values at locations in between facilities as a function of distance, such that relatively near facilities will have a greater influence than far facilities on local air quality.
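As an illustration of the inverse-distance weighting idea, the following is a minimal sketch and not the ArcInfo procedure used in the study. The function name, the planar kilometer coordinates, and the distance exponent (power = 2) are assumptions; only the 80 km buffer comes from the text.

```python
import math

def idw_emissions(cell_xy, facilities, radius_km=80.0, power=2.0):
    """Inverse-distance-weighted mean of facility emissions at a grid cell centroid.

    cell_xy    -- (x, y) of the centroid, in km (assumed planar coordinates)
    facilities -- iterable of (x, y, summed_emissions_tons)
    radius_km  -- facilities farther than this are ignored (80 km in the text)
    power      -- distance exponent (assumed; not stated in the text)
    """
    cx, cy = cell_xy
    num = den = 0.0
    for fx, fy, tons in facilities:
        d = math.hypot(fx - cx, fy - cy)
        if d > radius_km:
            continue                 # outside the radial buffer
        if d == 0.0:
            return float(tons)       # centroid coincides with a facility
        w = 1.0 / d ** power         # nearer facilities get larger weights
        num += w * tons
        den += w
    return num / den if den > 0 else 0.0
```

With this weighting, a cell 1 km from a large emitter and 9 km from a small one receives a value close to the large emitter's total, and the reverse holds for a cell near the small emitter.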
As there is no standard metric to demarcate ‘valley’ versus ‘non-valley’ areas, we opted to use continuous elevation above sea level to maximize spatial resolution and comparability with previous LUR studies[8, 37–39]. We calculated mean elevation within each 100 m2 grid cell from the U.S. Geological Survey National Elevation Dataset 30 m2-resolution raster data set[40]. Across sampling locations, elevation is correlated with distance-to-river-centerlines at rho = 0.67, supporting our interpretation of elevation as an indicator of river valleys, where cool air pools may exacerbate inversion formation. Furthermore, in our pilot mobile monitoring study, we found a strong relationship between elevation, atmospheric inversions, and PM2.5 and PM10 concentrations in one relatively low-lying Pittsburgh community (Braddock, PA)[32].
Distributed site selection & allocation
Across our study domain, the distribution of source indicators used for sampling site selection – traffic density, emission-weighted proximity to industrial facilities, and elevation – varied substantially (Figure 2); source indicators were not collinear (rho = -0.08 to -0.21, across all 100 m2 grid cells). We dichotomized each source indicator at the 70th percentile, and cross-stratified each 100 m2 grid cell across eight classifications, representing combinations of ‘high’ and ‘low’ source profiles (e.g., ‘low’ traffic density, ‘near’ industrial sources, low elevation ‘valley’). This dichotomization point was chosen based on left-skewed distribution of source indicators, to systematically over-sample hypothesized high-pollution areas.
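The dichotomization and cross-stratification step can be sketched as follows. This is an illustrative sketch only: the dictionary keys, the stratum codes, and the linear-interpolation percentile rule are assumptions, and 'H' here simply means "above the 70th percentile" for every indicator, including elevation.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0-100) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def cross_stratify(cells, cut=70):
    """Assign each grid cell a 3-letter stratum code such as 'HLH'.

    cells -- list of dicts with hypothetical keys 'traffic', 'industry', 'elevation'
    Letters are ordered (traffic, industry, elevation); 'H' = above the cut point.
    Three binary indicators give at most 2**3 = 8 cross-strata.
    """
    keys = ("traffic", "industry", "elevation")
    cuts = {k: percentile([c[k] for c in cells], cut) for k in keys}
    return ["".join("H" if c[k] > cuts[k] else "L" for k in keys) for c in cells]
```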
We used stratified random sampling without replacement to select 30 spatially-distributed monitoring grid cells across eight source indicator cross-strata, using Geospatial Modeling Environment software, v 0.7.2.0 (Spatial Ecology, LLC). Six additional grid cells were selected to fill spatial gaps in the periphery of our large domain. Specifically, three 30 km2 areas in which no cells had been allocated were selected in GIS, and two cells randomly selected from each. Rivers and riverbank areas (<20 m from a river’s edge) were not eligible for sampling site selection, for logistical reasons. Sample size was determined by available resources, domain size, logistical limitations, and precedent of 40 monitoring sites for urban LUR modeling[41, 42]. Figure 2 shows spatial allocation of distributed and reference monitoring sites, which were repeated in summer and winter.
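Stratified random sampling without replacement can be sketched as below. This is not the Geospatial Modeling Environment routine used in the study; the function name, the seed, and the per-stratum sample sizes are hypothetical, since the text does not report how the 30 cells were divided among the eight cross-strata.

```python
import random
from collections import defaultdict

def stratified_sample(cell_strata, n_per_stratum, seed=0):
    """Draw cells from each stratum without replacement.

    cell_strata   -- dict mapping grid cell id -> stratum label
    n_per_stratum -- dict mapping stratum label -> number of cells to draw
                     (hypothetical allocation)
    Returns a dict mapping stratum label -> list of selected cell ids.
    """
    rng = random.Random(seed)             # fixed seed for reproducibility
    by_stratum = defaultdict(list)
    for cell_id, stratum in sorted(cell_strata.items()):
        by_stratum[stratum].append(cell_id)
    chosen = {}
    for stratum, ids in sorted(by_stratum.items()):
        k = min(n_per_stratum.get(stratum, 0), len(ids))
        chosen[stratum] = rng.sample(ids, k)   # sampling without replacement
    return chosen
```

Cells that are ineligible (for example, riverbank cells) would be removed from `cell_strata` before the draw.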
Suitable locations to mount sampling units (e.g., utility or telephone poles) were identified near the centroid of selected 100 m2 grid cells by field teams, using consistent protocols. Mounting pole eligibility criteria included: no obstructions within 3 m of the monitors, street accessible, three or more meters from buildings, identifiable pole ownership (to obtain permissions), away from bus stops, and without overhanging tree branches. Latitude and longitude coordinates of selected mounting poles were pinpointed using GPS (Colorado® 400 t, Garmin), and verified in Google Earth™. A detailed site survey was conducted for each sampling location, to document relevant information potentially unavailable in GIS datasets (e.g., construction). Permissions to mount monitors on utility poles were obtained from Duquesne Light Co., Verizon, Inc., Allegheny County Parks Department, and the City of Pittsburgh Department of Public Works.
As sampling at the 36 sites was evenly allocated across six Monday-through-Friday sampling sessions (six sites sampled per session), we sought to balance source indicator strata and spatial distribution across sessions to avoid confounding spatial and temporal patterns in pollution concentrations. For each session, we used traffic density, the most spatially dispersed indicator, to draw a stratified-random sample (without replacement) of six sites (e.g., randomly allocate 3 ‘high’ and 3 ‘low’ traffic density sites per session). Because pollution source and topography indicators may be spatially clustered in Pittsburgh (i.e., industrial facilities located in low-elevation river valleys and/or near highways), we required spatial representation of four regions of our domain (i.e., east and west banks of the Monongahela River, northeast and southeast of downtown) within each session. Temporal allocation of sites across sessions was the same during winter and summer sampling seasons.
Reference monitors and temporal adjustment
We designated two reference sites, which were sampled during all sessions to provide information on overall temporal trends in air quality. First, an upwind reference site (Regional background site – Figure 2), located in a relatively rural area west of our domain, in Settlers Cabin County Park, Oakdale, PA, would provide information on regional background air quality. Second, a relatively urban reference site (Urban background site – Figure 2) within our domain, in Braddock, PA, was selected for comparison. The urban reference site is located in a low-elevation area, to capture topography-related inversion effects in seasonal air quality trends. We compared ACHD regulatory monitoring data to the weekly temporal patterns in NO2 and PM2.5 measured at study reference monitors, and found variable correlation between each reference monitor and the ACHD monitors (Spearman rho from -0.71 to 0.90 (mean = 0.23)). Figure 3 plots weekly PM2.5 and NO2 trends across ACHD regulatory monitors and study reference monitors (regional and urban background); regional and urban reference trends are variably correlated in both seasons (Spearman rho 0.04 to 0.91). As expected, regional background concentrations were consistently lower than urban reference site measurements, and lower than ACHD regulatory monitors. This difference is larger for NO2 than for PM2.5 in both seasons.
To facilitate comparison between site-specific concentrations, each collected during one of six sampling sessions, we apply a temporal adjustment that corrects distributed site samples for between-session variability primarily driven by time-varying meteorology or long-range transport, and yields seasonally representative mean values. Specifically, to estimate the expected, seasonally representative concentration at a given site – as if it had been sampled during an “average” week – the observed concentration is multiplied by the ratio of the seasonal average reference concentration to the session-specific reference concentration. As such, it is the relationship of the session to the seasonal average at the reference site(s) that determines the temporal adjustment, which can therefore adjust a distributed concentration to either a lesser or a greater value. These adjusted seasonal mean values allow for examination of spatial source-concentration relationships, with reduced influence of time-varying factors (i.e., meteorology, long-range transport). Because the appropriate reference trend for temporal adjustment may vary by season and/or pollutant, we evaluate two methods: one using only the regional background reference trend (Equation 1), and a second using the mean trend of the urban and regional background sites (Equation 2). Both of these approaches have been successfully applied in other studies of intra-urban air quality variability[7, 16].
Within each season, sampled pollutant concentrations were temporally adjusted as:
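$$[\mathrm{adjConc}]_{ij} = [\mathrm{Conc}]_{ij} \times \frac{[\mathrm{Ref}_{\mathrm{Regional}}]_{\mathrm{Season}}}{[\mathrm{Ref}_{\mathrm{Regional}}]_{j}} \qquad (1)$$

$$[\mathrm{adjConc}]_{ij} = [\mathrm{Conc}]_{ij} \times \frac{[\mathrm{Ref}_{\mu(\mathrm{Regional+Urban})}]_{\mathrm{Season}}}{[\mathrm{Ref}_{\mu(\mathrm{Regional+Urban})}]_{j}} \qquad (2)$$

where $[\mathrm{adjConc}]_{ij}$ is the temporally-adjusted pollutant concentration at monitoring site i during sampling session j, $[\mathrm{Conc}]_{ij}$ is the pollutant concentration at monitoring site i during sampling session j, $[\mathrm{Ref}_{\mathrm{Regional}}]_{j}$ is the regional background reference site concentration during sampling session j, $[\mathrm{Ref}_{\mu(\mathrm{Regional+Urban})}]_{j}$ is the mean concentration of the regional background and urban reference sites during sampling session j, $[\mathrm{Ref}_{\mathrm{Regional}}]_{\mathrm{Season}}$ is the seasonal average regional background reference site pollutant concentration, and $[\mathrm{Ref}_{\mu(\mathrm{Regional+Urban})}]_{\mathrm{Season}}$ is the mean seasonal average pollutant concentration of the regional background and urban reference sites.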
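The two temporal adjustment methods (regional-only, and mean of regional and urban references) can be illustrated with a short sketch. The function names are hypothetical, and the seasonal average is assumed here to be the simple mean of the session-specific reference values within a season.

```python
from statistics import mean

def adjust_regional(conc, session, regional):
    """Regional-only adjustment (Equation 1).

    conc     -- observed concentration at a distributed site
    session  -- key of the session in which the site was sampled
    regional -- dict: session -> regional background reference concentration
    """
    seasonal = mean(regional.values())   # assumed: mean over sessions in the season
    return conc * seasonal / regional[session]

def adjust_mean_ref(conc, session, regional, urban):
    """Adjustment using the mean of regional and urban references (Equation 2)."""
    ref = {j: (regional[j] + urban[j]) / 2 for j in regional}
    seasonal = mean(ref.values())
    return conc * seasonal / ref[session]
```

For example, if the regional reference read 8.0 in a given session and averaged 10.0 over the season, a site concentration of 10.0 measured in that session would be adjusted upward to 12.5, reflecting a cleaner-than-average week.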
Temperature inversions and meteorology
We identified probable morning inversion hours as 6:00-11:00 AM by examining: (a) meteorological sounding data, (b) hourly ACHD regulatory monitor data, and (c) pilot mobile monitoring study data[32]. We used meteorological sounding data (i.e., Skew-T diagrams) recorded daily at 7:00 AM from the Pittsburgh International Airport, approximately 25 km northwest of downtown Pittsburgh (Figure 1), to identify lapses in the vertical temperature gradient characteristic of inversion events. To confirm the number of inversion hours overlapping with sampling intervals (6:00-11:00 AM), inversion hours per event were evaluated using Bufkit 10.11, a forecast profile visualization and analysis software developed by the National Oceanic and Atmospheric Administration (NOAA) and National Weather Service[32]. Inversions were defined as two or more hours of inverted temperature gradient during sampling hours. Inversion frequency was operationalized as the number of inversion mornings per sampling session (1-4), and as a binary indicator (fewer than 3, vs. 3 or more days per session), based on the overall frequency distribution. Importantly, these characterizations are regional scale, and do not reflect the complex interactions between topography, surface thermal variability in urbanized areas (i.e., urban heat island effect), and pollution.
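The inversion definitions above can be expressed as a small sketch. The function names and the hourly boolean input format are hypothetical, and "two or more hours" is assumed to mean total (not necessarily consecutive) inverted hours within the sampling window.

```python
def is_inversion_morning(hourly_inverted, min_hours=2):
    """True if at least `min_hours` sampling hours show an inverted gradient.

    hourly_inverted -- one boolean per sampling hour (6:00-11:00 AM)
    """
    return sum(hourly_inverted) >= min_hours

def inversion_frequency(session_mornings):
    """Summarize one sampling session.

    session_mornings -- list of per-morning hourly boolean lists (Mon-Fri)
    Returns (number of inversion mornings, binary flag for 3 or more).
    """
    n = sum(is_inversion_morning(m) for m in session_mornings)
    return n, n >= 3
```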
Wind speed and direction influence local pollution concentrations through horizontal advection; however, the metrics that can elucidate spatial gradients in these processes are not well specified[43]. Wind speed and direction data measured at NOAA’s weather station at the Pittsburgh International Airport (and obtained from NOAA’s online National Climatic Data Center) were clipped to each sampling session, and used to generate wind rose diagrams (using Lakes Environmental WRPLOT View freeware) to examine within- and between-session variability. We then determined dominant wind direction and average wind speed (from any direction) for each sampling session. We compared wind speed and direction on inversion versus non-inversion mornings, in each season, to better understand the relationship between inversion conditions and local pollutant concentrations.
Statistical analysis
We calculated descriptive statistics for PM2.5, BC, and NO2 during each season, to identify potential outliers, and to compare temporally adjusted values by method (i.e., Regional-only vs. Urban + Regional). We examined pollutant concentration distributions across the pollution indicator strata used for site selection and allocation (traffic density, emission-weighted proximity to industry, and elevation above sea level) using Spearman correlation analysis, to account for non-normal distribution of pollution concentrations. We examined between-season differences using paired t-tests on log-transformed (base 10) concentrations, to account for non-normality of distributions, and compared results across temporal adjustment methods. We examined the relationship between log-transformed pollutant concentrations and inversion frequency, by elevation and temporal adjustment method. Further analyses of meteorological factors examined associations between temporally adjusted pollutant concentrations and within-session average wind speed (continuous and binary (median-stratified) measures), and dominant wind direction (e.g., West, Northwest). Statistical analyses were performed in SAS, v 9.2 (Cary, NC) and R statistical software v 2.12.1.
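For readers who want to see the two main tests spelled out, the following is a minimal standard-library sketch of a Spearman rank correlation and a paired t statistic on log10-transformed concentrations. It is not the SAS or R code used in the study; the function names are hypothetical, and the sketch returns only the t statistic and degrees of freedom, not a p-value.

```python
import math
from statistics import mean, stdev

def ranks(xs):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def spearman(x, y):
    """Spearman rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def paired_t_log10(summer, winter):
    """Paired t statistic and degrees of freedom on log10 concentrations."""
    d = [math.log10(s) - math.log10(w) for s, w in zip(summer, winter)]
    n = len(d)
    return mean(d) / (stdev(d) / math.sqrt(n)), n - 1
```

Here `summer` and `winter` would hold the temporally adjusted concentrations for the same sites, in the same order, so that each difference is a within-site seasonal contrast.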