Study design and population
We applied case-crossover analysis to assess the association between temperature and counts of daily hospital admissions and ED visits for heat stress, dehydration, acute kidney failure (AKF), and cardiovascular diseases (CVD) in NYS in the summers of 2008 through 2012.
Health outcomes
Daily summer hospital admissions and ED visits (May – September) in NYS were acquired from the New York State Department of Health’s Statewide Planning and Research Cooperative System (SPARCS) inpatient and outpatient datasets. SPARCS contains billing and medical abstract information for each hospital inpatient stay and outpatient (ambulatory surgery, ED, and outpatient services) visit in the state; except those at Federal and Veterans Administration hospitals [13]. Outpatient visits and ambulatory surgeries were excluded from the CVD analysis as these were unlikely to arise from acute heat exposures. The New York State Department of Health (NYSDOH) Institutional Review Board and Data Protection Review Board approvals were obtained to access individually identifiable information such as address, date of birth, and date of hospital visit. For each outcome, we combined both hospital admissions and ED visits for data analyses due to small sample size. In total, four health outcomes were assessed: heat stress (n = 8703), dehydration (n = 59,828), AKF (n = 50,008) and CVD (n = 845,927). The following international classifications of diseases (ICD, revisions 9) were used for heat stress (ICD-9, 992.0–992.9; E900.0, E900.9), dehydration (ICD-9, 276.51), AKF (ICD-9, 584.5–584.9), and CVD (ICD-9, 390–398, 401–405, 410–417, 420–438, 440–448, 451–459). For acute conditions such as heat stress and dehydration, any recurrences of ED visits and/or hospitalizations within a 1-week period were excluded, with recurrences after that period considered as new events. While for chronic conditions such as AKF and CVD, recurrences within a 28-day period were excluded. Residential addresses obtained from hospital admission records were geocoded at street-level and assigned longitude and latitude coordinates, using NYS Street Address Mapping (SAM) [14], MapMarker® [15], and ArcGIS® [16].
Air temperatures
Reanalysis dataset
The NLDAS dataset integrates a large quantity of observation- based and model reanalysis data executed at a grid surface with ∼14 km (1/8th degree) resolution over North America. The non-precipitation land-surface forcing fields for NLDAS are originally derived from the analysis fields of the National Centers for Environmental Prediction (NCEP) North American Regional Reanalysis (NARR). NARR analysis fields have a 32-km spatial resolution and 3-hourly temporal frequency. The NARR fields that are used to generate NLDAS meteorological fields are spatially interpolated to the finer resolution of the NLDAS 1/8th-degree grid and then temporally disaggregated to the NLDAS hourly frequency [5]. The NLDAS hourly data used in this study were acquired as part of the mission of NASA’s Earth Science Division and archived and distributed by the Goddard Earth Sciences (GES) Data and Information Services Center (DISC). This hourly data slightly differs from the temperature data typically reported from an air monitor station which has only minimum and maximum temperature capabilities.
From the NLDAS hourly data, we derived daily maximum temperature (Tmax), minimum temperature (Tmin), and mean temperature (Tmean) data (all in °F), and a daily maximum heat index (maxHI) product, which reflect the combined effects of heat and humidity [8]. For a description of how the maximum heat index was calculated, please refer to Additional file 1: Appendix A. The NLDAS data consists of 103,936 grid cells covering the entire U.S., excluding Alaska and Hawaii. For this study, 1040 grids covering NYS were used (Fig. 1). In the absence of information on the actual place of exposure, the temperature metrics in this gridded dataset were assigned based on the grid that included the patient’s residential address. This dataset was then spatially merged with the datasets containing fine particulate matter (PM2.5) and ozone estimates using the geographic coordinates of the patients’ residences. Lagged exposure values were created for all models to assess the immediate (on admission day, i.e. lag 0) and delayed (exposure up to 7 days before admission i.e. lag 1–7 respectively) effects of the temperature metrics on the health outcomes. The temperature effect at different lags were modeled separately. Effect of cumulative exposures were assessed using moving averages for multiple lags. We also used a distributed lag structure where multiple lags were included simultaneously in the model.
Monitor dataset
Automated Surface Observing System (ASOS) units are automated sensor suites that are designed to serve meteorological and aviation observing needs [17]. ASOS systems generate reports at hourly intervals and serve as a primary climatological observing network in the United States. We extracted daily temperature data from 32 ASOS units stationed across NYS. We used 100-mile and 20-mile buffer ranges around the 32 ASOS stations, to spatially join the heat stress cases (n = 8799) during the study period (2008–2012) with ASOS stations closest to each observation.
Air pollution data
Daily estimates of PM2.5 and ozone were derived using data from the Environmental Protection Agency’s (EPA) downscaler (DS) model [18]. The DS model fuses output from a gridded atmospheric model with point air pollution measurements. It combines air quality monitoring and modeling data to provide finer-scale predictions of air pollutant levels at local and community levels [19]. We categorized air pollutant estimates as low and high exposures using the 25th percentile and 75th percentile of the data distribution respectively.
Potential effect modifiers
Urban areas are typically warmer than rural areas due to urban heat island effects. However, we were interested in evaluating whether rural areas had a consequently lower risk of heat-health outcomes. The rural-urban commuting area (RUCA) codes which classify U.S. census tracts using measures of population density, urbanization, and daily commuting were used to categorize the rural-urban areas of NYS [20]. The most recent RUCA codes are based on data from the 2010 decennial census and the 2006–2010 American Community Survey. New York City (NYC) is by far the most densely populated, urbanized area in NYS. It also has a more diverse population with socio-demographic characteristics that may not represent other urban populations in NYS. In addition, NYC is serviced by its own local health department. Therefore, it was deemed appropriate to conduct a separate analysis for that area. The rest of NYS was divided into urban and rural census tracts based on RUCA codes.
We also assessed whether effects of heat vary by individual and socioeconomic characteristics. The hospitalization/ED data provided information on age, sex, race and ethnicity. Age was categorized into 6 categories. Race and ethnicity information in the medical records was classified as Hispanic, White Non-Hispanic, Black/Non-Hispanic and Other. To evaluate whether existing chronic illness modified effects of heat among cases with a primary diagnosis code of heat stress and dehydration, we flagged cases with pre-existing diabetes, chronic kidney disease, heart failure, other heart disease, hypertension, chronic liver disease, acute renal disease, acute myocardial infarction and atherosclerosis in “Other Diagnoses” fields.
Statistical methods
We conducted a case-crossover analysis using a semi-symmetric bidirectional, time-stratified design to assess the effect of summertime temperature on hospitalizations and ED visits in NYS. This method compares the temperature metrics on the day of hospital admission /visit (case/exposure day) with the temperature metrics, before and/or after (control period), within the same pre-specified stratum window of time, when the subject is not hospitalized or in the ED [21]. Since each case serves as his or her own control on all measured and unmeasured subject non-time-varying characteristics, confounding factors such as individual demographic and social factors have been adjusted by design [22]. A one-month stratum window of exposure was used to compare cases with the control (referent) period of ±7, ±14, or ± 21 days from case day within the same month, thereby providing up to four control dates per case. The season and day of the week variations were controlled for by restricting the referents to the same day of the week, month and year as the index day [23]. We controlled for time varying variables such as PM2.5 and ozone in the model.
The initial analysis was conducted in °Fahrenheit to conform to NWS criteria for issuance of heat warning. In this manuscript, all changes in risk based on temperature are reported in S.I. units (°C) to allow for comparison with previous research. Although analyses were carried out for maximum temperature, minimum temperature, mean temperature and maximum heat index, we focus on maximum heat index which is widely used by the NWS and heat-health researchers in the United States.
Heat-health associations and effect modification
We initially calculated risk ratios for a 1 °C change in temperature metric using conditional logistic regression analysis for each health outcome. Effect modification for age, sex, race/ethnicity, rural/urban areas, and month of exposure were determined a priori based on literature review and assessed using multiplicative interaction terms. The joint effect of ozone and PM2.5 on heat-health were also evaluated using interaction terms. There has been some debate about the adjustment for air pollution when assessing risk of health effects of heat [24, 25]. Therefore, we present results with and without adjustment for ozone and PM2.5. We assessed rural-urban differences in heat health associations. We assessed effects separately by NYC, non-NYC urban and non-NYC rural. Finally, we evaluated effect modification for effects on heat on heat stress and dehydration in individuals with pre-existing comorbidities using a stratified analysis. We conducted a supplementary analysis using monitor data linked to health observations using 20- and 100-mile buffer zones to compare with estimates derived from the spatially contiguous reanalysis data. We repeated the case-crossover analysis using monitor data and compared results with those from reanalysis dataset.
Heat-health threshold analysis
To establish regional heat advisory thresholds that are relevant for heat-health associations in the region, we used a piecewise linear spline regression to assess the shape of the temperature-outcome association. Knots defining slope changes were sequentially selected at 5-degree intervals. We calculated three separate trigger points as suggested by recent research [11, 26]. The minimum risk temperature (MRT) for heat stress was determined as the lowest temperature at which the health outcome was observed during the study period. For all other health outcomes, the MRT was defined as the lowest temperature above which a consistent increase in the relative risk was observed. The excess risk temperature (ERT) was defined as the lowest temperature above the MRT at which the lower bound of the 95% confidence interval of relative risk of a particular health outcome was greater than 1. We assessed the risk ratios at the existing NWS criteria of 37.8 °C (100 °F) and at the proposed criteria of 35 °C (95 °F).
All statistical analyses were carried out using SAS™ statistical software Version 9.4. (SAS Institute Inc., Cary, NC, USA) and R. Geographic Information System (GIS) analysis was carried out using SAS and ArcGIS® [16].