Study population and locations
Data from all births in the US were obtained from the National Center for Health Statistics (NCHS, https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm) for the years 1982–1988, years for which location and exact birth date are publicly available. A total of 25,328,335 births occurred during 1982–1988. We selected births from the 50 most populous metropolitan statistical areas (MSA), accounting for 52% of all births in the US. Delineation of MSAs was based on the 2010 Census. The 50 MSAs contained 416 counties. 20% of births were in the two most populous MSAs (Los Angeles-Long Beach-Santa Ana, CA and New York-Northern New Jersey-Long Island, NY-NJ-PA); because of the disproportionate number of births in those two MSAs, we split them into smaller areas. Los Angeles-Long Beach-Santa Ana, CA was split into two parts (Los Angeles county and Orange county, Figure S1), and New York-Northern New Jersey-Long Island, NY-NJ-PA was split into three parts (Long Island areas, New York City area, and other New York-New Jersey counties, Figure S2). Thus, there were 53 MSA/sub-MSAs (Fig. 1) included in the study; throughout, we refer to these MSAs and sub-MSAs as the “study locations.” The study protocol was approved by the University of Nevada, Reno Institutional Review Board (IRB#: 1164285–8).
Exclusion criteria
Fixed cohort bias can occur in retrospective birth cohort studies that define inclusion based on birth dates within a fixed start and end date, which means shorter pregnancies are missed at the start of the study, and longer pregnancies are missed at the end [33]. To avoid this bias, we defined our study population by conception dates and limited our study population to women whose last menstrual period (LMP) dates were between October 1, 1981 and February 29, 1988 to ensure that both the shortest and the longest pregnancies were included. For comparison with previous studies, we focussed our analysis on the warm season [9, 16]. Analogous to the fixed cohort bias described above, selection bias might also occur in this setting when shorter or longer pregnancies, which could have been exposed to heatwaves and were at risk for preterm birth, are excluded as a consequence of limiting the analysis to babies born in the warm season. To prevent this potential bias, we included all births whose at-risk window (28 weeks and 0 days − 36 weeks and 6 days for preterm and 37 weeks and 0 days-38 weeks and 6 days for early-term) overlapped with the warm season (May 1 – September 30) for at least 1 day (calculation illustrated in Figure S3). This corresponded to yearly conception dates between August 16 and March 18 (of the following year) and August 2 and January 14 (of the following year) for preterm and early-term birth, respectively (selection process and study flow diagram illustrated in Figure S4). In addition, we limited the analysis to singleton births.
Meteorological data
We obtained 1 km × 1 km gridded estimates of daily meteorological parameters from Daymet [34] supported by NASA through the Earth Science Data and Information System (ESDIS) and the Terrestrial Ecology Program. Daymet uses surface meteorological observations in a spatial interpolation algorithm that also incorporates elevation, solar radiation, and precipitation factors to provide the gridded estimates. It provides seven surface weather parameters (minimum temperature (Tmin), maximum temperature (Tmax), precipitation, shortwave radiation, vapor pressure (VP), snow water equivalent, and day length) for North America. For this study, we calculated county-level daily Tmin, Tmax, and VP by averaging all grid estimates within a county in the US for each day between January 1, 1982 and December 31, 1988. Daily mean temperature (Tmean) was the mean of daily Tmin and Tmax. Because relative humidity is a qualitative indicator of moisture in the atmosphere that changes with ambient temperature, we instead use dew-point temperature, which is a quantitative measure of the amount of moisture. To provide a humidity value, dew-point temperature was calculated from Tmean and VP [35, 36]. We also calculated apparent temperature (AT) metrics including ATmin, ATmax, ATmean as follows [30]:
$$ \mathrm{AT}=-2.653+0.994\times {T}_a+0.0153\times {T}_d^2, $$
(1)
where Ta is ambient temperature (minimum, maximum, or mean) and Td is mean dew-point temperature.
For study locations comprised of more than one county, we calculated daily meteorological variables for each location by linking county-level population size obtained from the 1980 Census (https://www2.census.gov/programs-surveys/popest/tables/1980-1990/counties/totals/e8089co.xls) to the meteorological data and calculating population-weighted mean of the county-level estimates within a location.
Heatwave definitions
We used two major heatwave definition frameworks (Fig. 2) from the existing literature to identify hot days in each location over the study period. One framework (Arrow 1, Fig. 2) is referred to as the relative temperature threshold. Because climate differs across study locations, we set temperature thresholds specific to each study location [37]. Since this was a national study with a relatively large sample size, we were able to examine the impact of more extreme heatwave events. Therefore, we used the 97.5th percentile [20, 38,39,40] (denoted as T97.5) as the cut-off value for six temperature metrics: Tmin, ATmin, Tmax, ATmax, Tmean, and ATmean.
The other framework (Arrow 2, Fig. 2) is the excessive heat factor (EHF) [25], which is derived from the temperature metrics using formulas 2–4. In addition to performing these calculations using mean air temperature, we also performed them using Tmin, ATmin, Tmax, ATmax, and ATmean. We used the relative threshold recommended by Nairn and Facett to define extreme heatwave (85th percentile of all positive EHF, denoted as EHF+ 85). Specifically, the calculations are given as
$$ EH{I}_{sig}=\frac{T_i+{T}_{i-1}+{T}_{i-2}}{3}-{T}_{95}, $$
(2)
where \( \frac{T_i+{T}_{i-1}+{T}_{i-2}}{3} \) is the 3-day average of daily temperature and T95 is the 95th percentile (location-specific) of daily temperature for the climate reference period 1982–1988;
$$ EH{I}_{accl}=\frac{T_i+{T}_{i-1}+{T}_{i-2}}{3}-\frac{T_{i-1}+{T}_{i-2}+\cdots +{T}_{i-30}}{30}, $$
(3)
where EHIaccl is the difference between 3-day average temperature and 30-day average temperature; and
$$ EHF= EH{I}_{sig}\times \max \left(1, EH{I}_{accl}\right), $$
(4)
where the unit of EHF is °C2.
Heatwave indicators were linked to women based on their MSA codes and gestational exposure window. Using the above 12 indicators for hot days (6 temperature metrics × 2 thresholds: T97.5 or EHF+ 85), we operationalized heatwave exposures in each location on each day in three different ways (Fig. 2):
-
1)
HW1: focuses on the total number of single hot days in the previous week (sum of the number of days exceeding the threshold (ranging from 0 to 7), categorized as 0, 1, 2, and 3 or more days, 3–7 days were grouped together to gain more precision;
-
2)
HW2CX: emphasizes the impact of sustained heat, indicator variables for ≥x consecutive (denoted as subscript CX) hot days in the past week (choices of x are 2, 3, or 4; separate models were used to estimate associations with birth outcomes);
-
3)
HW3: area under the curve (AUC) during the previous week is a continuous metric incorporating both intensity and duration, calculated by subtracting the T97.5 threshold value from the daily temperature (or the EHF+ 85 value from the daily EHF) and averaging these daily differences over the last week. If the 7-day average was less than 0, the AUC was set to 0. AUC values were analyzed as a non-negative continuous variable.
Statistical analysis
We used a 1:1 matched case-control design with cumulative sampling [41], which enabled us to align the exposure window in the case (the 7 days preceding birth) to the corresponding gestational week in the control. We selected controls from births that were > 36 weeks and > 38 weeks by the time of delivery for preterm and early-term birth cases, respectively. Matching factors were maternal race/ethnicity and education, which are strong predictors of preterm birth and early-term birth, as well as location. Since all subjects in the same location would be assigned the same exposure on a given day, the exposure contrasts in the study were temporal. Individuals with missing information on these factors (maternal race/ethnicity and education) were also matched; all individuals included in the analysis had complete data for location and gestational age. Odds ratios (OR) for HW1, HW2c2, HW2c3, HW2c4, and HW3 were estimated using conditional logistic regression adjusting for maternal age (≤20, 21–34, and ≥ 35 years) [14, 21], marital status (married and unmarried), and LMP month and year (Eq.5). To retain every sampled pregnancy in the analysis, we created an indicator variable for missing values of marital status (0.03% of the sample). The exposure period for each case was the 7 days leading up to birth; the same 7-day gestational period was used as the exposure period for each matched control. Detailed model specifications for each heatwave indicator are presented in Table S1. HW1 was modeled as a categorical variable (0 (reference), 1, 2, and 3 or more days (Table S1); to test for a linear trend, these categories were modeled using a linear term (Table S1); HW2CX was modeled as a binary variable (1 = there was a ≥ X (where X = 2, 3, or 4) consecutive-day heatwave in the past week, 0 = otherwise, Table S1); and HW3 was modeled as a continuous variable (°C for temperature or °C2 for EHF, Table S1) .
To examine possible effect modification by maternal characteristics on the associations of heatwaves with preterm and early-term births, we conducted analyses stratified by maternal education (< 12 years, 12 years, and ≥ 13 years), maternal race/ethnicity (Hispanic, non-Hispanic white, and non-Hispanic black), child sex (male and female), and region (Northeast, Midwest, South, and West) and tested interaction by adding an interaction term with heatwave indicators to our primary model (Table S1). To examine the robustness of our results, instead of matching on location and adjusting for LMP month and year (as was done in the primary analysis), we instead matched on LMP month and year and adjusted for location. In addition, we conducted a sensitivity analysis restricted to complete observations (i.e., excluding observations missing matching factors or covariates). All statistical analyses were conducted using SAS (SAS version 9.4; SAS Institute Inc., Cary, NC).