This study received approval from the Centers for Disease Control and Prevention’s (CDC) Institutional Review Board.
Study population
Because computerized birth certificates in North Carolina became available in 1968 and the contaminated wells on base were shut down in 1985, we included live singleton births at 28–47 weeks of gestation, weighing ≥500 grams, that occurred between 1968 and 1985 to mothers who lived at Camp Lejeune at delivery [8]. By cross-referencing birth certificate data for Onslow County with Camp Lejeune housing records, we identified 11,896 births that met these criteria.
Data collection
Outcomes of interest in this study were preterm birth and fetal growth retardation as measured by reduced MBW, TLBW, and SGA; data regarding these outcomes were obtained from birth certificates. Preterm births were defined as births occurring at less than 37 weeks of gestation. Gestational age was calculated using the date of the mother’s last menstrual period (LMP) from the birth certificate. TLBW was defined as full-term babies (≥37 weeks gestation) weighing <2,500 grams at birth. For SGA births, three categorizations were evaluated: births weighing <5th and <10th percentiles based on sex- and race-specific weight by gestational week norms from New Jersey, and births weighing <10th percentile based on sex-specific growth curves for California [4, 9]. The New Jersey norms were determined using race- and sex-specific birth weights by gestational weeks for all singleton white and African American births in the state during 1985–1988. The California norms were based on sex-specific growth curves for white singleton births in the state from 1970–1976.
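The preterm and TLBW definitions above can be illustrated with a minimal sketch. The function names and the example dates and weight are hypothetical, and the sketch assumes gestational age is taken as days since LMP divided by seven; the rounding convention actually used in the study is not stated here.

```python
from datetime import date

PRETERM_WEEKS = 37       # preterm: < 37 weeks of gestation (from the text)
LOW_WEIGHT_GRAMS = 2500  # TLBW: full-term and < 2,500 g at birth (from the text)


def gestational_weeks(lmp: date, birth: date) -> float:
    """Gestational age in weeks, counted from the last menstrual period.

    Assumption: fractional weeks (days / 7), not completed weeks.
    """
    return (birth - lmp).days / 7


def classify_birth(lmp: date, birth: date, weight_g: float) -> dict:
    """Return the gestational age and the preterm / TLBW indicators."""
    weeks = gestational_weeks(lmp, birth)
    preterm = weeks < PRETERM_WEEKS
    return {
        "weeks": weeks,
        "preterm": preterm,
        # TLBW is defined only among full-term births.
        "tlbw": (not preterm) and weight_g < LOW_WEIGHT_GRAMS,
    }


# Hypothetical record: LMP 1 Jan 1980, birth 7 Oct 1980 (280 days), 2,400 g.
example = classify_birth(date(1980, 1, 1), date(1980, 10, 7), 2400)
```

In this hypothetical record the birth is full-term (40 weeks) and weighs less than 2,500 grams, so it is flagged as TLBW but not as preterm.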
Consent
Informed consent was not obtained from participants because this was a data-linkage study that did not involve contact with participants.
Exposure assessment
To assign exposures, we used address information collected from birth certificates, base family housing records, and water modeling results. Each month of residence was linked to estimated levels of contaminants in drinking water serving that location. We examined the following time periods: each trimester and the entire pregnancy. For each time period examined, births were categorized as unexposed if mothers did not reside at Camp Lejeune, if their residence at Camp Lejeune received uncontaminated drinking water, or if mothers were exposed for <1 week during that time period. A birth could be categorized as unexposed in the analysis of one trimester but as exposed in the analysis of a different trimester. However, if a birth was exposed in any trimester, then the birth was categorized as exposed in the analysis of the entire pregnancy.
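The period-specific classification rule can be sketched as follows. The input (days of residence with contaminated drinking water in each trimester) and all names are hypothetical. The sketch assumes the <1 week rule is applied to the total exposed days when the entire pregnancy is analyzed, which guarantees that a birth exposed in any trimester is also exposed for the entire pregnancy.

```python
MIN_EXPOSED_DAYS = 7  # exposure for < 1 week counts as unexposed (from the text)


def exposed_in_period(exposed_days: int) -> bool:
    """True if the mother was exposed for at least one week in the period."""
    return exposed_days >= MIN_EXPOSED_DAYS


def classify_pregnancy(days_by_trimester: dict) -> dict:
    """Classify each trimester and the entire pregnancy as exposed or not.

    days_by_trimester maps a trimester label to the number of days the
    residence received contaminated water (hypothetical input format).
    """
    status = {t: exposed_in_period(d) for t, d in days_by_trimester.items()}
    # Assumption: the whole-pregnancy analysis uses total exposed days, so any
    # trimester with >= 7 exposed days implies an exposed pregnancy.
    status["entire_pregnancy"] = exposed_in_period(sum(days_by_trimester.values()))
    return status
```

For example, `classify_pregnancy({"t1": 0, "t2": 3, "t3": 30})` treats the first two trimesters as unexposed and the third trimester and the entire pregnancy as exposed.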
Because historical, contaminant-specific measurement data were lacking, we conducted a historical reconstruction of contaminant levels in drinking water at Camp Lejeune. Modeling provided monthly average estimates of contaminant-specific concentrations in drinking water delivered to residences. The water modeling used extensive hydrogeological information as well as information on the sources of pollution, well pumping schedules, and the water distribution system of each of the treatment plants. Detailed information pertaining to the historical reconstruction was published in peer-reviewed reports [1, 2].
Data analysis
We used unconditional logistic regression in SAS 9.3 to individually compare the odds of preterm birth, TLBW, and SGA among the exposure categories [10]. We used linear regression in SAS 9.3 to compute MBW differences as indicated by the β coefficient. Reduced MBW among full-term babies was evaluated as a continuous variable by comparing birth weight differences by exposure categories. Unadjusted and adjusted odds ratios (ORs) and βs and their 95% confidence intervals (CIs) were calculated. We compared adjusted models to unadjusted models. In these comparisons, the unadjusted models only included births with complete data for the risk factor(s).
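For a single binary exposure, the unadjusted OR from logistic regression equals the cross-product ratio of the 2×2 table, and its Wald 95% CI can be computed directly on the log scale. The sketch below illustrates that calculation only; it is not the SAS code used in the study, and the cell counts in the usage example are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed cases        b: exposed non-cases
    c: unexposed cases      d: unexposed non-cases
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_hat, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

Adjusted ORs require fitting the full regression model and cannot be obtained from a single 2×2 table in this way.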
The following risk factors ascertained from birth certificates were evaluated for confounding: mother’s race, prenatal care, age of mother and father, parity, educational level of mother and father, sex of child, and whether the mother had a previous fetal death. “Adequate” prenatal care was assigned based on the Kessner index, which uses start of prenatal care, number of prenatal visits, and duration of pregnancy to determine adequacy [11]. We also evaluated military rank (obtained from the family housing records) as a potential risk factor; rank was a surrogate measure of socio-economic status. If any potential risk factors were highly correlated, we evaluated the risk factor that was more strongly associated with the outcome. Each risk factor was included in a model with the exposure variable; if adjusted results differed from unadjusted results by >10%, the risk factor was selected as a potential confounder [12].
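The 10% change-in-estimate screen can be written as a short check. This is a sketch under one stated assumption: the change is expressed relative to the unadjusted estimate. The function names and example values are hypothetical.

```python
def percent_change(unadjusted: float, adjusted: float) -> float:
    """Absolute change in the exposure estimate, as a percentage.

    Assumption: the unadjusted estimate is the denominator.
    """
    return abs(adjusted - unadjusted) / abs(unadjusted) * 100


def is_potential_confounder(unadjusted: float, adjusted: float,
                            threshold: float = 10.0) -> bool:
    """True if adjusting for the risk factor moves the estimate by > threshold %."""
    return percent_change(unadjusted, adjusted) > threshold
```

With hypothetical ORs, `is_potential_confounder(1.50, 1.30)` is true (a change of about 13%), whereas `is_potential_confounder(1.50, 1.45)` is false (about 3%).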
After all selected potential confounders were included in a model, a final model was determined using a backwards elimination process. At each step, we removed the potential confounder whose association with the outcome was closest to the null, and we continued until no factor could be removed without changing the estimate for the drinking water exposure by >10%. If there was no confounding by the risk factors, unadjusted models were presented.
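The backwards elimination process can be sketched generically. Several details here are assumptions rather than the study's procedure: `exposure_estimate` is a hypothetical callable that refits the model with a given set of covariates and returns the exposure OR, closeness to the null is measured as the absolute log OR of the confounder-outcome association, and the 10% change is judged against the model containing all selected potential confounders.

```python
import math


def backward_eliminate(candidates, exposure_estimate, confounder_or,
                       threshold=0.10):
    """Change-in-estimate backward elimination (illustrative sketch).

    candidates:        names of the selected potential confounders
    exposure_estimate: callable(frozenset of names) -> exposure OR (hypothetical)
    confounder_or:     dict mapping each name to its confounder-outcome OR
    threshold:         maximum tolerated relative change in the exposure OR
    """
    kept = set(candidates)
    # Assumption: changes are measured against the fully adjusted model.
    full = exposure_estimate(frozenset(kept))
    while kept:
        # Try candidates in order of closeness to the null (smallest |log OR|).
        ordered = sorted(kept, key=lambda n: abs(math.log(confounder_or[n])))
        for name in ordered:
            reduced = exposure_estimate(frozenset(kept - {name}))
            if abs(reduced - full) / abs(full) <= threshold:
                kept.remove(name)
                break
        else:
            # No remaining factor can be dropped without a > 10% change.
            break
    return kept
```

An empty result corresponds to presenting the unadjusted model.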
We used two criteria to assess associations: (1) magnitude of the OR or β and (2) the exposure-response relationship, emphasizing monotonic trends in categorical exposure variables. A monotonic trend occurs when every change in the OR or MBW difference with increasing category of exposure is in the same direction; the trend may have flat segments but never reverses direction [13]. Confidence intervals were used only to indicate the precision of the estimates [14–16]. We included p-values in tables for informational purposes only. We did not use statistical significance testing to interpret findings [13, 15, 16].
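The definition of a monotonic trend translates directly into a check on the ordered category estimates. In this sketch the function name and example values are hypothetical, and a completely flat sequence is assumed not to count as a trend.

```python
def is_monotonic_trend(estimates) -> bool:
    """True if the estimates never reverse direction across ordered categories.

    estimates: ORs (or MBW differences) ordered from the reference category
    to the highest exposure category. Flat segments are allowed.
    """
    steps = [later - earlier for earlier, later in zip(estimates, estimates[1:])]
    never_falls = all(step >= 0 for step in steps)
    never_rises = all(step <= 0 for step in steps)
    # Assumption: at least one non-zero step is required to call it a trend.
    has_change = any(step != 0 for step in steps)
    return (never_falls or never_rises) and has_change
```

For example, the hypothetical sequence `[1.0, 1.2, 1.2, 1.5]` is monotonic despite its flat segment, whereas `[1.0, 1.3, 1.1, 1.5]` is not because it reverses direction.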
For the primary analyses, exposure to each contaminant was evaluated separately. Exposure variables were categorized such that the reference group did not have residential exposure to the contaminant under evaluation (“unexposed”). For all contaminants except benzene, the exposed group was divided into four levels: < median value, ≥ median value, ≥75th percentile, and ≥90th percentile. Due to sparse data, those exposed to benzene were categorized into two levels: <1 part per billion (ppb) and ≥1 ppb. We analyzed average monthly concentration levels in the drinking water during each pregnancy trimester as well as during the entire pregnancy. Tables present results for average monthly concentration levels during the entire pregnancy. Trimester-specific results are provided in additional files (see Additional file 1: Tables S1–S4). We mention in the text when results for a specific trimester differ from those for the entire pregnancy, either because the magnitude of the association differs or because an exposure-response relationship is observed in that trimester but not for the entire pregnancy.
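The exposure categorization can be sketched as follows. The sketch makes three assumptions that the text does not state explicitly: the four exposed levels are mutually exclusive bands, the percentiles are computed among exposed births only, and a concentration of zero stands in for the "unexposed" reference group (in the study this group is defined by residence). All names are hypothetical.

```python
import statistics


def exposure_cutpoints(exposed_values):
    """Median, 75th, and 90th percentiles among exposed births (assumption)."""
    pct = statistics.quantiles(exposed_values, n=100, method="inclusive")
    # pct holds the 1st..99th percentiles, so index i - 1 is the i-th percentile.
    return {"median": pct[49], "p75": pct[74], "p90": pct[89]}


def categorize(value: float, cuts: dict) -> str:
    """Assign an exposure level for contaminants other than benzene."""
    if value <= 0:
        return "unexposed"  # simplification: zero concentration as reference
    if value < cuts["median"]:
        return "<median"
    if value < cuts["p75"]:
        return ">=median to <75th"
    if value < cuts["p90"]:
        return ">=75th to <90th"
    return ">=90th"


def categorize_benzene(value_ppb: float) -> str:
    """Two exposed levels for benzene: < 1 ppb and >= 1 ppb (from the text)."""
    if value_ppb <= 0:
        return "unexposed"
    return "<1 ppb" if value_ppb < 1 else ">=1 ppb"
```

Other percentile conventions (for example, the default used by a given statistical package) can shift the cut points slightly in small samples.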
Four types of secondary analyses were conducted. First, to obtain a visual characterization of the relationship between each outcome and average monthly concentration levels during each pregnancy trimester as well as during the entire pregnancy, we used a SAS macro to include a restricted cubic spline (RCS) function for the exposure as a continuous variable in the logistic and linear regression models [17]. Three knots were located at the 5th, 50th, and 95th percentiles of the average monthly exposure variable. (Because of sparse data, the knots for benzene could not be spaced symmetrically; instead, knots were located at the 10th, 75th, and 95th percentiles.) The RCS function allowed the shape of the curve to vary within and between these knots and restricted the curve to be linear before the first knot and after the last knot. The resulting curve is useful for assessing whether the exposure-response relationship is adequately captured by the categorical exposure variables.
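With three knots, an RCS adds a single nonlinear term to the linear exposure term. The sketch below uses the common truncated-power parameterization; the SAS macro cited in [17] may scale or parameterize the term differently, and the knot values in the usage example are hypothetical.

```python
def positive_cube(u: float) -> float:
    """Truncated cubic: u**3 when u > 0, otherwise 0."""
    return max(u, 0.0) ** 3


def rcs_term(x: float, knots) -> float:
    """Nonlinear basis term of a three-knot restricted cubic spline.

    knots: (t1, t2, t3) in increasing order, e.g. the 5th, 50th, and 95th
    percentiles of the exposure. The model then contains x and rcs_term(x).
    """
    t1, t2, t3 = knots
    return (
        positive_cube(x - t1)
        - positive_cube(x - t2) * (t3 - t1) / (t3 - t2)
        + positive_cube(x - t3) * (t2 - t1) / (t3 - t2)
    )


# Hypothetical knots (ppb). The term is zero below the first knot and
# linear in x above the last knot, as the restriction requires.
example_knots = (1.0, 5.0, 20.0)
below_first_knot = rcs_term(0.5, example_knots)
```

The weights on the second and third truncated cubics are chosen so that the cubic and quadratic parts cancel above the last knot, which is what makes the fitted curve linear in both tails.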
Second, to take into account correlations among births contributed by the same mother, we conducted generalized estimating equations (GEE) modeling using an exchangeable correlation structure. To identify mothers who contributed more than one singleton birth, it was necessary to match on mother’s name. However, mother’s first name was missing for over one-third of the births, so it is likely that some mothers who contributed more than one birth were not identified. A total of 1,330 births (11.2% of births in the study) were identified among 646 mothers who contributed more than one singleton birth during the study period.
Third, when two contaminants were independently associated with an outcome, both contaminants were included in a model to determine which had the stronger association. Finally, analyses were conducted using those without residential exposure to any of the drinking water contaminants as a reference group.