 Research
 Open Access
 Open Peer Review
 Published:
Statistical methods for modeling repeated measures of maternal environmental exposure biomarkers during pregnancy in association with preterm birth
Environmental Health volume 14, Article number: 9 (2015)
Abstract
Background
It is of critical importance to evaluate the role of environmental chemical exposures in premature birth. While a number of studies investigate this relationship, most utilize single exposure measurements during pregnancy in association with the outcome. The studies with repeated measures of exposure during pregnancy employ primarily crosssectional analyses that may not be fully leveraging the power and additional information that the data provide.
Methods
We examine 9 statistical methods that may be utilized to estimate the relationship between a longitudinal exposure and a binary, nontimevarying outcome. To exemplify these methods we utilized data from a nested case–control study examining repeated measures of urinary phthalate metabolites during pregnancy in association with preterm birth.
Results
The methods summarized may be useful for: 1) Examining sensitive windows of exposure in association with an outcome; 2) Summarizing repeated measures to estimate the relationship between average exposure and an outcome; 3) Identifying acute exposures that may be relevant to the outcome; and 4) Understanding the contribution of temporal patterns in exposure levels to the outcome of interest. In the study of phthalates, changes in urinary metabolites over pregnancy did not appear to contribute significantly to preterm birth, making summary of average exposure across gestation optimal given the current design.
Conclusions
The methods exemplified may be of great use in future epidemiologic research projects intended to: 1) Elucidate the complex relationships between environmental chemical exposures and preterm birth; 2) Investigate biological mechanisms in prematurity using repeated measures of maternal factors throughout pregnancy; and 3) More generally, address the relationship between a longitudinal predictor and a binary, nontimevarying outcome.
Background
Preterm birth, defined as delivery before 37 weeks completed gestation, is both a significant public health problem and a multifactorial disease [1]. In attempt to identify predictive markers, underlying causes, and/or mechanistic pathways, many research projects have investigated the contribution of various maternal or fetal factors during pregnancy in relation to risk of prematurity. One example is the investigation of biomarkers of inflammation; Creactive protein, interleukins (e.g., IL6), matrix metalloproteinases, and angiogenic factors measured in maternal blood, amniotic fluid, or various other matrices during pregnancy have been explored with rigor as predictors of preterm birth [2–4]. Additionally, a number of studies have examined biomarkers of environmental chemical exposures which may be important contributors to prematurity [5].
A majority of these studies have utilized measurements from one time point during pregnancy [5]. However, there is additional utility in having multiple measures across gestation. For markers with poor reproducibility within subject, having multiple measures of exposure may give a more accurate representation of average exposure and confer greater power for detecting an association with preterm birth. For markers with good reproducibility within subject, having multiple measures may provide additional information about the relationship between exposure and disease, if the relationship does vary across time. For example, they could provide information about windows of vulnerability during gestation [6].
Methods for examining data with a longitudinal exposure and nontimevarying outcome do not fall in the standard realm of generalized linear mixed models as instead of having correlated outcome data, the repeated measures of the exposures are correlated. There is no general consensus on how the information contained in the longitudinal exposure trajectory can be used in a binary regression model [7]. Consequently, the few studies with this type of data simply examine a series of crosssectional associations as well as an average of repeated measures across pregnancy in standard binary regression analyses [8, 9]. This paper illustrates several different methods that may be used to examine this unique but not uncommon data structure, specifically for epidemiologic studies that examine biomarkers during pregnancy, when fewer measurements may be available because of practical limitations in collecting repeated samples from a large number of expecting mothers. We also specify, for data with specific characteristics, which methods may be more useful for powerfully detecting subtle associations or for capturing relationships between exposure profiles over time in relation to a binary endpoint of interest.
Methods
We applied each method using data from a nested case–control study examining exposure to phthalates during pregnancy and preterm birth. The study population and measurement of phthalate monoester metabolites in urine samples has been described in detail previously [10]. Briefly, mothers were recruited early in pregnancy at Brigham and Women’s Hospital in Boston, MA, as part of a prospective cohort. In addition to demographic information, mothers provided urine samples for phthalate measurement at up to four time points during pregnancy. From this population 130 mothers who delivered preterm and 352 random controls were selected and their urine samples extracted from freezer storage for phthalate analysis. Collection times for urine samples within this study population were median 9.79 for visit 1 (range 4.71 to 16.1), 17.9 for visit 2 (range 14.9 to 21.9), 26.0 for visit 3 (range 22.9 to 29.3) and 35.1 for visit 4 (range 33.1 to 38.3) weeks gestation [11]. The number of phthalate measurements available at each visit was relatively stable for visits 1–3 (visit 1 = 479, visit 2 = 422, visit 3 = 412) but fewer were available at visit 4 (N = 380) as many of the preterm cases had already delivered at that time point [11]. Phthalate metabolites were measured using high performance liquid chromatography and tandem mass spectrometry by NSF International in Ann Arbor, MI [11]. At the time of metabolite measurement, specific gravity was also measured in urine samples as an indicator of urine dilution using a digital handheld refractometer (Atago Co., Ltd., Tokyo, Japan). For the present study, repeated measures of mono2ethylhexyl phthalate (MEHP) as well as mononbutyl phthalate (MBP) were examined as they were both observed to be associated with preterm birth in previously published results [10, 11], but differ in variability of measurements across gestation. Based on intraclass correlation coefficients (ICC), MEHP is less stable over time (ICC = 0.30, 95% confidence interval [CI] = 0.25, 0.35) compared to MBP (ICC = 0.57, 95% CI = 0.53, 0.62) [11]. As with many biomarker measurements, distributions of MEHP and MBP were rightskewed and natural log transformed to fit normality assumptions in statistical models.
We have adjusted for specific gravity and urine dilution as timevarying covariates in our regression models to be consistent with our previously published studies. One can adopt alternative approaches to standardize the urinary phthalate measures, for example, by regressing phthalate levels on these covariates and using the resultant residuals as exposure in the subsequent outcomeexposure model. We conducted a sensitivity analysis to compare our simultaneous adjustment strategy to this twostep strategy and noted that there is no systematic pattern in terms of enhanced significance in one adjustment strategy versus another (data not shown). Thus, for all of the methods presented we opted to include specific gravity as a covariate in the statistical models we fit.
The primary dilemma for examining exposure data with this timevarying structure is how to account for the longitudinal features of the exposure trajectory in a final disease risk model with the binary outcome, conditional on the complete set of exposure measures. This type of problem is somewhat unique, as in most regression settings the independent variable and the dependent variable are either both crosssectional, both longitudinal, or the outcome is longitudinal with a single baseline measure of exposure. A commonly used approach in the realm of studying environmental exposures in relation to preterm birth is to examine multiple crosssectional models, e.g., the relationship between trimester specific exposure levels in association with preterm birth [8, 9, 11, 12]. This approach enhances the burden of multiple hypothesis testing. Another previously employed solution is to include the exposure measures at each time point simultaneously in a multivariate logistic regression model [12], however this too is problematic as these measures are likely to be correlated, leading to inflated standard error estimates and erroneous odds ratio estimates. Additionally, studies with this type of data structure commonly model an average of repeated exposure measures, but doing so wastes all information on temporal variation of the exposure and reduces exposure variability.
In each section of the results, we will examine these as well as six additional methods to elucidate possible modeling strategies that might be useful in different applications. For each method we examine the association between a single binary variable (preterm birth status) and a continuous variable with repeated biomarker measurements (MEHP or MBP) controlling for timeinvariant (maternal age at visit 1, race/ethnicity, health insurance provider, education level, and prepregnancy body mass index [BMI]) as well as timevarying (urinary specific gravity as well as time of day of urine sample collection) covariates. In previous analyses published by this group, different combinations of covariates were included based on their improvement of specific model types for each phthalate metabolite. For consistency and illustration purposes in this paper, all of the above covariates were included for each model presented and this may account for incongruence between these and other previously published results.
For notational convention, we will let Y_{ i } denote the binary outcome for subject i, X_{ ij } denote the continuous measurement corresponding to subject i at visit j (occurring at time t_{ ij }, with time measured in units of gestational days), and Z_{ ij } denote the vector of covariates where i = 1, …, N and j = 1, …, n. For convenience, we also let X_{ i } = (X_{i1}, X_{i2}, …, X_{ in })^{T} and denote all exposure and covariate data available for subject i. Each subject may not have all n measurements and in that case we will let n_{ i } denote the number of exposure measures available per subject and use j = 1, …, n_{ i }. R codes for each method are included in Additional file 1 and are available at http://wwwpersonal.umich.edu/~yinhsiuc/Rcodelongitudinal.html.
Results and discussion
Standard methods
Multiple logistic regression model
A simple way of modeling this association is to regress Y_{ i } on all available X_{ ij } s (for j = 1, …, n) controlling for the mean vector of covariates , that is:
Note that, since this model requires a complete set of predictors X_{ ij }, it will use only the subjects having all exposure data at all n visits, a major limitation of such an approach. Using our dataset, we modeled urinary phthalate metabolite levels measured at four visits in one model predicting preterm birth. MEHP at visit 3 only was significantly associated with increased odds of preterm birth (Table 1). Likewise, MBP at visit 3 was suggestively associated with increased odds of preterm delivery but levels measured at other visits showed no association (Table 1).
There are several potential problems with this analysis. First, collinearity between the repeated of measures of phthalate metabolites may be an issue because it may lead to inflated standard errors and plausibly change the direction of estimates of β_{ j } s [13]. We examined the correlation matrix for both the phthalates by study visit and found relatively high collinearity between visits, indicated by pairwise correlation coefficients ranging from 0.26 to 0.48 for MEHP and from 0.50 to 0.57 for MBP (Table 2) [11]. The inverse associations between preterm birth and MEHP at visit 2 and MBP at visit 1 might indicate unstable estimation resulting from this collinearity between the longitudinal phthalate measures.
A second issue with this method is interpretation of results, as each regression coefficient represents how the phthalate level at a certain visit is associated with preterm delivery status after controlling for the measures at other visits, and it is unrealistic to vary only one of a series of longitudinal measures with other measures fixed. Finally, as mentioned before, this approach requires that X be measured at, at least nearly, a uniform set of time points in order to make β_{ j } s interpretable, as the X s are indexed by visit and not continuous time. If the missing exposure observations are infrequent and the data are almost complete and missing completely at random, approximating an unbalanced data set by a balanced one is subject to limited loss of efficiency. Otherwise, the efficiency loss might be considerable and the approximation may result in serious bias [14]. In this case, as many of the preterm cases had already delivered by visit 4 and there was consequently potential missingness explained by the outcome (missing at random) in phthalate measurements at this time point, the results from this method may be biased.
We also adapted a Bayesian method originally proposed by Warren et al. [15] for spatiotemporal data to handle longitudinal exposure measurements in association with a binary outcome of interest. The Bayesian method includes exposure at each visit (1 to n) simultaneously in the outcome model (e.g., multiple logistic regression) but introduces a specificform of covariance structure for the Gaussian prior on the coefficients . The a priori structure handles the correlation among the effects of exposures at each visit through shrinkage of the regression coefficients, just like Ridge regression, by assuming that correlation between temporally and spatially proximate coefficients is higher. However, the advantage of this method may be quite limited for studies utilizing a small number of repeated maternal biomarker measurements during pregnancy, and may be more practical for applications where a large number of exposure measurements, such as ambient air monitoring data. In our application the Bayesian approach yields results very similar to those from the multiple logistic regression model with slight attenuation towards the null. However, the Bayesian method handles correlated withinsubject exposures across pregnancy in a more principled manner.
This method and the other three standard methods introduced later in this section assume a linear relationship between exposure and response. Adding higherorder polynomial term(s) or replacing the linear term with a smoothing spline term in the framework of Generalized Additive Models (GAM) can accommodate nonlinearity in any of these methods.
Parallel crosssectional logistic regression models
A commonlyused approach to circumvent the collinearity problem from Method Multiple Logistic Regression Model is to fit n separate crosssectional models for each visit as:
In the example data, MEHP and MBP measures at each visit are positively correlated with preterm delivery although none of the effect estimates are statistically significant (Table 3). Odds ratios for preterm delivery range from 1.10 to 1.17 and from 1.15 to 1.34 with an lnunit increase in MEHP and MBP, respectively, after adjustment for covariates. One major drawback of this method is that there is no straightforward way to combine the results from multiple regression models and assess the aggregate effect of X on Y. Also, if desired, controlling for familywise error rate (e.g., using Bonferroni correction) may be conservative because of the varying degree of dependency between the multiple tests. If instead of fitting separate models we jointly estimate β_{ j } s using the generalized estimating equations (GEE) approach described by Sanchez et al. [6], it is possible to circumvent some of these concerns as well as test the differences in associations across visits (i.e., H_{0} : β_{1} = β_{2} = … = β_{ n }). In our example, there are no significant differences in the associations across four study visits for either MEHP (p = 0.95) or MBP (p = 0.81) based on this GEEbased joint estimation method. The odds ratios for preterm delivery based on joint estimation are identical to those in Table 3 with slightly narrower confidence intervals for MEHP and slightly wider confidence intervals for MBP (data not shown).
Model using mean exposure across visits as a summary
The third method regresses the binary variable on the subjectspecific averages of the continuous timevarying exposure variable in a logistic regression framework as follows:
where . This approach is the first attempt to summarize the longitudinal information into one measure and is useful when there is no particular trend in X or the trends over time are similar for subjects with either binary outcome (Y_{ i } = 1 or Y_{ i } = 0, preterm or term). If this is true, adequately captures the longitudinal feature of X to differentiate the two groups. Since phthalate measures are lognormally distributed, subjectspecific geometric rather than arithmetic averages were considered as summary predictors [10]. This method has been utilized previously to examine associations within this dataset but in that analysis visit 4 measures were excluded because of the aforementioned bias in availability of measures at that time point [10]. In the present analysis all available measurements were included in subjectspecific averages, and averages were lntransformed for modeling purposes. The timevarying covariates were also averaged to create subject specific average covariate values ().
The association between average MEHP levels and preterm birth was statistically significant (β = 0.27, standard error [SE] = 0.13, adjusted odds ratio [aOR] = 1.30, p = 0.05, N = 417). MBP average was also suggestively associated with preterm birth (β = 0.24, SE = 0.14, aOR = 1.28, p = 0.08, N = 417). An advantage of this method is that the interpretation of odds ratios is more natural. Odds of preterm delivery was 1.30 times and 1.28 times higher for mothers with an lnunit increase in average urinary MEHP or MBP concentration over the course of pregnancy, respectively.
One difficulty in this method is the treatment of timevarying covariates. For example, to control for time of day of urine sample collection (before vs. after 1 pm), we also used an average of time of day at each study visit. However, the averaged variable is difficult to interpret and may not accurately reflect differences in urinary phthalate metabolite concentrations by time of day. This method may be additionally limited if the data are unbalanced and not missing at random, as mentioned above, or if there are trends in biomarker measures over time that are more relevant to the outcome than the biomarker levels themselves.
Model using maximum exposure value across visits as summary
This method resembles the Model using mean exposure across visits as a summary, except now we regress the binary variable on the maximum, rather than the average, of the continuous variable:
where and is defined as the vector of covariates at the visit of which the continuous variable assumes its maximum value for subject i. In other words, timevarying covariates corresponding to the maximum value of X_{ i } are included in the final model. This approach may be more appropriate when the association between Y and X is not driven by the longitudinal trend of X or an average level but rather an acute or extreme instance of exposure.
Associations using this method were not statistically significant for MEHP (β = 0.07, SE = 0.09, aOR = 0.94, p = 0.46, N = 442) or MBP (β = 0.10, SE = 0.12, aOR = 1.11, p = 0.39, N = 444). This method may be inappropriate for examining associations with phthalates, which are metabolized and excreted quickly but may be measured at very high levels following a recent exposure.
Two stage methods
Two stage mixed effects model
A twostage approach can relax the assumption for the Model using mean exposure across visits as a summary or the Model using maximum exposure value across visits as summary, that the longitudinal trend is irrelevant in terms of detecting the association between X and Y. In this approach, the longitudinally timevarying exposure is first modeled as a function of time (e.g., using random slopes and intercepts) and at the second step best unbiased linear predictor (BLUP) estimates of these random coefficients are used as predictors in a logistic regression model. The formulation for this model is as follows:
where a_{0i} and a_{1i} are the random intercept and random slope jointly distributed as bivariate normal random variables representing the longitudinal trend of X, α_{0} and α_{1} are corresponding fixed effects, and ϵ_{ ij } is the error term distributed independently of the random effects. In this method the subjectspecific time trends of X from Stage 1 are extracted and modeled in the second stage along with relevant covariates. This method is appropriate if it seems plausible that the predicted subjectspecific intercepts and slopes for X provide an accurate summary characterization of distinct patterns across the two outcome groups. If the exposure variable X is not normally distributed, the joint normality assumption for the random intercept and random slope may be inappropriate. As mentioned in the Methods section, we naturallog transformed MEHP and MBP prior to analysis in order to better approximate the normality assumption. If needed, more sophisticated methods for treating specific exposure distributions have been suggested by ArellanoValle et al. (skewnormal linear mixed model) [16] and Zhang et al. (loggamma linear mixed model) [17].
Subjectspecific slopes and/or intercepts from Stage 1 can be simultaneously included as continuous predictors in the Stage 2 logistic regression model. Alternatively, one can cluster fitted intercepts and slopes from Stage 1 on a 2dimensional Euclidean space and model the resulting clustering index (â_{0i}, â_{1i} ) in the Stage 2 logistic regression model. This second method effectively groups subjects based on similarities in trends in exposure levels over pregnancy and may have greater power over modeling BLUP estimates. Additionally, if desired, this method can flexibly accommodate a nonlinear trend in Stage 1 with higherorder terms, such as quadratics (e.g., â_{2i} or α_{2}) or curvature characteristics (e.g., 2(â_{2} + â_{2i}) + (â_{1} + â_{1i})) as predictors of the binary outcome in Stage 2. Finally, if the relationship between exposure and response is found to be nonlinear, one can include a smooth functional representation of the summarized characteristics from Stage 1 in Stage 2 to account for the nonlinearity.
In the example analysis, we modeled subjectspecific slopes and intercepts extracted from Stage 1 continuously in Stage 2. Using this method, MEHP exhibits evidence suggestive association (β_{1}=0.35, SE = 0.20, aOR = 1.43, p = 0.08, N = 417) between the subjectspecific predicted intercepts and preterm delivery, but the effect estimate for predicted slope was not statistically significant. This result suggests that the mean MEHP exposure level, as indicated by the subjectspecific predicted intercepts, was associated with preterm birth, but trends in levels across pregnancy were not contributors to the outcome. With respect to MBP, the predicted slopes and predicted intercepts were highly correlated in the Stage 1 model suggesting that the additional inclusion of random slope for gestational age at sample collection was not necessary. We thus refit the model without random slope and used the random intercept only as a predictor in the Stage 2 model. An lnunit increase in the predicted intercept for MBP was associated with an increased odds of preterm birth (β_{1} =0.32, SE = 0.17, aOR = 1.37, p = 0.07, N = 417), which is again similar to the interpretation from the mean model.
We also examined the effect of dividing subjects into two clusters based on a kmeans clustering with subjectspecific predicted intercepts and slopes plotted on the twodimensional space described above. This was performed only for MEHP, as subjectspecific slopes were not important for predicting MBP concentrations. For MEHP, the clearest separation occurred based on the predicted intercept, and the predicted slope term was less important (Figure 1). In the Stage 2 model the clustering index for MEHP was marginally associated with preterm delivery (β_{1} =0.43, SE = 0.26, aOR = 1.54, p = 0.09, N = 417). The individuals classified in the “greater predicted intercept” group (red dots) had 1.54 times the odds of having a preterm delivery compared to the “smaller predicted intercept” group (black dots).
A major drawback of this model is that BLUP estimates or clustering uncertainty from the Stage 1 model is not accounted for in the Stage 2 analysis, which may lead to biased results. A Bayesian analysis or an analysis based on a joint likelihood of the longitudinal model and the final logistic model (as done in [7]) can ensure proper propagation of uncertainty. The functional methods discussed in Methods based on clustering section involve joint estimation, but suffer from having sparse measurements over time.
Generalized additive mixed model to contrast exposure trajectories
In all of the previous methods, we treated X as the independent variable and Y as the dependent variable in the regression framework. However, if the focus is not to establish causality, or characterize risk of preterm birth, a reverse temporal model that treats X as a longitudinal dependent variable and Y as a timeinvariant independent binary variable can be used to contrast exposure trajectories in the two groups. To that end, we use the following generalized additive mixed model (GAMM):
where b_{0i} is the random intercept and f_{1}(.) and f_{2}(.) are smooth functions, represented by a natural spline in this example. Time is equivalent to gestational age at urine sample collection. This model naturally accounts for the longitudinal nature of X and the trend curves can be depicted parametrically or nonparametrically. While the model is not interpretable in terms of temporality, since the occurrence of X precedes Y, the results may provide information about the differences in X for each level of Y over the time course of pregnancy.
The fully flexible saturated model allows a separate smooth curve for each level of Y, which is equivalent to allowing an unconstrained interaction term between Y and gestational age. In the present analysis we started with these freely fitted smooth curves (MEHP or MBP predicted by gestational age at sample collection) in mothers who delivered preterm compared to mothers who delivered term. This has been illustrated previously in this dataset [11] and is replicated in Figure 2. The estimated degrees of freedom (EDF) for the difference between the two curves was 2 for both MEHP and MBP models, indicating a linear difference in the two groups across gestation. Additionally, the slope of the linear difference in the two groups was not significantly different from zero (type one error rate = 0.05), further confirming a constant difference in MEHP or MBP levels in the two groups. There was a nearly significant interaction between gestational age at sample collection and preterm birth for both MEHP (p = 0.09) and MBP (p = 0.07), which parallels the significant differences in cases and controls observed in previous methods. However, to fully characterize differences using the exposure trajectories from this method a larger sample size and/or additional study visits would be necessary.
This method accounts for potential nonlinearity in the relationship between exposure and gestational age. However, there is no clear way to examine the nonlinear relationship between exposure and preterm birth status in this model. A possible alternative is to regress preterm birth status on a tensor product smoother of gestational age and exposure [18]. However, the difficulty of interpreting these results and the sparseness of our data limit the applicability of this alternative in this situation.
Methods based on clustering
Gaussian mixture model by clustering the exposure values
This method treats the longitudinal measures in X_{ i } = (X_{i1}, X_{i2},…, X_{ in })^{T} as a vector and distinguishes the subjects with Y_{ i } = 1 vs. Y_{ i } = 0 by characterizing the X_{ i } vectors of each group. If the dataset is almost balanced in a sense that every subject has the same number of observations for X with minimal amount of missingness, we can assume that each X_{ i } follows a multivariate normal distribution with an unknown mean and variance. In this method, we hypothesize that there are K latent multivariate normal distributions with different means and variances from which X_{ i } s are drawn [19]. Subjects with X_{ i } s from some of the latent distributions will have a higher probability of being Y_{ i } = 1 compared to subjects with other latent distributions for X_{ i }. A twostage procedure can then be devised to identify clusters of X_{ i } that have an increased probability of Y_{ i } = 1. In the first stage, subjectspecific exposure vectors, X_{ i } s, are clustered based on Gaussian distributions with a predetermined number of clusters (K). Each subject is grouped into one of the K clusters. In the second stage, we then regress Y_{ i } on this clustering index along with the covariates as:
where C_{ i } is the clustering index, which may be more than onedimensional when K > 2. A limitation to this method is that X has to be balanced (n_{ i } = n). Thus, in the present analysis, the dataset had to be restricted to subjects with phthalate measurements available at all four study visits (N = 280), which may bias results as many fewer cases had observations available at visit 4 (median gestational age = 35 weeks). Nevertheless, we performed the analysis to illustrate the application of this method. An additional limitation of this method is that n must be relatively small compared to the number of subjects, N. If n is too large, there may be more parameters in the model than there are subjects, and the covariance matrices would have to be restricted [19]. In this dataset n (4) was sufficiently smaller than N (280). As with other applications we naturallog transformed MEHP and MBP to better approximate the normality assumptions required for this method. As an alternative, nonGaussian clustering methods may be performed as suggested by Banfield and Raftery [20].
We selected the optimal number of clusters according to Bayesian information criterion (BIC) approximation for modelbased clustering [21]. The number of clusters chosen for MEHP and MBP were three and two, respectively. The means of the phthalate metabolite concentrations within each cluster by study visit are displayed in Figure 3. In Stage 2 we modeled each clustering index in relation to odds of having a preterm delivery. For MEHP, the odds of having a preterm birth for subjects in cluster 2 was 1.84 times (β_{1} =0.61, SE = 0.65, pvalue = 0.35) the odds for subjects in cluster 1; subjects in cluster 3 had 2.09 times (β_{2} =0.74, SE = 0.55, pvalue = 0.18) the odds of having a preterm birth compared to subjects in cluster 1. For MBP, subjects in cluster 2 had 5.44 times the odds of having a preterm birth (β_{1} =1.69, SE = 0.87, p = 0.05) compared to subjects in cluster 1. However, for MBP, 274 subjects fall into cluster 1 and only 6 fall into cluster 2, making these results unreliable. The imprecision of the MEHP estimates and the instability in the Stage 1 clustering for MBP may stem from the small sample size of the complete dataset. Additionally, as suggested by the results from the method using Generalized additive mixed model to contrast exposure trajectories, the trajectories characterized by vectors in this method may not be meaningful for phthalates or identifiable with only 4 exposure measurements.
Functional clustering model
Whereas the previous method entailed clustering based on Gaussian distributions of X, here we employ kmeans clustering based on nonparametric distributions. This method utilizes principal points which summarize important features of the relationship between X_{ ij } and t_{ ij }, or, in this example, the relationship between urinary phthalate metabolite concentration and gestational age at sample collection [22, 23]. The principal points, which can be based on either raw curves or derivatives, effectively reduce the functional data to multivariate data, and enable subsequent kmeans clustering. The advantage of functional clustering compared to Gaussian mixture clustering is that it does not require X to be balanced. Thus, this method is preferable for the present example compared to Method (Gaussian mixture model by clustering the exposure values).
This twostage procedure can be conducted in two ways. The first is to cluster the curves of X_{ ij } against t_{ ij } first, and then to regress the binary outcome Y_{ i } on the clustering index along with covariates in the second stage. A drawback to this method is that clustering of the curves may be dependent on one of the covariates, e.g., the race/ethnicity of the subject. An alternative approach is to first regress X_{ ij } on all covariates of interest, then to cluster curves of the residuals vs. t_{ ij } and use those clusters in a logistic regression model. For either variation of this method, the dependency between Y_{ i } and the clustering index can be tested with a chisquare test (with degrees of freedom = K  1).
In the present dataset we fixed the number of clusters to two for this method. In order to determine whether the functional clustering was dependent on covariate distributions only, rather than phthalate exposure, we conducted twosample ttests and chisquare tests for the continuous and categorical variables, respectively, with grouping based on the clustering results (data not shown). For MEHP, clustering was strongly associated with maternal race/ethnicity, BMI, and urinary specific gravity. For MBP, most covariates were significantly associated with the clustering index. Thus, in this example, functional clustering based on a model of t_{ ij } vs. the residuals from the model of X_{ ij } regressed on all covariates was more appropriate. Figure 4 displays the mean smooth curves of phthalate metabolite residuals vs. gestational age at sample collection for the two clusters identified. Subjects in cluster 1 had an elevated odds of having a preterm birth compared to subjects in cluster 2 for both MEHP (aOR = 1.60, p = 0.03) and MBP (aOR = 1.27, p = 0.36).
Functional logistic regression model
In this method, we assume that the binary outcome depends on the longitudinal predictor variable along with the covariates of interest through the functional logistic regression model proposed by Muller et al. [24] as follows:
where β(t) is the timevarying coefficient for the longitudinal predictor variable. The interpretation is that the log odds of Y_{ i } = 1 would increase by amount from time point t_{1} to time point t_{2} with a unit increase in the predictor X_{ i }(t) across the time interval t_{1} to t_{2}. In the context of this example, the odds of being born preterm would be dependent on aggregate phthalate concentrations across gestation periods (e.g., from Visit 1 to Visit 4).
This method utilizes functional principal component (FPC) scores to summarize the important features of functional curves of exposure levels over time and include the scores in the regression model. The FPC scores can be estimated via the conditional expectation method proposed by Yao et al. [25]. The number of basis functions used to construct the functional curves and the number of leading FPC scores included (L) can be determined by the leaveonecurveout crossvalidation score suggested by Peng and Paul [26]. A global test of H_{ o } : β_{1} = β_{2} = … = β_{ L } = 0 such as L degreeoffreedom likelihood ratio test (LRT) can be conducted to investigate the global association between longitudinal X and the stationary binary outcome Y_{ i }.
The number of basis functions and the number of leading FPC scores selected by modified BIC are nine and two, respectively, for both MEHP and MBP. The 2df LRT statistic for testing if there is an overall association between MEHP and preterm delivery was 3.48 (p = 0.18). For MBP, the 2df LRT statistic was 4.81 (p = 0.09). The inclusion of the estimated FPC scores does not significantly improve the proportion of variance explained by the model for either phthalate metabolite. In other words, there is not sufficiently strong evidence for an association between MEHP or MBP measures summarized by FPC analysis and preterm delivery. This may mean that: a) the changes in phthalate levels over pregnancy are not associated with preterm birth; or b) that there is not sufficient data (sample size or number of observations) to detect associations with phthalate change over pregnancy and preterm birth in this dataset. For other exposures or biomarkers that have more substantial increases or decreases across gestation or for a study with larger sample size and/or additional repeated measures this method may provide more useful information.
The clustering strategies underlying each method in the Methods based on clustering section, based on the longitudinal measures of the exposure in the original scale, assumes a linear relationship, albeit indirectly, between exposure and response. If desired, one can perform clustering based on the transformed exposure levels, such as X^{2} and e^{X}, in order to account for relationships that are nonlinear in nature.
Conclusions
In the context of examining repeated maternal biomarkers during pregnancy in association with preterm birth, we described 9 statistical methods that may be useful for fully utilizing longitudinal characteristics of data in future studies. Each has strengths and limitations and may be suitable in some circumstances but not others based on both characteristics of the predictor as well as goal of the analysis (summarized in Table 4). For identifying windows of vulnerability, examining information from each study visit simultaneous or separately (Multiple logistic regression model and Parallel crosssectional logistic regression models) or modeling individual (Two stage mixed effects model) or population level (Generalized additive mixed model to contrast exposure trajectories) patterns of exposure in relation to the health outcome may be most useful. Sanchez et al. previously contrasted utility of these approaches for identifying particularly important windows of exposure [6]. This casestudy alternatively presents methods for leveraging multiple exposure measurements over time to more powerfully detect a true association with a nontimevarying outcome. Such a relationship may arise from: 1) Exposure during a particularly sensitive window of pregnancy; 2) Generally elevated levels of exposure across gestation; 3) An acute exposure; or 4) A change in exposure over time, e.g., an increasing trend over the course of weeks or months.
The preponderance of the research on environmental exposures during pregnancy and preterm birth examine an association with generally elevated levels of exposure, and utilize one metric during gestation. The Model using mean exposure across visits as a summary and random intercepts from the Two stage mixed effects model use repeated measures that can more powerfully detect such an effect than any single measurement model, e.g. the Multiple logistic regression model and the Parallel crosssectional logistic regression models, regardless of the number of exposure measurements available. Notably, employing the two stage model from the Two stage mixed effects method did not improve results obtained from simply taking an average of exposures in this example, although the Two stage mixed effects model accounts for timevarying covariates in a more sensible manner.
Fewer studies have expressed interest in identifying whether an acute exposure at any time point or during sensitive period is associated with preterm birth. To some extent the methods for investigating windows of vulnerability address this question, as do some studies utilizing ambient air pollution measures and survival analyses [27, 28]. Additionally, the Maximum Model (Model using maximum exposure value across visits as summary) may serve to this end, and the functional logistic regression approach (Functional logistic regression model) has been used recently to identify windows of susceptibility for long term trajectories of exposure with rich repeated measurements, and for studying how genetic factors may modify these windows [29]. These approaches may not be useful for urinary phthalate metabolites, or other biomarkers that are highly variable over time, as high concentrations may be indicative of temporally recent rather than acute exposure.
The contribution of temporal changes in environmental exposures to preterm birth is relatively understudied, particularly in research utilizing biomarkers of exposure measurement during pregnancy. However, it is plausible that these patterns may contribute to prematurity and other adverse birth outcomes when steady exposures do not. For airpollution studies or ambient monitoring data measured at a finer time scale where personalized measures are not needed, this temporal feature can and has been studied in greater detail, for example, as presented in Warren et al. [15]. The Two stage mixed effects model, when incorporating random slopes, and the Gaussian mixture model by clustering the exposure values, Functional clustering model, and Functional logistic regression model, are different ways of capturing the additional information that repeated measures over time provide that may contribute more than pure exposure measure contributions to preterm birth. Additionally, though not capable of quantifying an effect, the Generalized additive mixed model to contrast exposure trajectories can characterize patterns and establish differences in preterm vs. term groups. These methods were not useful for further understanding the relationship between phthalate exposure and preterm birth, potentially due to few exposure measurements, the instability of urine levels, or because phthalates are not chemicals whose patterns over pregnancy contribute to this outcome. These methods may be more useful for datasets with higher frequency of exposure assessment, with a more temporally resolved structure. Collection of a dense set of measurements may not be practically feasible for studies analyzing expensive biomarkers during a short time window (40 weeks gestation) in a large prospective cohort study. However, they may be applicable in studies utilizing less costly exposure metrics, such as ambient air monitoring or using simpler, less expensive and noninvasive techniques for exposure assessment. In addition to requiring this robust data structure, the temporal methods have other limitations. The Two stage mixed effects model, Gaussian mixture model by clustering the exposure values, and Functional clustering model ignore uncertainty in the first step estimation or clustering and thus underestimate the standard error in the final odds ratio, increasing the likelihood of a false positive in the results. The Functional logistic regression model is more desirable because it does not require these two steps and carries out inference based on a joint likelihood. Another limitation is that these methods do not appropriately account for dropout which is inherent in a longitudinal study. In this casestudy, missingness at visit 4 is likely related to the outcome of interest, preterm birth, leading to a missing at random mechanism [30]. In addition, missingness could also be related to other unmeasured covariates. We recommend that sensitivity analyses with respect to the parameters of the dropout probability model be performed whenever this may be the case. If a probability model for dropout can be validly constructed, one can leverage an inverse probability weighting approach. Despite these limitations, the temporal methods described may provide new insight into the study of environmental exposures and prematurity.
In conclusion, the methods exemplified in this case study may be of great use in future epidemiologic research projects intended to: 1) Elucidate the complex relationships between environmental chemical exposures and preterm birth; 2) Investigate biological mechanisms in prematurity using repeated measures of maternal factors throughout pregnancy; and 3) More generally, address the relationship between a longitudinal predictor and a binary, nontimevarying outcome.
Abbreviations
 MEHP:

Mono2ethylhexyl phthalate
 MBP:

Mononbutyl phthalate
 ICC:

Intraclass correlation coefficient
 CI:

Confidence interval
 BMI:

Body mass index
 GEE:

Generalized estimating equations
 SE:

Standard error
 aOR:

Adjusted odds ratio
 BLUP:

Best linear unbiased predictor
 GAMM:

Generalized additive mixed model
 EDF:

Estimated degrees of freedom
 BIC:

Bayesian information criterion
 FPC:

Functional principal component
 LRT:

Likelihood ratio test.
References
 1.
Behrman RE, Butler AS: Preterm Birth: Causes, Consequences, And Prevention. Washington, DC: The National Academies Press; 2007.
 2.
Bastek JA, Elovitz MA: The role and challenges of biomarkers in spontaneous preterm birth and preeclampsia.Fertil Steril 2013, 99:1117–23. 10.1016/j.fertnstert.2013.01.104
 3.
Genc MR, Ford CE: The clinical use of inflammatory markers during pregnancy.Curr Opin Obstet Gynecol 2010, 22:116–21. 10.1097/GCO.0b013e3283374ac8
 4.
Menon R, Torloni MR, Voltolini C, Torricelli M, Merialdi M, Betran AP, et al.: Biomarkers of spontaneous preterm birth: an overview of the literature in the last four decades.Reprod Sci 2011, 18:1046–70. 10.1177/1933719111415548
 5.
Ferguson KK, O'Neill MS, Meeker JD: Environmental contaminant exposures and preterm birth: a comprehensive review.J Toxicol Environ Health B Crit Rev 2013, 16:69–113. 10.1080/10937404.2013.775048
 6.
Sanchez BN, Hu H, Litman HJ, TellezRojo MM: Statistical methods to study timing of vulnerability with sparsely sampled data on environmental toxicants.Environ Health Perspect 2011, 119:409–15.
 7.
Bhadra D, Daniels MJ, Kim S, Ghosh M, Mukherjee B: A Bayesian semiparametric approach for incorporating longitudinal information on exposure history for inference in case–control studies.Biometrics 2012, 68:361–70. 10.1111/j.15410420.2011.01686.x
 8.
Patelarou E, Kargaki S, Stephanou EG, Nieuwenhuijsen M, Sourtzi P, Gracia E, et al.: Exposure to brominated trihalomethanes in drinking water and reproductive outcomes.Occup Environ Med 2011, 68:438–45. 10.1136/oem.2010.056150
 9.
Wilhelm M, Ghosh JK, Su J, Cockburn M, Jerrett M, Ritz B: Trafficrelated air toxics and preterm birth: a populationbased case–control study in Los Angeles county, California.Environ Health 2011, 10:89. 10.1186/1476069X1089
 10.
Ferguson KK, McElrath TF, Meeker JD: Environmental phthalate exposure and preterm birth.JAMA Pediatr 2014, 168:61–7. 10.1001/jamapediatrics.2013.3699
 11.
Ferguson KK, McElrath TF, Ko YA, Mukherjee B, Meeker JD: Variability in urinary phthalate metabolite levels across pregnancy and sensitive windows of exposure for the risk of preterm birth.Environ Int 2014, 70:118–24.
 12.
Cantonwine D, Hu H, Sánchez BN, LamadridFigueroa H, Smith D, Ettinger AS, et al.: Critical windows of fetal lead exposure: adverse impacts on length of gestation and risk of premature delivery.J Occup Environ Med 2010, 52:1106. 10.1097/JOM.0b013e3181f86fee
 13.
Farrar DE, Glauber RR: Multicollinearity in regression analysis: the problem revisited.Rev Econ Stat 1967, 92–107.
 14.
Baltagi B: Econometric Analysis of Panel Data. John Wiley & Sons; 2008.
 15.
Warren J, Fuentes M, Herring A, Langlois P: Spatialtemporal modeling of the association between air pollution exposure and preterm birth: identifying critical windows of exposure.Biometrics 2012, 68:1157–67. 10.1111/j.15410420.2012.01774.x
 16.
ArellanoValle R, Bolfarine H, Lachos V: Skewnormal linear mixed models.J Data Sci 2005, 3:415–38.
 17.
Zhang P, Song PXK, Qu A, Greene T: Efficient estimation for patientspecific rates of disease progression using nonnormal linear mixed models.Biometrics 2008, 64:29–38. 10.1111/j.15410420.2007.00824.x
 18.
Wood SN: Lowrank scaleinvariant tensor product smooths for generalized additive mixed models.Biometrics 2006, 62:1025–36. 10.1111/j.15410420.2006.00574.x
 19.
Fraley C, Raftery AE: Modelbased clustering, discriminant analysis, and density estimation.J Am Stat Assoc 2002, 97:611–31. 10.1198/016214502760047131
 20.
Banfield JD, Raftery AE: Modelbased Gaussian and nonGaussian clustering.Biometrics 1993, 803–21.
 21.
Schwarz G: Estimating the dimension of a model.Ann Stat 1978, 6:461–4. 10.1214/aos/1176344136
 22.
Flury BA: Principal points.Biometrika 1990, 77:33–41. 10.1093/biomet/77.1.33
 23.
Tarpey T, Kinateder KK: Clustering functional data.J Classification 2003, 20:093–114. 10.1007/s0035700300073
 24.
MÜLLER HG: Functional modelling and classification of longitudinal data*.Scand J Stat 2005, 32:223–40. 10.1111/j.14679469.2005.00429.x
 25.
Yao F, Müller HG, Wang JL: Functional data analysis for sparse longitudinal data.J Am Stat Assoc 2005, 100:577–90. 10.1198/016214504000001745
 26.
Peng J, Paul D: A geometric approach to maximum likelihood estimation of the functional principal components from sparse longitudinal data.J Comput Graph Stat 2009, 18:995–1015. 10.1198/jcgs.2009.08011
 27.
Auger N, Naimi AI, Smargiassi A, Lo E, Kosatsky T: Extreme heat and risk of early delivery among preterm and term pregnancies.Epidemiology 2014, 25:344–50. 10.1097/EDE.0000000000000074
 28.
Suh YJ, Kim H, Seo JH, Park H, Kim YJ, Hong YC, et al.: Different effects of PM10 exposure on preterm birth by gestational period estimated from timedependent survival analyses.Int Arch Occup Environ Health 2009, 82:613–21. 10.1007/s0042000803807
 29.
Wei P, Tang H, Li D: Functional logistic regression approach to detecting gene by longitudinal environmental exposure interaction in a casecontrol study.Genet Epidemiol 2014, 38:638–51. 10.1002/gepi.21852
 30.
Little RJ, Rubin DB: Statistical Analysis With Missing Data. Second edition. John Wiley & Sons; 2002.
Acknowledgements
This work was supported by grants from the National Institute of Environmental Health Sciences, National Institutes of Health Grant R21ES020811, R01ES018872, P42ES017198, P01ES022844, and P30ES017885 and National Science Foundation Grant NSF 1406712.
Author information
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
YC and KKF contributed equally to the statistical analysis and drafting of the manuscript. JDM designed and coordinated the study and helped to draft the manuscript. TFM designed and coordinated the study. BM directed the statistical analysis and helped to draft the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
Rights and permissions
About this article
Cite this article
Chen, Y., Ferguson, K.K., Meeker, J.D. et al. Statistical methods for modeling repeated measures of maternal environmental exposure biomarkers during pregnancy in association with preterm birth. Environ Health 14, 9 (2015). https://doi.org/10.1186/1476069X149
Received:
Accepted:
Published:
Keywords
 Prematurity
 Environment
 Statistical methods
 Biomarkers
 Repeated measures
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate. Please note that comments may be removed without notice if they are flagged by another user or do not comply with our community guidelines.