 Methodology
 Open Access
 Published:
Construction of environmental risk score beyond standard linear models using machine learning methods: application to metal mixtures, oxidative stress and cardiovascular disease in NHANES
Environmental Health volume 16, Article number: 102 (2017)
Abstract
Background
There is growing concern of health effects of exposure to pollutant mixtures. We initially proposed an Environmental Risk Score (ERS) as a summary measure to examine the risk of exposure to multipollutants in epidemiologic research considering only pollutant main effects. We expand the ERS by consideration of pollutantpollutant interactions using modern machine learning methods. We illustrate the multipollutant approaches to predicting a marker of oxidative stress (gammaglutamyl transferase (GGT)), a common disease pathway linking environmental exposure and numerous health endpoints.
Methods
We examined 20 metal biomarkers measured in urine or whole blood from 6 cycles of the National Health and Nutrition Examination Survey (NHANES 2003–2004 to 2013–2014, n = 9664). We randomly split the data evenly into training and testing sets and constructed ERS’s of metal mixtures for GGT using adaptive elasticnet with main effects and pairwise interactions (AENETI), Bayesian additive regression tree (BART), Bayesian kernel machine regression (BKMR), and Super Learner in the training set and evaluated their performances in the testing set. We also evaluated the associations between GGTERS and cardiovascular endpoints.
Results
ERS based on AENETI performed better than other approaches in terms of prediction errors in the testing set. Important metals identified in relation to GGT include cadmium (urine), dimethylarsonic acid, monomethylarsonic acid, cobalt, and barium. All ERS’s showed significant associations with systolic and diastolic blood pressure and hypertension. For hypertension, one SD increase in each ERS from AENETI, BART and SuperLearner were associated with odds ratios of 1.26 (95% CI, 1.15, 1.38), 1.17 (1.09, 1.25), and 1.30 (1.20, 1.40), respectively. ERS’s showed nonsignificant positive associations with mortality outcomes.
Conclusions
ERS is a useful tool for characterizing cumulative risk from pollutant mixtures, with accounting for statistical challenges such as high degrees of correlations and pollutantpollutant interactions. ERS constructed for an intermediate marker like GGT is predictive of related disease endpoints.
Background
Over the last several decades, extensive research on health effects of environmental pollutant exposures has advanced our understanding in pollutant toxicities and related biological mechanisms. These led us to control and lower national regulatory standards for some pollutants (e.g., blood lead levels in children), which has reduced the burden of disease and prevented substantial environmental exposure related diseases. Despite these achievements, a huge research gap remains unanswered: What is the health effect of exposure to pollutant mixtures? Although there is growing concern of potential health effects of exposure to pollutant mixtures, most previous studies have been limited to single pollutants, i.e., the unit of analysis is based on a single pollutant. This is due to statistical challenges, such as high degrees of correlation between pollutants, confounding due to copollutants (i.e., a spurious association in a single pollutant approach may be observed if the single pollutant is a proxy for other copollutants or a mixture of pollutants), lack of replication cohorts, and lack of statistical approaches to evaluating pollutant mixtures [1]. Other methodologic challenges, such as difficulty in exposure assessment of pollutant mixtures with regards to accuracy and feasibility and measurement errors, are also important to hinder multipollutant approaches but the present study will focus on statistical challenges.
Recently, several methods have been proposed to explore health effects of multiple pollutants [2,3,4]. These include variable selection approaches (least absolute shrinkage and selection operator (LASSO) [5], elasticnet [6], adaptive elasticnet [7]); dimension reduction techniques (principal component analysis (PCA), partial least squares (PLS) [8], weighted quantile sum (WQS) regression [9]); Bayesian approaches (Bayesian model averaging (BMA) [10], Bayesian kernel machine regression (BKMR) [11]); and recursive partitioning (classification and regression tree (CART) [12], random forest [13], Bayesian additive regression tree (BART) [14]). A few studies implemented these methods to analyze multiple pollutants [3, 11, 15,16,17,18]. A recent workshop organized by the National Institute of Environmental Health and Sciences (NIEHS) suggested that there is a need for development of novel statistical approaches and appropriate use of available statistical methods in the analysis of combined exposure data from epidemiologic studies [19]. It is also important that methods accounting for the interactions between pollutants within a mixture and cumulative pollutant exposure are needed to estimate the risk of disease [1].
To address these challenges and research gaps, we propose an updated ‘Environmental Risk Score (ERS)’. We initially proposed an ERS as a potential summary measure of effects of multiple pollutants in epidemiologic research [20]. The underlying idea behind an ERS is to build a predictive risk model as a weighted sum of the pollutant levels from simultaneous assessment of multiple pollutants [20]. Weights are determined by the magnitudes (standardized regression coefficients) of the association between each pollutant and the outcome of interest. Our initial study of ERS has several limitations in that quantification of health effects was based on the main effects of individual pollutants; pollutantpollutant interactions were not considered; and the computed ERS is conceptualized and created in a diseasespecific way (i.e., an ERS for one disease is not applicable for another disease) [20]. The updated ERS we propose here will account for these two limitations. The use of modern machine learning methods we consider also bypass the limitation of working with weighted parameters to derive the ERS and offer the flexibility of working directly with predictions that can be generated by a wide class of tools (for example, treebased methods, smoothing and selection methods) beyond the traditional regression model.
With this objective, we examined the association between metal mixtures and cardiovascular health outcomes in U.S. adults. Heavy metals are widespread, and notable for their toxic effects even at low levels of exposure encountered in the general environment. The cardiovascular system is a main target organ for various heavy metals [21, 22]. Common biological mechanisms by which heavy metals affect the cardiovascular system include increased oxidative stress and lipid peroxidation, inhibition of antioxidant defense systems, smooth muscle cell proliferation, and endothelial injury and apoptosis [22, 23]. Although cardiovascular effects of individual metals have been extensively studied, little is known about cumulative effects of metal mixtures. To test this question, we examined 20 metal biomarkers measured in urine or whole blood from 6 cycles of the National Health and Nutrition Examination Survey (from NHANES 2003–2004 to 2013–2014). We first constructed ERS of metal mixtures for gammaglutamyl transferase (GGT), a marker of oxidative stress [24, 25], because oxidative stress is a common disease pathway linking environmental pollutant exposure and numerous health endpoints. We then evaluated the associations between ERS and cardiovascular endpoints (blood pressure, hypertension and total and cardiovascular disease (CVD) mortality). Therefore, the constructed ERS will reflect cumulative risk of oxidative stress due to metal mixtures and can also be applied to other health outcomes relevant to oxidative stress, such as cancer, type2 diabetes, etc. To identify important metals and potential metalmetal interactions, we employed adaptive elasticnet [7], BART [14] and BKMR [11]. We also used Super Learner, an ensemble machine learning prediction algorithm that utilizes a weighted combination of many candidate learners [26]. We compared the prediction performance among these different statistical approaches in a test set to provide insights with regards to data complexity (e.g., sample size, number of predictors, nonlinearity, high degrees of correlation, and potential pollutantpollutant interactions).
Methods
Study population
Data were from the U.S. NHANES cycles from 2003 to 2004 through 2013–14 (six continuous NHANES cycles). Two earlier cycles of NHANES (1999–2000 and 2001–2002) were not included because arsenic species were not assessed in those cycles. NHANES is a crosssectional study designed to be representative of the health and diet of the noninstitutionalized U.S. population. NHANES employs a complex, multistage sampling design with accompanying design weights. For the present study, 10,805 adults aged 20 years or older who participated in the substudy of heavy metals in urine, were eligible. We excluded 1141 participants who had missing data in the outcome (GGT, n = 663), exposures (heavy metals, n = 862) and core covariates (smoking, education, body mass index (BMI), n = 197) (the numbers of missing in parentheses are not mutually exclusive), resulting in the sample size of 9664 for ERS construction. Those excluded were older and more likely to be female and nonHispanic black and less educated (Additional file 1: Table S1). Any potential confounding bias by these factors was minimized by adjustment as covariates in regression models. Given that participants did not know their exposure levels at the recruitment, selection bias due to differential missing data in exposure is unlikely. We additionally excluded 464 and 489 participants when examining blood pressure and hypertension, respectively, due to missing in information on blood pressure and hypertension. For mortality, 6404 participants were available for analysis because mortality data were not available in the last two cycles. NHANES is a publicly available data set and all participants in NHANES provide written informed consent, consistent with approval by the National Center for Health Statistics Institutional Review Board.
Assessment for heavy metals
Heavy metals in urine (antimony, total arsenic, arsenous acid, arsenic acid, arsenobetaine, arsenocholine, monomethylarsonic acid, dimethylarsonic acid, barium, cadmium, cobalt, cesium, molybdenum, lead, thallium, tungsten, uranium: n = 17) and whole blood (lead, cadmium, total mercury: n = 3) were analyzed with inductively coupledplasma dynamic reaction cellmass spectrometry (ICPDRCMS). Detailed laboratory methods and quality control/quality assurance data available in the NHANES website (for example, the laboratory procedure manual for heavy metals in urine in NHANES 2003–2004 available at https://wwwn.cdc.gov/nchs/data/nhanes/20032004/labmethods/l06_c_met_hm.pdf). Platinum and beryllium in urine were excluded because more than 95% observations were below limits of detection (LODs). In NHANES, concentrations below LOD were imputed with a value equal to LOD/√2. Although there is some concern in this commonly used substitution method [27], we did not conduct other alternative methods to avoid too complicated modeling and because statistical approaches for mixtures do not support such modeling.
Assessment for outcomes and covariates
Serum GGT was assayed with Beckman Synchron LX20 or Beckman UniCel DxC800 Synchron via an enzymatic rate method. Blood pressure was measured up to four consecutive times by certified examiners with standardized protocols. We calculated means of systolic and diastolic blood pressure (SBP/DBP) by averaging up to three measures after disregarding the first reading. Hypertension was defined as SBP/DBP ≥ 140/90 mmHg, selfreported physician diagnosis of hypertension, or selfreported use of hypertension medication. Publicuse linked mortality data were available for NHANES through 2009–2010. These data provides mortality followup data from the date of survey participation through December 31, 2011. Therefore, only the first 4 cycles from 2003 to 2004 to 2009–2010 were used for mortality. Total mortality and causespecific mortalities (CVD (the 10th International Classification of Disease (ICD10): I00I09, I11, I13, I2051) and cancer (C00C97)) were linked to NHANES participants. Important covariates were chosen a priori and included age, sex, race/ethnicity (Mexican American, Other Hispanic, nonHispanic white, nonHispanic black, Other), education (<high school diploma, high school diploma, and >high school diploma), smoking status (never, former, current), BMI, and urinary creatinine. We selected education as an indicator of socioeconomic status because it is widely used and has less missing data than other proxies, such as household income or poverty income ratio. Urinary creatinine, an indicator of urine dilution [28, 29], was measured using either Jaffe reaction or an enzymatic method.
Statistical analysis
Although we used complex sampled NHANES data, we did not consider survey components (sampling weights, clusters, and strata) when constructing ERS because most statistical packages for implementing the ML methods do not allow such survey components. We used the survey package in R (version 3.3.1) in the final analyses of associating ERS and cardiovascular outcomes.
We calculated pairwise correlations among 20 metals and created a correlationmatrix heat map. We applied logarithmic transformation with base 10 to GGT and metal pollutants because the distributions of the raw values were highly skewed and the shapes of dose–response relationships were closer to be linear with logtransformation. We chose log with base 10 rather than natural log for easier interpretation of regression coefficients (i.e., one unit increase in a logtransformed variable is equivalent to a tenfold increase in its raw values). All analyses were conducted using R. A schematic representation of the data accumulation and analytic procedures is presented in Fig. 1.
Construction of ERS using statistical approaches for pollutant mixtures
We utilized adaptive elastic net, Bayesian Additive Regression Trees (BART), Bayesian Kernel Machine Regression (BKMR), and Super Learner (SL) to determine heavy metals contributing to ERS for GGT.

1)
Adaptive elastic net (Zou and Zhang, [7])
Elastic net is a hybrid approach that blends LASSO and ridge regression to overcome the limitation of LASSO on data with highly correlated variables [6]. LASSO and ridge regression are regularized regression techniques that include a penalty term to constrain the size of the estimated coefficients. LASSO shrinks coefficients towards exact zeroes and thus performs variable selection. With highly correlated variables as predictors, LASSO tends to select only one out of these correlated variables and ignore the others. Elastic net also executes variable selection, but it has the opportunity to select a group of nonzero collinear variables. Adaptive elastic net, proposed by Zou and Zhang [7], is an adaptive version of elastic net that not only deals with the collinearity problem over elastic net but satisfies the asymptotic normality assumption that allows us to conduct statistical inference and hypothesis testing by providing large sample standard errors (SEs) and pvalues. Adaptive weights ensure smaller coefficients are shrunk faster to zero, whereas larger coefficients are penalized less.
For subject i (i = 1, … , N), let Y _{ i } represent the continuous outcome GGT, \( {E}_i^j\ \left(j=1,\dots, p\right) \) be the jth given environmental pollutant, and Z _{ i } (k × 1) be the vector of covariates/confounders that the model adjusts for, i.e. age, BMI, urinary creatinine, gender, race/ethnicity, smoking status and education. Let X_{ i } denote all predictors including Z _{ i }, E _{ i } and E _{ i } × E _{ i } . Then the estimates from the adaptive elastic net method are defined by
where \( {\widehat{\omega}}_j \) represents the weight of jth metal pollutant, which can be determined by the beta coefficient from elastic net. These weights can allow coefficients of relatively less important variables (metals) to be shrunk to zero’s more efficiently. Optimal tuning parameters (λ_{1} and λ_{2}) were chosen based on 5fold crossvalidations to minimize prediction errors. In our application we chose not to conduct variable selection on Z _{ i } , and thus we did not penalize the coefficients associated with Z _{ i }.
We constructed ERS using the adaptive elastic net with an underlying model considering not only the main effects but all possible combinations of pairwise linear interactions (AENETI). For comparison purposes, we also ran adaptive elastic net with only main effects (AENETM), which estimate marginal associations between individual variables within a mixture and the outcome. However, in presence of complex nonlinear interaction among the predictors, it is hard to assess the marginal effect of a single pollutant without referencing to the value of the other pollutants. The ‘statistical’ interactions in AENETI imply departures from additive joint effects. ERS was computed as a weighted sum of the selected nonzero predictors from each model:
where \( \widehat{\beta_j} \) is the coefficient of environmental pollutant predictor j and \( \widehat{\beta_{kl}} \) is the coefficient of the interaction of environmental pollutant predictors k and l. Note that most coefficients are zero since these coefficients are shrunk by AENET. All models were adjusted for age, sex, race/ethnicity, education, smoking status, BMI, and urinary creatinine. The R package gcdnet (version 1.0.4) [30] was used to implement adaptive elastic net. The variables with nonzero coefficients were identified as noteworthy/important and their corresponding coefficient estimates and Pvalues were computed.

2)
Bayesian Additive Regression Trees (BART) (Chipman et al. [14])
BART is a Bayesian “sumoftrees” model for ensemble inference from a large number of trees [14]. First, we regress Y _{ i }∣Z _{ i } and E _{ i }∣Z _{ i }, let \( {Y}_i^{\ast } \) and \( {E}_i^{\ast } \) be the residuals obtained from these regression. Then consider the following model.
where f is an unknown function and \( {Y}_i^{\ast } \) is the residual of GGT adjusted for all the covariates Z _{ i }. \( {E}_i^{\ast } \) is the residual of all the metal levels adjusted for Z. We consider to model f(E ^{∗}) by a sum of m regression trees \( f\left({E}^{\ast}\right)\approx h\left({E}^{\ast}\right)\equiv {\sum}_{j=1}^m{g}_j\left({E}^{\ast}\right) \) where each g _{ j } denotes the jth regression tree.
A sumoftrees model is basically an additive model with multiple components. It not only estimates interaction effects, but also takes additive effects into account. Therefore parametric pairwise E x E interaction terms were not specifically included in the BART models (also not in BKMR and SL below). The key idea of BART is to implement the sumoftrees model with a prior that regularizes the fit by keeping the individual tree effects small. At each Markov chain Monte Carlo (MCMC) iteration, we produced a draw from the joint posterior (f, σ)∣(Y, E, Z) and iterated with Gibbs sampling until the convergence with a default choice of m = 200. BART was implemented with R package BayesTree (version 0.31.4) [31]. ERS using BART was constructed as a posterior predicted residual of GGT (we use the posterior mean) for subject i after removing the effect of Z by
The variable inclusion proportions were computed in the training set as a measure of the variable importance using the R package bartMachine (version 1.2.3) [32].

3)
Bayesian kernel machine regression (BKMR) (Bobb et al. [11])
The main idea of kernel machine regression (KMR) is to flexibly model the relationship between a large number of variables and a particular response variable. Our modeling framework is
where ε _{ i } ~ N(0, σ ^{2}) and h is a flexible function of metal pollutants E _{ i }, which is characterized through a kernel machine representation. Usually, Gaussian kernel is the default choice, which flexibly captures a wide range of underlying functional forms for h, and can be expressed as
Here E _{ i } and E _{ i′} represent vectors of predictors for two different individuals, and r _{ j } > 0 denotes the tuning parameter that control the smoothness of h as a function of the exposure E. Intuitively, the kernel function shrinks the predicted response of two individuals with similar exposure profiles toward each other. BKMR is based on KMR and conducts Bayesian inference for the model above [11]. Flat priors for the parameters were chosen in this study. Using MCMC methods, we iterated for 2000 times to ensure the convergence of the method and get the optimal parameter estimates (r, β). BKMR was implemented with R package bkmr (version 0.2.0) [33]. ERS using BKMR was constructed as a predicted value of GGT for subject i by
The posterior inclusion probabilities of each variable were computed in the training set as a measure of variable importance.

4)
Super Learner (SL) (van der Laan et al., [27])
The main idea of SL is to find the optimal prediction by using a combination of predictions from a collection of given algorithms to minimize crossvalidation risk [26]. R package SuperLearner (version 2.0.21) [34] was used to predict the outcome (here GGT) through a 10fold crossvalidation. The SL algorithm includes estimations from many candidate learners L. In our study, we utilized the following learners: tree based methods (R packages: bartMachine, caret, randomForest), regression based methods (glm, gam, step, glm.interaction), shrinkage (regularized regression) methods (ridge, glmnet), Bagging and Boosting methods (xgboost, ipredbagg) and others (nnls, SVM).
Suppose we consider M input algorithms/methods for constructing the ensemble predictor. For a given method m (m = 1, 2, 3, … , M), we fit a model of the form Y ^{∗} = ψ _{ m }(E ^{∗}), where Y ^{∗} is the residual of GGT adjusted for all the covariates Z and E ^{∗} is the residual of all the metal levels after regressing on Z. Each method provides a prediction of the form \( \widehat{Y_m^{\ast }}= \) \( {\widehat{\psi}}_m\left({E}^{\ast}\right) \). We restrict our attention to candidate ensemble predictors of the form
The set of weights corresponding to each method a = {a _{ m }, m = 1⋯, M} is now determined by a Vfold crossvalidation. For each of the v = 1 , ⋯ , V fold, the entire learning set of size n is divided into binary splits of training and validation set that are mutually exclusive. For the vth fold, we use the indicator (\( {\mathrm{B}}_n^v(i)=0 \)) to denote observations in the training set and \( \Big({\mathrm{B}}_n^v(i)=1 \)) to denote observations in the validation set. Let n ^{test}(v) denote the size of the test or validation sample in the vth fold. The vector a is obtained by estimating \( {\widehat{\psi}}_m \) on the training data but evaluating the risk on the test data and minimizing the Vfold crossvalidated prediction loss.
The final SL fit, i.e.,ERS_{ SL }is computed by combining \( \widehat{a} \) with \( {\widehat{\psi}}_m\left({E}^{\ast}\right) \), m = 1 , 2 , … , M:
There is no single consensus measure to evaluate variable importance in the SL. We therefore, computed a sensible metric in the training set as done in BART and BKMR. First we ran SL with all 20 metals and calculated the sum of squared errors (SSE). We then removed one metal, ran SL with the remaining 19 metals and calculated SSE [SSE(−i)]. We repeated this for each metal. The variable importance was computed as the difference between SSE(−i) and SSE divided by SSE [i.e., (SSE(−i) – SSE)/SSE]. For comparison purposes, we also compared individual algorithms within SL in terms of the model prediction performance.
Assessment of predictive power of ERS
To evaluate and compare the performance of 5 ERS’s (AENETI, AENETM, BART, BKMR, and SL), we randomly split the full data (all cycles combined) by a 1∶1 ratio: the first part (n = 4832) used for estimation/training and the second part (n = 4832) for validation/testing. We repeated this random split three times and constructed ERS each time but the results (i.e., selected predictors or variable importance) were consistent. We, therefore, report the results based on the first split data. Three metrics were computed: First, we used linear regression for GGT and fit ERS as a continuous variable and computed correlation coefficients between GGT and ERS and the predicted residual sums of squares (PRESS), a statistic measuring model goodness of fit. Mean squared error (MSE) in the training data and mean squared prediction error (MSPE) in the testing data were also calculated to assess prediction performance. Second, we dichotomized GGT at the 90th percentile (50 U/L), and conducted logistic regression analysis with this dichotomized outcome and continuous ERS as predictor. We used area under the receiver operating characteristic (ROC) curve or AUC to assess predictive ability of the ERS with these binary endpoints. Third, in order to assess risk stratification/discrimination power of the ERS, we further categorized ERS by its quintiles based on the distribution in the training set and conducted logistic regression with categorical ERS in the testing set. We computed the odds ratio (OR) for the highest quintile vs. the lowest quintile of ERS to measure the risk stratification properties of ERS.
Associations between ERS and cardiovascular endpoints
Next, we examined the associations between each ERS and cardiovascular endpoints, such as SBP, DBP, hypertension, and mortality (total and cardiovascular). Survey linear regression, survey logistic regression and survey Coxproportional hazard models were used for SBP/DBP, hypertension, and mortality, respectively. Age was used as the time scale in Coxproportional hazard models [35]. Using age as the timescale implies delayed entry with left truncation occurring at the age at inclusion. In this approach, the hazard function can be directly interpreted as the agespecific incident function. We also examined cancer mortality which is a noncardiovascular endpoint but related to oxidative stress as a potential biological mechanism. Separate Cox models were fit for each causespecific outcome and log(hazard ratio (HR)) was obtained from each Cox model. The same covariates used above were adjusted for all models. To standardize the distributions of different ERS’s, we computed zscores for each ERS by subtracting the mean of the corresponding ERS divided by its standard deviation (SD). We report β coefficients (95% confidence intervals (CIs)) for SBP/DBP, ORs for hypertension, and HRs for mortality outcomes for a oneunit increase in the zscore of ERS which is equivalent to a one SD increase in its original scale.
Results
Table 1 presents population characteristics for the whole and by each cycle. Mean (SD) of age was 49.2 (17.2) years and approximately 51% were female. Overall, the proportions of Mexican American, other Hispanic, NonHispanic white, NonHispanic black, and other race/ethnicity were 16.7%, 8.1%, 46.6%, 20.4%, and 8.2%, respectively. Starting in the cycle of 2007–2008, all Hispanics were oversampled, not just Mexican Americans, thus the proportion of other Hispanic increased since this cycle. Likewise, Asian Americans were oversampled starting in the cycle of 2011–2012. There were decreasing trends in the prevalence of current smokers and low education attainment (<high school diploma) over the study period. The means (SDs) of GGT, SBP and DBP were 21.7 (1.94) U/L, 123.2 (18.7) mm Hg, and 70.0 (12.0) mm Hg, respectively. The prevalence of hypertension was 36.9%. Over the mean of 4.5 years of followup, the mortality rates of all causes, CVD and cancer were 5.6%, 1.5% and 1.4%, respectively.
Table 2 shows geometric means (GMs) and geometric standard deviations (GSDs) of metals measured in either whole blood or urine overall and by each cycle. Most metals showed that GMs were higher than GSDs, suggesting skewness. There were decreasing trends in the populationlevel concentrations of lead in both blood and urine. Urinary arsenobetaine concentrations increased in the cycles of 2011–2012 and 2013–2014, which may result from the increased proportion of Asians who eat more fish (arsenobetaine is mainly from fish consumption). Figure 2 shows a heat map of Spearman correlations between metal biomarkers. There were moderate to high correlations between blood lead and blood cadmium (Spearman correlation (rho) = 0.35); between blood total mercury and urinary arsenobetain (rho = 0.47); and among arsenic species (monomethylarsonic acid (MMA), arsenous acid (As III), arsenic acid (As V), and arsenocholine: rho = 0.43 to 0.87). A group of urinary metals including total arsenic, dimethylarsonic acid (DMA), cadmium, lead, cobalt, thallium, cesium, barium, uranium, tungsten, and molybdenum also had modest to high correlations each other (rho = 0.22 to 0.82).
Figure 3 and Additional file 1: Table S2 shows the variable selection results for AENETM and AENETI. Nine predictors (main effects) were selected in AENETM and 24 predictors (13 main effects and 11 pairwise interactions) were selected in AENETI. In AENETI, important metals selected include lead (blood) (β = 0.31, p = 6.2 × 10^{−6}), cadmium (urine) (β = 0.22, p = 9.8 × 10^{−7}), DMA (β = 0.28, p = 5.9 × 10^{−5}), cobalt (β = −0.14, p = 1.2 × 10^{−8}), MMA (β = −0.10, p = 1.3 × 10^{−10}), and barium (β = 0.05, p = 7.9 × 10^{−10}) in terms of either a large magnitude of the association (β > 0.1) or a small pvalue (<1 × 10^{−5}). Interactions of cesiumlead (blood) (β = −0.15, p = 4.5 × 10^{−4}) and of mercuryAs V (β = −0.15, p = 6.1 × 10^{−2}) were also identified. In BART, urinary cadmium seems to contribute to prediction of ERS most (variable importance (the relative proportion of the contribution) = 9.49%) followed by cobalt (7.21%), DMA (7.14%), tungsten (7.13%), MMA (6.78%) and so on (Additional file 1: Table S3). In SL, As V had the highest variable importance measure (25.1%) followed by MMA (24.2%), tungsten (22.9%), total As (22.6%), DMA (22.6%), and cadmium (21.5%) (Additional file 1: Table S3). Combining the ranks of variable importance between BART and SL, cadmium, tungsten, MMA, DMA, As V, barium and cobalt were ranked as the top 7 metals (Additional file 1: Table S3). In BKMR, almost all metals had posterior inclusion probabilities (PIP) of 1 except arsenous acid (As III, PIP = 0.96) and arsenic acid (As V, PIP = 0.85) (data not shown).
The ERS’s from each statistical approach ranged from −0.22 to 0.25 for AENETM; −0.26 to 0.49 for AENETI; −0.22 to 0.45 for BART; −1.01 to 1.83 for BKMR; and −0.18 to 0.27 for SL (Table 3). ERS from BKMR had a wider range whereas ERS from SL had a narrower range than other approaches. Pairwise correlations among AENETI, BART and SL were relatively high (>0.6), whereas BKMR had weak correlations with other approach (<0.12) (Additional file 1: Figure S1).
For risk prediction performance for continuous GGT, AENETI outperformed other approaches in terms of correlation between GGT and ERS, PRESS, and MSPE in the testing set (Table 3). The correlation coefficients between GGT and ERS in the training set were 0.22 in AENETM and 0.24 in AENETI and similar and slightly improved correlation coefficients were obtained in the testing set. The correlation coefficient between GGT and ERS in BART was 0.35 in the training set and 0.20 in the testing set. BKMR and SL overfit GGT prediction in the training set (correlation coefficient = 0.82 in BKMR and 0.75 in SL). BKMR poorly predicted GGT in the testing set (correlation coefficient = 0.0) but SL performed reasonably in the testing set (correlation coefficient = 0.20). For dichotomous GGT in the testing set, AENETI and SL performed best (Table 3 and Additional file 1: Figure S2). The addition of their ERS modestly improved the AUC for GGT (AUC from 0.66 to 0.70 for both). Again, the same overfitting issue was observed for the ERS from BKMR in the training set.
Figure 4 shows ORs of having high GGT levels (50 U/L) comparing the highest vs. the lowest quintiles of ERS. After controlling for base covariates, ORs of having high GGT levels ranged from 1.69 (95% CI, 1.20 to 2.38) for BKMR, 3.42 (95% CI, 2.45 to 4.78) for BART, 3.62 (95% CI, 2.49 to 5.25) for AENETM, 3.73 (95% CI, 2.66 to 5.23) for SL, to 4.29 (95% CI, 2.97 to 6.18) for AENETI. These ORs were even stronger than those for individual metals. ORs comparing the highest vs. the lowest quintile of lead, cadmium, mercury and total arsenic were 1.71 (95% CI, 1.20 to 2.44), 1.17 (95% CI, 0.79 to 1.74), 1.29 (95% CI, 0.95 to 1.75), 1.54 (95% CI, 1.06 to 2.22), respectively. Furthermore, the ERS effect sizes (except ERS for BKMR) were stronger than those for sociodemographic factors or BMI. For example, OR for females vs males was 2.16 (95% CI 1.74, 2.67); OR for 10 kg/m^{2} increase in BMI was 1.44 (95 CI, 1.25, 1.65); and OR for <high school vs. college or higher was 1.76 (95% CI, 1.26, 2.46) (data not shown).
Table 4 presents the associations between ERS and various cardiovascular endpoints. Because the ERS prediction performance for BKMR was not good, we excluded the BKMRbased ERS in these analyses. All ERS’s showed significant associations with systolic and diastolic blood pressure and hypertension. ERS based on SL showed the largest associations: one SD increase in SLERS was associated with a 1.03 mmHg (95% CI, 0.57, 1.48) higher in SBP; a 1.61 mmHg (95% CI, 1.28, 1.95) higher in DBP. For hypertension, one SD increase in each ERS from AENETI, BART and SL were associated with ORs of 1.26 (95% CI, 1.15, 1.38), 1.17 (1.09, 1.25), and 1.30 (1.20, 1.40), respectively. For comparison, blood lead and blood cadmium were associated with ORs of 1.08 (0.99, 1.18) and 1.06 (0.98, 1.16). ERS’s showed nonsignificant positive associations with both total and CVD mortality. Positive associations with cancer mortality were observed for all ERS’s but those associations did not reach statistical significance except AENETM (HR = 1.50, 95% CI, 1.05, 2.15).
Discussion
There is growing interest in evaluating health effects of reallife environmental exposure as mixtures. Scientists have recognized the need to develop novel statistical approaches to predicting disease risk associated with exposure to pollutant mixtures [19]. The present study proposes an updated ERS [20] using crosscutting statistical approaches developed to handle statistical challenges in pollutant mixtures: high degrees of correlations among pollutants; pollutantpollutant interactions; and cumulative risk from pollutant mixtures [1]. Among the ERS’s constructed to predict GGT, an indicator of oxidative stress, AENETI, a regularized regressionbased variable selection method that considered both main effects and pairwise interactions, performed best. Despite their slight worse performance in prediction, BART, a nonparametric Bayesian regression approach that uses dimensionally adaptive random basis elements [14], and SL, an ensemble of multiple learners [26], in which evaluation of interaction effects are built, showed comparable prediction performance only with the main effect terms in their models. The observed associations between ERS and GGT were larger than those for known toxic metals assessed individually, suggesting that cumulative (combined) effects may be larger than each individual effect. We also found significant associations between ERS and DBP and hypertension, implying that individuals who have higher metal mixturesrelated oxidative stress may be at higher risk for high blood pressure.
Choice of statistical approaches is a critical step in research on multipollutants and mixtures. As listed in Introduction, a number of different statistical approaches have been developed to analyze high dimensional and correlated data. A workshop by NIEHS in 2015, entitled “Statistical Approaches for Assessing Health Effects of Environmental Chemical Mixtures in Epidemiology Studies”, reported that no one statistical approach performs better than another and a statistical approach should be chosen based on a specific scientific question and hypothesis related to pollutant mixtures [19]. Our primary goal and specific question when constructing ERS using 4 different statistical approaches was to develop better prediction models for early health effects on a common disease pathway (i.e., oxidative stress) while accounting for high degree correlations and potential interactions among metals. Then these prediction models are used to assess their risk stratification and discrimination power for predicting specific health endpoints. The statistical approaches we used and other approaches listed in Introduction are ‘agnostic’ and the selected predictors do not necessarily reflect causal agents or biological interactions [36]. However, when predictive modeling is the primary goal rather than causal explanatory modeling, causality of the selected predictors may be of less importance if highly correlated pollutants are examined [37]. Either causal predictors or their correlated proxies should have similar predictive power and in general, variables with less noise are likely to be selected [38]. Biologicallybased dose–response functions, such as physiologically based pharmacokinetic (PBPK) modeling, may better capture causal explanation but require more complex modeling that is hard to test and toxicological data that are often not available. Our predictive modeling can provide a good approximation of underlying causal relationships by capturing complex relationships and patterns among pollutants within mixtures [39].
In terms of choice of statistical models, BKMR was found to be the least “scalable” among the methods we considered. AENETI has the advantage of providing important variables and working in a parametric regression structure and works well under sparsity and largepsmalln situation. BART and BKMR are nonparametric methods based on kernel smoothing and recursive partitioning but have difficulty with a truly ultradimensional covariate space. SL is a highly flexible and powerful method that can incorporate any prediction generating tool as an ingredient and dataadaptively weigh the predictions. The limitation is that it does not provide measures of variable importance or carry out variable selection. All four methods can handle modest to strong correlation among predictors. While we have allowed our construction of ERS to be driven by statistical considerations, it may be worthwhile to group the pollutants by known biological mechanisms. Bayesian methods and groupedLASSO type methods have the potential of incorporating this information through a priori grouping and incorporation of priors. Contrasting datadriven methods with biologydriven methods and combining epidemiology with biology in an optimal way remain a relevant research topic.
AENET most likely provides biased coefficients because of shrinkage estimation. Some studies attempted to obtain unpenalized coefficients by refitting multipleexposure OLS regression models with selected variables post Enet selection [17]. In general, it is wellknown that running OLS post model selection with the selected variables, using the same dataset leads to high false discovery rates and overoptimistic results. Thus we refrain from using ERS that is constructed by fitting OLS post modelselection. Fitting OLS on a new independent dataset with the selected variables is a valid strategy. There are new variants of LASSO estimates, for example the debiased LASSO that can handle this issue better but are still in developing stages of implementation. With advances in postmodel selection theory, we would have better inferential tools under penalized variable selection methods like AENET [40].
One important approach we attempted is to use an oxidative stress marker as the outcome. A challenge for constructing the cumulative effects of multiple pollutant exposure like ERS is that they are diseasespecific. ERS constructed to predict CVD may not predict cancer risk well. ERS for common disease pathways, such as oxidative stress, inflammation, epigenetic modification, and endocrine disruption, may capture cumulative early biological effects and discriminate individuals who are at increased risk of manifesting various downstream clinical diseases. GGT has been suggested an oxidative stress marker [24, 25] and widely used in largescale epidemiologic research [41,42,43]. Markers that directly measure oxidative damages in vivo such as F2isoprostane [44, 45] may improve risk prediction and thus better discriminate metal mixture health risk. Our analyses of linking ERS for oxidative stress and clinical outcomes suggest that this cumulative risk score is a useful tool to predict the clinical outcomes related to oxidative stress. One limitation is that given its nonspecificity, the ERS in the current study may not capture disease risk for complex diseases as good as the ERS constructed for a specific disease. A more targeted ERS for a specific disease will need to be built with different weights.
Cardiovascular effects of heavy metals, especially lead, cadmium and arsenic, have long been acknowledged and examined in epidemiologic research. A recent expert review by the National Toxicology Program found sufficient evidence to conclude that lead exposure, even at low exposure (<10 μg/dL), is associated with elevated blood pressure and hypertension but the evidence is limited or inadequate for other cardiovascular endpoints [46]. For arsenic, several systematic reviews suggest there is evidence for a causal association between high longterm arsenic exposure and CVD endpoints; however, conclusive evidence is lacking for an association with lowtomoderate arsenic levels [47, 48]. For cadmium, chronic exposure may be an independent risk factor for CVD, but further studies with individual level exposure and standardized CVD outcome assessments and accounting for confounding by smoking are needed to establish a causal association [49]. For other nonessential metals, only a few epidemiologic studies were conducted in relation to CVD [50,51,52,53,54]. A recent systematic review concluded that the current evidence is insufficient to support the causality of the association between environmental metals and CVD because of the small number of studies [55].
Although individual metal effects were not our primary interest, the metals identified in the multipollutant models deserve further consideration. Our multipollutant models (AENETI, BART and SL) confirmed that those historically established metals (lead, cadmium and arsenic as methylated forms (MMA and DMA)) are important predictors. Interestingly, these metals turned out to have important pairwise interactions with other metals: for lead, DMA, cesium, molybdenum, and tungsten; for cadmium, cesium and uranium; for DMA, molybdenum and uranium. Cobalt was selected as the second most important variable in BART. Our AENET models suggested an inverse association between cobalt and GGT. This may not be surprising given that cobalt is an essential trace element and is part of vitamin B12 (cobalamin) which is vital to maintain human health [56]. The levels of cobalt typically found in the environment is not known to be harmful [57]. However, recent studies conducted using NHANES data found significant positive associations of cobalt with cardiovascular and cerebrovascular disease [50] and agerelated impaired mobility [58]. Agarwal et al. [50] also reported a significant positive association between cardiovascular and cerebrovascular disease and tungsten which was selected as a main effect as well as interaction with lead in AENETI and as the fourth most important variable in BART. Tungsten, commonly used in metal alloys [59], has also been positively associated with diabetes [60] and peripheral artery disease [54]. Barium, a silverywhite metal used for many different purposes in industry [61], was also selected in our multipollutant models. Urinary barium has been associated with higher insulin resistance [60], obesity [62], and lower thyroid hormone levels [63]. It should be noted that these findings are subject to reverse causality given the nature of crosssectional design and that phenotypes such as diabetes may increase metal excretions in urine [64].
The present study has numerous limitations. We examined only metal mixtures. In real world, we are exposed to many other environmental pollutants that may cause oxidative stress leading to CVD and other health consequences. Therefore, the associations found here are still subject to potential confounding by copollutants other than metals. Because biomonitoring data in NHANES were not measured in all participants (i.e., some pollutants were measured only in a subset (e.g., urinary metals were measured in one third of the participants in each cycle) and different classes of pollutants were measured in different subsets in order to reduce the burden of examinations), we were not able to include other classes of pollutants in this study. Although assessment of a large number of pollutants from different classes is costly and a challenging task, it is a prerequisite for evaluating the impact of pollutant mixtures on human health.
Relatedly, the environment used in our study is limited to chemical environmental pollutants. Nonchemical stressors such as behavioral and psychosocial factors also play an important role in chronic disease risk independently as well as in conjunction with chemical pollutants. Our approach was to treat these factors as confounders, i.e., to separate the effects of nonchemical stressors out of those of the chemical pollutant mixtures. ERS can be expanded to a broader range of the environment encompassing various nongenetic factors as well as chemicalnonchemical interactions depending on investigator’s points of view and research questions. As the concept of ‘exposome’, defined as the totality of exposures over the course of a lifetime [65, 66], is emerging, our proposed approach will be useful to integrate health risks from a wide range of environmental factors identified by novel statistical approaches beyond the conventional linear regressionbased modeling.
In AENET algorithms, we assumed linear (in fact, loglinear) exposureoutcome relationships and linear interactions. However, nonlinear, nonmonotonic dose–response relationships, for example, lowdose effects of endocrine disrupting chemicals [67], may exist. Due to such model misspecification, important variables may not be selected in AENET and not contribute to the corresponding ERS. It should be noted that the selected pairwise interactions in AENETI may reflect nonlinear dose–response relationships rather than biological interactions. In fact, in presence of nonlinearity, a complex linear model attempts to capture quadratic or higher order main effect terms by estimating a nonlinear interaction term [68].
In addition, for AENET and BKMR, our strategy has been to always include the covariates/confounders Z in the predictor space and not penalize coefficients associated with Z. For BART and SL we first regressed GGT on Z and each of the E’s on Z and used the residuals from these models as the input variables for outcome and predictors. We are aware that this does not reproduce the model YE, Z. This is driven by limitations in the implementation software. By adopting this approach, we also assume that the covariates have a linear relationship with GGT without any nonlinearity or interactions within the set of Z and ignore metal by covariate interaction terms of the form E*Z. Future extensions can consider this possibility and attempt to treating Z uniformly across the methods.
The primary goal of epidemiologic research is to identify causal factors and causal structures. Although a hypothesisdriven knowledgebased approach is a standard practice to build the causal structure in epidemiologic research, an agnostic datadriven machine learning approach may be useful when limited knowledge of the causal structure is available. Many environmental chemicals, especially emerging new toxicants, have little information on underlying biological mechanisms and toxicological and epidemiological evidence that preclude us from incorporating the causal structure to the analysis. Datadriven machine learning approaches used in the present study can help us identify causal structure among environmental chemicals within a mixture. Of note, it is of less interest in public health whether the selected chemicals are causal or markers of other causal factors when those chemicals are highly correlated because regulations targeted on one chemical would likely control the other chemicals. Another important role of constructing an ERS is to study effect modification of summary exposure by other factors (for example genetic factors) and provide risk stratification in the population. For example, ERS could be used as a summary measure in a followup geneenvironment interaction analyses or to study the effect of epigenetic changes on environmental exposure.
As mentioned earlier, we constructed an ERS for oxidative stress as a common pathway for multiple diseases to overcome the construction of multiple diseasespecific ERSs. However, complex diseases such as CVD involve multiple disease pathways, and therefore, our proposed ERS may capture only partial disease risk and can be less appropriate if the selected pathway is a minor one in disease etiology. Prior knowledge regarding underlying biological mechanisms is a key prerequisite to conduct the proposed work. Ideally, multiple ERS representing each plausible biological pathway could be constructed and adaptively weighted to provide a more comprehensive risk assessment for a complex chronic disease.
Each individual metal measured in either whole blood or urine has different halflives and therefore, may not necessarily capture the actual exposure concentrations relevant to the current disease risk. Most urinary metals used in this study have short halflives and thus their urinary concentrations reflect exposures happened over the past hours to days. Such metals may have not been detected to be associated with GGT in our analysis and could be missed in relation to other health endpoints if longterm cumulative exposure to low levels cause those health outcomes. Another source of exposure measurement error is LODs of all metal biomarkers. For simplicity, we used a conventional ad hoc method of imputing the concentrations below LOD with LOD/√2, but it is known to result in bias especially when the proportion of concentrations below LOD is high [27]. These exposure measurement errors are generally nondifferential and lead to bias towards the null if each exposure is evaluated with the outcome individually in single pollutant models. However, bias would be more complicated and even differential measurement errors may occur in multipollutant models when multiple pollutants have different degrees of measurement errors and those measurement errors are not independent because some of the effects of more poorly assessed pollutants may be transferred to the effect estimates of better assessed pollutants [69].
Conclusions
In summary, the current study suggests that ERS is a useful tool for characterizing cumulative risk from pollutant mixtures, with accounting for statistical challenges such as high degrees of correlations and pollutantpollutant interactions. ERS constructed for an intermediate marker like GGT is predictive of related disease endpoints. This new approach of multipollutant framework will help better understand the realworld health impacts of pollutant mixtures and facilitate risk stratification and targeted preventive intervention [20]. To overcome the methodological challenges discussed here and raised in the recent NIEHS workshop [19], collective work such as systematic assessment of a wide range of pollutants in well characterized cohorts and creation of a mixture data consortium will need to be done.
Abbreviations
 AENETI:

Adaptive elasticnet with main effects and pairwise interactions
 AENETM:

Adaptive elasticnet with only main effects
 As III:

Arsenous acid
 As V:

Arsenic acid
 AUC:

Area under the receiver operating characteristic
 BART:

Bayesian additive regression tree
 BKMR:

Bayesian kernel machine regression
 BMA:

Bayesian model averaging
 BMI:

Body mass index
 CART:

Classification and regression tree
 CI:

Confidence interval
 CVD:

Cardiovascular disease
 DBP:

Diastolic blood pressure
 DMA:

Dimethylarsonic acid
 ERS:

Environmental risk score
 GGT:

Gammaglutamyl transferase
 GM:

Geometric mean
 GSD:

Geometric standard deviation
 HR:

Hazard ratio
 ICD:

International Classification of Disease
 ICPDRCMS:

Inductively coupledplasma dynamic reaction cellmass spectrometry
 KMR:

Kernel machine regression
 LASSO:

Least absolute shrinkage and selection operator
 LOD:

Limit of detection
 MCMC:

Markov chain Monte Carlo
 MMA:

Monomethylarsonic acid
 MSE:

Mean squared error
 MSPE:

Mean squared prediction error
 NHANES:

National Health and Nutrition Examination Survey
 NIEHS:

National Institute of Environmental Health and Sciences
 OR:

Odds ratio
 PBPK:

Physiologically based pharmacokinetic
 PCA:

Principal component analysis
 PIP:

Posterior inclusion probabilities
 PLS:

Partial least squares
 PRESS:

Predicted residual sums of squares
 ROC:

Receiver operating characteristic
 SBP:

Systolic blood pressure
 SD:

Standard deviation
 SE:

Standard error
 SL:

Super Learner
 SSE:

Sum of squared errors
 WQS:

Weighted quantile sum
References
 1.
Braun JM, Gennings C, Hauser R, Webster TF. What Can Epidemiological Studies Tell Us about the Impact of Chemical Mixtures on Human Health? Environ Health Perspect. 2016;124(1):A6–9.
 2.
ChadeauHyam M, Campanella G, Jombart T, Bottolo L, Portengen L, Vineis P, Liquet B, Vermeulen RC. Deciphering the complex: methodological overview of statistical models to derive OMICSbased biomarkers. Environ Mol Mutagen. 2013;54(7):542–57.
 3.
Sun Z, Tao Y, Li S, Ferguson KK, Meeker JD, Park SK, Batterman SA, Mukherjee B. Statistical strategies for constructing health risk models with multiple pollutants and their interactions: possible choices and comparisons. Environ Health. 2013;12(1):85.
 4.
Billionnet C, Sherrill D, AnnesiMaesano I. Estimating the health effects of exposure to multipollutant mixture. Ann Epidemiol. 2012;22(2):126–41.
 5.
Tibshirani R. Regression Shrinkage and Selection via the Lasso. J R Stat Soc Ser B Methodol. 1996;58(1):267–88.
 6.
Zou H. The Adaptive Lasso and Its Oracle Properties. J Am Stat Assoc. 2006;101(476):1418–29.
 7.
Zou H, Zhang HH. On the Adaptive ElasticNet with a Diverging Number of Parameters. Ann Stat. 2009;37(4):1733–51.
 8.
Wold S, Ruhe A, Wold H, Dunn WJ. The Collinearity Problem in Linear Regression. The Partial Least Squares (PLS) Approach to Generalized Inverses. SIAM J Sci Stat Comput. 1984;5(3):735–43.
 9.
Carrico C, Gennings C, Wheeler DC, FactorLitvak P. Characterization of Weighted Quantile Sum Regression for Highly Correlated Data in a Risk Analysis Setting. J Agric Biol Environ Stat. 2015;20(1):100–20.
 10.
Madigan D, Raftery AE. Model selection and accounting for model uncertainty in graphical models using Occam’s window. J Am Stat Assoc. 1994;89(428):1535–46.
 11.
Bobb JF, Valeri L, Claus Henn B, Christiani DC, Wright RO, Mazumdar M, Godleski JJ, Coull BA. Bayesian kernel machine regression for estimating the health effects of multipollutant mixtures. Biostatistics. 2015;16(3):493–508.
 12.
Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and regression trees: Wadsworth Statistics/Probability. Boca Raton: Chapman and Hall/CRC; 1984.
 13.
Tin Kam H. The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell. 1998;20(8):832–44.
 14.
Chipman HA, George EI, McCulloch RE. BART: Bayesian additive regression trees. Ann Appl Stat. 2010;4(1):266–98.
 15.
Czarnota J, Gennings C, Wheeler DC. Assessment of weighted quantile sum regression for modeling chemical mixtures and cancer risk. Cancer Inform. 2015;14(Suppl 2):159–71.
 16.
Forns J, Mandal S, Iszatt N, Polder A, Thomsen C, Lyche JL, Stigum H, Vermeulen R, Eggesbo M. Novel application of statistical methods for analysis of multiple toxicants identifies DDT as a risk factor for early child behavioral problems. Environ Res. 2016;151:91–100.
 17.
Lenters V, Portengen L, RignellHydbom A, Jonsson BA, Lindh CH, Piersma AH, Toft G, Bonde JP, Heederik D, Rylander L, et al. Prenatal Phthalate, Perfluoroalkyl Acid, and Organochlorine Exposures and Term Birth Weight in Three Birth Cohorts: MultiPollutant Models Based on Elastic Net Regression. Environ Health Perspect. 2016;124(3):365–72.
 18.
Pang Y, Peng RD, Jones MR, Francesconi KA, Goessler W, Howard BV, Umans JG, Best LG, Guallar E, Post WS, et al. Metal mixtures in urban and rural populations in the US: The MultiEthnic Study of Atherosclerosis and the Strong Heart Study. Environ Res. 2016;147:356–64.
 19.
Taylor KW, Joubert BR, Braun JM, Dilworth C, Gennings C, Hauser R, Heindel JJ, Rider CV, Webster TF, Carlin DJ. Statistical Approaches for Assessing Health Effects of Environmental Chemical Mixtures in Epidemiology: Lessons from an Innovative Workshop. Environ Health Perspect. 2016;124(12):A227–9.
 20.
Park SK, Tao Y, Meeker JD, Harlow SD, Mukherjee B. Environmental Risk Score as a New Tool to Examine MultiPollutants in Epidemiologic Research: An Example from the NHANES Study Using Serum Lipid Levels. PLoS One. 2014;9(6):e98632.
 21.
Bhatnagar A. Environmental cardiology: studying mechanistic links between pollution and heart disease. Circ Res. 2006;99(7):692–705.
 22.
Solenkova NV, Newman JD, Berger JS, Thurston G, Hochman JS, Lamas GA. Metal pollutants and cardiovascular disease: mechanisms and consequences of exposure. Am Heart J. 2014;168(6):812–22.
 23.
Ercal N, GurerOrhan H, AykinBurns N. Toxic metals and oxidative stress part I: mechanisms involved in metalinduced oxidative damage. Curr Top Med Chem. 2001;1(6):529–39.
 24.
Koenig G, Seneff S. GammaGlutamyltransferase: A Predictive Biomarker of Cellular Antioxidant Inadequacy and Disease Risk. Dis Markers. 2015;2015:818570.
 25.
Lee DH, Blomhoff R, Jacobs DR Jr. Is serum gamma glutamyltransferase a marker of oxidative stress? Free Radic Res. 2004;38(6):535–9.
 26.
van der Laan MJ, Polley EC, Hubbard AE. Super Learner. Stat Appl Genet Mol Biol. 2007;6(1):article 25.
 27.
Nie L, Chu H, Liu C, Cole SR, Vexler A, Schisterman EF. Linear regression with an independent variable subject to a detection limit. Epidemiology. 2010;21(Suppl 4):S17–24.
 28.
Barr DB, Wilder LC, Caudill SP, Gonzalez AJ, Needham LL, Pirkle JL. Urinary creatinine concentrations in the U.S. population: implications for urinary biologic monitoring measurements. Environ Health Perspect. 2005;113(2):192–200.
 29.
O’Brien KM, Upson K, Cook NR, Weinberg CR. Environmental Chemicals in Urine and Blood: Improving Methods for Creatinine and Lipid Adjustment. Environ Health Perspect. 2016;124(2):220–7.
 30.
Yang Y, Zou H. Package ‘gcdnet’. 2015. https://cran.rproject.org/web/packages/gcdnet/gcdnet.pdf. 05/10/2017.
 31.
Chipman H, McCulloch R. Package ‘BayesTree’. 2016. https://cran.rproject.org/web/packages/BayesTree/BayesTree.pdf. 05/10/2017.
 32.
Kapelner A, Bleich J. Package ‘bartMachine’. 2016. https://cran.rproject.org/web/packages/bartMachine/bartMachine.pdf. 05/10/2017.
 33.
Bobb JF. Package ‘bkmr’. 2017. https://cran.rproject.org/web/packages/bkmr/bkmr.pdf. 05/10/2017.
 34.
Polley E, LeDell E, Kennedy C, Lendle S, van der Laan M. Package ‘SuperLearner’. 2016. https://cran.rproject.org/web/packages/SuperLearner/SuperLearner.pdf. 05/10/2017.
 35.
Thiebaut AC, Benichou J. Choice of timescale in Cox’s model analysis of epidemiologic cohort data: a simulation study. Stat Med. 2004;23(24):3803–20.
 36.
Khoury MJ, Wacholder S. Invited commentary: from genomewide association studies to geneenvironmentwide interaction studieschallenges and opportunities. Am J Epidemiol. 2009;169(2):227–30. discussion 234–225
 37.
Draper NR, Vannostrand RC. Ridge Regression and JamesStein Estimation  Review and Comments. Technometrics. 1979;21(4):451–66.
 38.
Yuan M, Lin Y. Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B Stat Methodol. 2006;68(1):49–67.
 39.
Shmueli G. To explain or to predict? Stat Sci. 2010;25(3):289–310.
 40.
Bühlmann P, van de Geer S. Statistics for HighDimensional Data: Methods, Theory and Applications. Heidelberg: Springer; 2011.
 41.
Bradley RD, Fitzpatrick AL, Jacobs DR Jr, Lee DH, Swords Jenny N, Herrington D. Associations between gammaglutamyltransferase (GGT) and biomarkers of atherosclerosis: the Multiethnic Study of Atherosclerosis (MESA). Atherosclerosis. 2014;233(2):387–93.
 42.
JanickiDeverts D, Cohen S, Matthews KA, Gross MD, Jacobs DR Jr. Socioeconomic status, antioxidant micronutrients, and correlates of oxidative damage: the Coronary Artery Risk Development in Young Adults (CARDIA) study. Psychosom Med. 2009;71(5):541–8.
 43.
Van Hemelrijck M, Jassem W, Walldius G, Fentiman IS, Hammar N, Lambe M, Garmo H, Jungner I, Holmberg L. Gammaglutamyltransferase and risk of cancer in a cohort of 545,460 persons  the Swedish AMORIS study. Eur J Cancer. 2011;47(13):2033–41.
 44.
Milne GL, Musiek ES, Morrow JD. F2isoprostanes as markers of oxidative stress in vivo: an overview. Biomarkers. 2005;10(Suppl 1):S10–23.
 45.
Roberts LJ, Morrow JD. Measurement of F(2)isoprostanes as an index of oxidative stress in vivo. Free Radic Biol Med. 2000;28(4):505–13.
 46.
NTP. NTP monograph on health effects of lowlevel lead. In: NTP Monogr. 2012/06/01 ed. Research Triangle Park: National Toxicology Program, National Institute of Environmental Health Sciences, U.S. Department of Health and Human Services; 2012. p. xiii. xv148.
 47.
Moon K, Guallar E, NavasAcien A. Arsenic exposure and cardiovascular disease: an updated systematic review. Curr Atheroscler Rep. 2012;14(6):542–55.
 48.
NavasAcien A, Sharrett AR, Silbergeld EK, Schwartz BS, Nachman KE, Burke TA, Guallar E. Arsenic exposure and cardiovascular disease: a systematic review of the epidemiologic evidence. Am J Epidemiol. 2005;162(11):1037–49.
 49.
TellezPlaza M, Jones MR, DominguezLucas A, Guallar E, NavasAcien A. Cadmium exposure and clinical cardiovascular disease: a systematic review. Curr Atheroscler Rep. 2013;15(10):356.
 50.
Agarwal S, Zaman T, Tuzcu EM, Kapadia SR. Heavy metals and cardiovascular disease: results from the National Health and Nutrition Examination Survey (NHANES) 1999–2006. Angiology. 2011;62(5):422–9.
 51.
Guo J, Su L, Zhao X, Xu Z, Chen G. Relationships between urinary antimony levels and both mortalities and prevalence of cancers and heart diseases in general US population, NHANES 1999–2010. Sci Total Environ. 2016;571:452–60.
 52.
Lind PM, Olsen L, Lind L. Circulating levels of metals are related to carotid atherosclerosis in elderly. Sci Total Environ. 2012;416:80–8.
 53.
Mendy A, Gasana J, Vieira ER. Urinary heavy metals and associated medical conditions in the US adult population. Int J Environ Health Res. 2012;22(2):105–18.
 54.
NavasAcien A, Silbergeld EK, Sharrett R, CalderonAranda E, Selvin E, Guallar E. Metals in urine and peripheral arterial disease. Environ Health Perspect. 2005;113(2):164–9.
 55.
Nigra AE, RuizHernandez A, Redon J, NavasAcien A, TellezPlaza M. Environmental Metals and Cardiovascular Disease in Adults: A Systematic Review Beyond Lead and Cadmium. Curr Environ Health Rep. 2016;3(4):416–33.
 56.
Lindsay D, Kerr W. Cobalt closeup. Nat Chem. 2011;3(6):494.
 57.
ATSDR. Toxicological profile for cobalt. Atlanta: Agency for Toxic Substances and Disease Registry (ATSDR), U.S. Department of Health and Human Services, Public Health Service; 2004.
 58.
Lang IA, Scarlett A, Guralnik JM, Depledge MH, Melzer D, Galloway TS. Agerelated impairments of mobility associated with cobalt and other heavy metals: data from NHANES 1999–2004. J Toxicol Environ Health A. 2009;72(6):402–9.
 59.
Keith LS, Wohlers DW, Moffett DB, Rosemond ZA. ATSDR evaluation of potential for human exposure to tungsten. Toxicol Ind Health. 2007;23(5–6):309–45.
 60.
Menke A, Guallar E, Cowie CC. Metals in Urine and Diabetes in U.S. Adults. Diabetes. 2016;65(1):164–71.
 61.
ATSDR. Toxicological profile for barium. Atlanta: Agency for Toxic Substances and Disease Registry (ATSDR), U.S. Department of Health and Human Services, Public Health Service; 2007.
 62.
Padilla MA, Elobeid M, Ruden DM, Allison DB. An examination of the association of selected toxic metals with total and central obesity indices: NHANES 99–02. Int J Environ Res Public Health. 2010;7(9):3332–47.
 63.
Yorita Christensen KL. Metals in blood and urine, and thyroid function among adults in the United States 2007–2008. Int J Hyg Environ Health. 2013;216(6):624–32.
 64.
Chaumont A, Nickmilder M, Dumont X, Lundh T, Skerfving S, Bernard A. Associations between proteins and heavy metals in urine at low environmental exposures: evidence of reverse causality. Toxicol Lett. 2012;210(3):345–52.
 65.
Wild CP. Complementing the genome with an “exposome”: the outstanding challenge of environmental exposure measurement in molecular epidemiology. Cancer Epidemiol Biomark Prev. 2005;14(8):1847–50.
 66.
Wild CP. The exposome: from concept to utility. Int J Epidemiol. 2012;41(1):24–32.
 67.
Rhomberg LR, Goodman JE. Lowdose effects and nonmonotonic doseresponses of endocrine disrupting chemicals: has the case been made? Regul Toxicol Pharmacol. 2012;64(1):130–3.
 68.
Sofer T, Cornelis MC, Kraft P, Tchetgen Tchetgen EJ. Control Function Assisted Ipw Estimation with a Secondary Outcome in Case–control Studies. Stat Sin. 2017;27(2):785–804.
 69.
Zeger SL, Thomas D, Dominici F, Samet JM, Schwartz J, Dockery D, Cohen A. Exposure measurement error in timeseries studies of air pollution: concepts and consequences. Environ Health Perspect. 2000;108(5):419–26.
Acknowledgements
Not Applicable.
Funding
This work was supported by grants from the National Institute of Environmental Health Sciences (NIEHS) R01ES026578, R01ES026964, R21ES020811 and P30ES017885, and by the Center for Disease Control and Prevention (CDC)/National Institute for Occupational Safety and Health (NIOSH) grant T42OH008455 and National Science Foundation grant NSF DMS 1406712.
Availability of data and materials
The datasets generated and/or analyzed during the current study are available at https://github.com/ZhangchenZhao/EnvironmentalRiskScore.git.
Author information
Affiliations
Contributions
SKP and BM conceived and designed the study. ZZ performed the statistical analysis. SKP and BM interpreted the results. All authors were major contributors in writing the manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Sung Kyun Park.
Ethics declarations
Ethics approval and consent to participate
NHANES is a publicly available data set and all participants in NHANES provide written informed consent, consistent with approval by the National Center for Health Statistics Institutional Review Board.
Consent for publication
Not Applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file
Additional file 1: Table S1.
Supplemental tables and figures for the construction of environmental risk score beyond standard linear models using machine learning methods. (DOCX 1620 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Park, S.K., Zhao, Z. & Mukherjee, B. Construction of environmental risk score beyond standard linear models using machine learning methods: application to metal mixtures, oxidative stress and cardiovascular disease in NHANES. Environ Health 16, 102 (2017). https://doi.org/10.1186/s1294001703109
Received:
Accepted:
Published:
Keywords
 Bayesian additive regression tree (BART)
 Bayesian kernel machine regression (BKMR)
 Cardiovascular disease
 Elasticnet
 Environmental risk score (ERS)
 Machine learning
 Metals
 Mixtures
 Multipollutants
 Super Learner
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate. Please note that comments may be removed without notice if they are flagged by another user or do not comply with our community guidelines.