Table 3 Comparison of ERS distribution and risk prediction performance by different statistical approaches

From: Construction of environmental risk score beyond standard linear models using machine learning methods: application to metal mixtures, oxidative stress and cardiovascular disease in NHANES

| | Base Model^a | AENET-M | AENET-I | BART | BKMR | SL | Full Model^b |
|---|---|---|---|---|---|---|---|
| Distributions of ERS | | | | | | | |
| Training Set | | | | | | | |
| Mean (SD) | | 0.00 (0.05) | 0.00 (0.06) | 0.00 (0.06) | 0.00 (0.27) | 0.00 (0.06) | |
| Range | | (−0.18, 0.25) | (−0.26, 0.49) | (−0.22, 0.45) | (−1.01, 1.83) | (−0.18, 0.27) | |
| Testing Set | | | | | | | |
| Mean (SD) | | 0.00 (0.05) | 0.00 (0.06) | 0.00 (0.05) | 0.01 (0.08) | 0.00 (0.04) | |
| Range | | (−0.22, 0.19) | (−0.22, 0.35) | (−0.22, 0.32) | (−0.34, 0.66) | (−0.17, 0.24) | |
| Risk Prediction Performance | | | | | | | |
| Continuous GGT^c | | | | | | | |
| Training Set | | | | | | | |
| Correlation^d | | 0.22 | 0.24 | 0.35 | 0.82 | 0.75 | |
| MSE | 7.2E-02 | 7.0E-02 | 6.9E-02 | 6.4E-02 | 2.6E-04 | 3.6E-02 | 6.7E-02 |
| Testing Set | | | | | | | |
| Correlation^d | | 0.25 | 0.27 | 0.20 | 0.00 | 0.26 | |
| PRESS | 332.9 | 320.6 | 316.1 | 325.1 | 332.3 | 321.7 | 327.2 |
| MSPE | 6.9E-02 | 6.6E-02 | 6.5E-02 | 6.7E-02 | 6.9E-02 | 6.6E-02 | 6.8E-02 |
| Dichotomous GGT^e | | | | | | | |
| Training Set | | | | | | | |
| AUC | 0.67 | 0.70 | 0.71* | 0.75 | >0.99 | 0.92 | 0.73 |
| 95% CI | (0.64, 0.69) | (0.67, 0.72) | (0.68, 0.73) | (0.73, 0.78) | (0.99, 1.00) | (0.91, 0.93) | (0.70, 0.75) |
| Testing Set | | | | | | | |
| AUC | 0.66 | 0.69 | 0.70* | 0.69 | 0.66 | 0.70* | 0.68 |
| 95% CI | (0.64, 0.68) | (0.67, 0.71) | (0.68, 0.72) | (0.66, 0.71) | (0.64, 0.68) | (0.67, 0.72) | (0.66, 0.70) |
  1. AENET-M: adaptive elastic net for main effects; AENET-I: adaptive elastic net for main effects and pairwise interactions; BART: Bayesian Additive Regression Tree; BKMR: Bayesian Kernel Machine Regression; SL: Super Learner; GGT: gamma-glutamyl transferase; MSE: mean square error; PRESS: predicted residual sums of squares; MSPE: mean square prediction error; AUC: area under the receiver operating characteristic curve
  2. ^a Base model contains only covariates (age, sex, race/ethnicity, smoking status, education, body mass index, urinary creatinine)
  3. ^b Full model contains all covariates, main effects and all possible pairwise interactions of metals
  4. ^c GGT was logarithmically transformed. Mean (SD) of log(GGT) = 0.27 (0.21)
  5. ^d Correlation between GGT and ERS
  6. ^e GGT was dichotomized at the 90th percentile (50 IU/L)
  7. * P < 0.1, ** P < 0.05, *** P < 0.01. P-values were computed with permutation tests comparing each AUC with the AUC of the base model
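
The performance measures named in the footnotes can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the AUC is computed with the standard pairwise (Mann-Whitney) definition, and the permutation scheme shown (swapping the two models' scores within each subject) is only one plausible way to compare an AUC against the base model, since the table does not say which scheme was used. Swapping raw scores also assumes both models produce scores on a comparable scale, such as predicted probabilities.

```python
import random

def press(y_true, y_pred):
    """Predicted residual sum of squares: sum of squared prediction errors."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred))

def mspe(y_true, y_pred):
    """Mean square prediction error: PRESS divided by the number of observations."""
    return press(y_true, y_pred) / len(y_true)

def auc(labels, scores):
    """Probability that a random case outscores a random non-case (ties count 1/2)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_pvalue(labels, base_scores, new_scores, n_perm=2000, seed=0):
    """One-sided p-value for AUC(new) > AUC(base).

    Assumed scheme (not taken from the source): under the null hypothesis the
    two models are exchangeable, so their scores are swapped within each
    subject with probability 1/2 and the AUC difference is recomputed.
    """
    rng = random.Random(seed)
    observed = auc(labels, new_scores) - auc(labels, base_scores)
    count = 0
    for _ in range(n_perm):
        perm_base, perm_new = [], []
        for s0, s1 in zip(base_scores, new_scores):
            if rng.random() < 0.5:
                s0, s1 = s1, s0
            perm_base.append(s0)
            perm_new.append(s1)
        if auc(labels, perm_new) - auc(labels, perm_base) >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (count + 1) / (n_perm + 1)
```

If MSPE is PRESS divided by the test-set size, as in `mspe` above, the tabulated PRESS and MSPE values are consistent with a testing set of roughly 4,800 observations.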