
Air toxics and birth defects: a Bayesian hierarchical approach to evaluate multiple pollutants and spina bifida



While there is evidence that maternal exposure to benzene is associated with spina bifida in offspring, to our knowledge there have been no assessments to evaluate the role of multiple hazardous air pollutants (HAPs) simultaneously on the risk of this relatively common birth defect. In the current study, we evaluated the association between maternal exposure to HAPs identified by the United States Environmental Protection Agency (U.S. EPA) and spina bifida in offspring using hierarchical Bayesian modeling that includes Stochastic Search Variable Selection (SSVS).


The Texas Birth Defects Registry provided data on spina bifida cases delivered between 1999 and 2004. The control group was a random sample of unaffected live births, frequency matched to cases on year of birth. Census tract-level estimates of annual HAP levels were obtained from the U.S. EPA’s 1999 Assessment System for Population Exposure Nationwide. Using the distribution among controls, exposure was categorized as high exposure (>95th percentile), medium exposure (5th-95th percentile), and low exposure (<5th percentile, reference). We used hierarchical Bayesian logistic regression models with SSVS to evaluate the association between HAPs and spina bifida by computing an odds ratio (OR) for each HAP using the posterior mean, and a 95% credible interval (CI) using the 2.5th and 97.5th quantiles of the posterior samples. Based on previous assessments, any pollutant with a Bayes factor greater than 1 was selected for inclusion in a final model.


Twenty-five HAPs were selected in the final analysis to represent “bins” of highly correlated HAPs (ρ > 0.80). We identified two out of 25 HAPs with a Bayes factor greater than 1: quinoline (ORhigh = 2.06, 95% CI: 1.11-3.87, Bayes factor = 1.01) and trichloroethylene (ORmedium = 2.00, 95% CI: 1.14-3.61, Bayes factor = 3.79).


Overall there is evidence that quinoline and trichloroethylene may be significant contributors to the risk of spina bifida. Additionally, the use of Bayesian hierarchical models with SSVS is an alternative approach in the evaluation of multiple environmental pollutants on disease risk. This approach can be easily extended to environmental exposures, where novel approaches are needed in the context of multi-pollutant modeling.



Birth defects affect approximately 6% of births worldwide [1]. In the United States (U.S.), birth defects are the leading cause of pediatric hospitalizations [2], medical expenditures [3], and death in the first year of life [4]. Neural tube defects (NTDs), one of the most common groups of birth defects, are complex malformations of the central nervous system that result from failure of neural tube closure [1]. One of the most common NTDs is spina bifida. Infants with spina bifida experience both increased morbidity and mortality compared to their unaffected contemporaries [5, 6]. Although these defects are clinically significant, little is known about their etiology. However, there is growing evidence that these conditions are associated with maternal exposure to environmental toxicants [7].

The U.S. Clean Air Act of 1990 classified 188 environmental toxicants as air toxics or hazardous air pollutants (HAPs). In 1999, the United States Environmental Protection Agency (U.S. EPA) went on to identify 33 HAPs that present the greatest threat to public health [8]. Included in this list are: aromatic solvents (e.g., benzene), chlorinated solvents (e.g., methylene chloride) and metals (e.g., nickel compounds). HAPs are a particularly important group of environmental toxicants because: 1) they are known or suspected to cause a range of adverse health outcomes [9]; 2) their levels are increasing in communities throughout the U.S. [10–12]; and 3) there are currently no national air quality standards for HAPs, as there are for the criteria air pollutants (e.g., carbon monoxide and ozone) [13].

While there is evidence that maternal exposure to benzene is associated with spina bifida in offspring [14], to our knowledge there have been no assessments to evaluate the role of multiple HAPs simultaneously on the risk of spina bifida or other birth defects. When simultaneously evaluating multiple predictors for disease outcome, current methods focus on building multivariable models, rather than the evaluation of single exposures adjusting for known covariates [15]. Yet traditional stepwise methods for model selection using statistics computed at each step can lead to biased estimates [15]. Bayesian variable selection techniques, such as stochastic search methods, offer a solution to this problem. Specifically, stochastic search methods include model selection uncertainty in the model building process to provide more comprehensive information regarding important predictors [16–18]. These stochastic search methods, which can also be viewed as Bayesian hierarchical mixture models, can jointly model multiple factors while including estimates of uncertainty to balance power and false discovery control [18, 19]. Specifically, simulations have shown that priors can be selected such that the evidence of a correct association is higher for stochastic search methods compared to stepwise regression methods when selecting a model [18]. Stochastic search methods also perform well in situations with correlated predictors (r² ≈ 0.25-0.80) [17–20]. As a result, stochastic search variable selection methods have been successfully employed when investigating complex diseases, especially when assessing multiple genetic predictors [21, 22]. In the current study, we evaluated the association between maternal exposure to the 33 HAPs identified by the U.S. EPA and spina bifida in offspring using hierarchical Bayesian modeling that includes stochastic search.


Study population

The study population has been described previously [14]. Briefly, data on live births, stillbirths, and electively terminated fetuses with NTDs (including spina bifida) delivered between January 1, 1999 and December 31, 2004 were obtained from the Texas Birth Defects Registry (n = 1,108). The registry is a population-based, active surveillance system that has monitored births, fetal deaths, and terminations throughout the state since 1999. A stratified random sample of unaffected live births delivered in Texas between January 1, 1999 and December 31, 2004 was selected as the control group using a ratio of 4 controls to 1 case. Controls were frequency matched to cases by year of birth due to the decreasing birth prevalence of NTDs over time [23]. This yielded a group of 4,132 controls. The study protocol was reviewed and approved by the Institutional Review Boards of the Texas Department of State Health Services, The University of Texas Health Science Center at Houston, and Baylor College of Medicine.

Exposure assessment

Census tract-level estimates of ambient HAP concentrations were obtained from the U.S. EPA’s 1999 Assessment System for Population Exposure Nationwide (ASPEN) [24–26]. The methods used for ASPEN have been described fully elsewhere [25, 26]. Briefly, ASPEN is part of the National Air Toxic Assessment [12] and is based on the U.S. EPA’s Industrial Source Complex Long Term Model. It takes into account emissions data, rate, location, and height of pollutant release; meteorological conditions; and the reactive decay, deposition, and transformation of pollutants. Ambient air levels of HAPs are reported as annual concentrations in μg/m³ [26]. Residential HAP levels were estimated based on maternal address at delivery as reported on vital records for cases and controls. Addresses were geocoded and mapped to their respective census tracts by the Texas Department of State Health Services. Our data included mothers from 2,381 census tracts.


The following covariates were selected a priori [14, 27–31] as potential confounders and were obtained or calculated from vital records data: infant sex; year of birth; maternal race/ethnicity (non-Hispanic white, non-Hispanic black, Hispanic, or other); maternal birth place (U.S., Mexico, or other); maternal age (<20, 20–24, 25–29, 30–34, 35–39, or ≥40 years); maternal education (<high school, high school, or >high school); marital status (married or not married); parity (0, 1, 2, or ≥3); maternal smoking (no or yes); and season of conception (spring, summer, fall, or winter). Additionally, as the exposure assessment was based on census tract-level estimates, we opted to include a census tract-level estimate of socioeconomic status (percent of households below the poverty level), which was obtained from the U.S. Census 2000 Summary File 3. The percent of households in each census tract below the poverty level was categorized into quartiles (low, medium-low, medium-high, and high poverty level), based on the distribution among the controls.

Statistical analysis

Descriptive statistics included distributional characteristics of the 33 HAPs and frequency distributions for demographic variables stratified on case–control status. Differences in the distribution of categorical variables between cases and controls were assessed using chi-squared tests, with P < 0.05 considered statistically significant. Correlations between HAPs were determined using Spearman’s rank correlation. Because SSVS is most appropriate when variables are correlated (i.e., ρ = 0.25-0.80) but not highly correlated (i.e., ρ > 0.80), we grouped (or “binned”) HAPs with high correlation (i.e., ρ > 0.80) and selected pollutants to represent a given bin based on existing science and correlations with the other HAPs within the bin. Omitting variables based on correlation and existing scientific evidence is a reasonable approach to reduce multicollinearity [32]. To bin the HAPs, we used an algorithm based on correlation that is commonly used in genetic association studies [33]. Once the bins were defined, we selected HAPs within the bins either to best represent the bin (maximum correlation with other HAPs) or on the basis of a combination of highest correlation and existing evidence of association with birth defects.
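The binning step can be illustrated with a minimal sketch. It is not the algorithm of reference [33]; it assumes a simple greedy rule (seed each bin with the pollutant that has the most remaining partners above the threshold), and the function names and toy concentrations are hypothetical.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def greedy_bins(data, threshold=0.80):
    """Group variables whose pairwise Spearman rho exceeds the threshold.

    data maps a pollutant name to its list of tract-level concentrations.
    """
    names = list(data)
    linked = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if spearman(data[a], data[b]) > threshold:
            linked[a].add(b)
            linked[b].add(a)
    remaining = set(names)
    bins = []
    while remaining:
        # Seed with the variable that has the most unbinned partners;
        # ties are broken alphabetically (an arbitrary, assumed rule).
        seed = max(sorted(remaining), key=lambda n: len(linked[n] & remaining))
        members = {seed} | (linked[seed] & remaining)
        bins.append(sorted(members))
        remaining -= members
    return bins


# Hypothetical toy concentrations for three pollutants in six tracts.
toy = {
    "hap_a": [1, 2, 3, 4, 5, 6],
    "hap_b": [1, 2, 3, 4, 6, 5],
    "hap_c": [6, 1, 5, 2, 4, 3],
}
toy_bins = greedy_bins(toy)  # [['hap_a', 'hap_b'], ['hap_c']]
```

Choosing the representative of each bin is left out on purpose, since the text describes that choice as combining correlation with prior scientific evidence rather than a purely statistical rule.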

Two primary association analyses were conducted. First, we examined the association between maternal exposure to each HAP individually and spina bifida in offspring, adjusting for year of birth, maternal education, maternal race/ethnicity, maternal smoking, and census tract poverty status [14] using Bayesian hierarchical logistic regression. Second, we performed a multi-pollutant analysis using Bayesian hierarchical logistic regression combined with Stochastic Search Variable Selection (SSVS) to jointly investigate all HAPs while adjusting for the same covariates. The Bayesian hierarchical model can be interpreted as a mixed-effects logistic model as it provides a fixed effect for the association between maternal exposure to each HAP and spina bifida in offspring, as well as a random intercept to account for the within-group correlation resulting from the use of a census tract-level exposure assignment [34]. SSVS adds a coherent, data-driven probabilistic framework to search through the fixed effects and identify potentially important associations [16–18, 35]. Additionally, the simultaneous inference of multiple HAPs in a Bayesian framework is not affected by multiple comparisons in the same way as in a frequentist framework, and can easily be accommodated using an appropriate prior [36, 37]. For a comparison to SSVS, we also performed a standard hierarchical Bayesian model without selection, assessing the HAPs simultaneously.

We categorized the HAPs into three categories based on their distribution among the controls: high exposure (above the 95th percentile of controls), medium exposure (between the 5th and 95th percentiles of controls), and low exposure (below the 5th percentile, used as the reference) [14, 38]. As a sensitivity analysis to our categorization, the Bayesian analysis was repeated modeling each HAP as a continuous measure. Due to the large variation in concentrations across all HAPs, we centered and standardized each pollutant.
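The categorization and standardization can be sketched as follows. The percentile definition (linear interpolation), the handling of values exactly at a cutoff, the use of the sample standard deviation, and all function names are assumptions; the text does not specify them.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def exposure_cutoffs(control_levels, low_q=5, high_q=95):
    """Cutoffs come from the controls only, as described in the text."""
    ordered = sorted(control_levels)
    return percentile(ordered, low_q), percentile(ordered, high_q)


def categorize(level, cutoffs):
    """Return 'low' (reference), 'medium', or 'high' for one concentration."""
    low_cut, high_cut = cutoffs
    if level < low_cut:
        return "low"
    if level > high_cut:
        return "high"
    return "medium"  # values equal to a cutoff fall here (assumed)


def standardize(levels):
    """Center and scale to unit sample standard deviation."""
    n = len(levels)
    mean = sum(levels) / n
    sd = (sum((v - mean) ** 2 for v in levels) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in levels]


# Hypothetical control concentrations: 1, 2, ..., 101.
cutoffs = exposure_cutoffs(list(range(1, 102)))
labels = [categorize(x, cutoffs) for x in (3, 50, 100)]
```

The same cutoffs are then applied to cases and controls alike, so that the exposure categories are defined independently of case status.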

Prior Distributions: For the individual pollutant analysis, we used a hierarchical prior for the random intercept based on previously established methods [34]. Specifically, the random intercept was given a normal prior distribution with mean of 0, and the standard deviation component for the random intercept was given a uniform hyper prior on the range of 0 to 3. The priors for the covariate fixed effects were normally distributed with a mean of 0 and variance of 10. In the context of logistic regression, these priors are considered non-informative. In the multi-pollutant model with SSVS, we used the same priors for the random and fixed parameters for the covariates. For the parameters corresponding to the HAPs, we assumed a mixture prior for SSVS [18]. This mixture involves a normal distribution with mean 0 and variance 0.001 if the variable was not selected, and a normal prior with mean 0 and variance 10 if the variable was selected. Through selection of 0.001 as the variance for the prior when the variable was not selected, the null OR is defined as being in the interval from 0.97 to 1.03 with a 99% probability. In other words, we consider an OR in this interval to not be meaningfully different from a null association [19]. We set the prior probability of inclusion for each variable to 0.25, and sensitivity analyses were conducted using a prior probability of inclusion of 0.50. These settings for prior probabilities for inclusion have been shown to have a good balance between power and false positives [18]. Each covariate (including those that were categorized) had an independent mixture prior.
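For reference, the multi-pollutant model and priors described above can be written compactly. The notation is introduced here for illustration and is not the source's: i indexes births, j census tracts, z the covariates, x the medium- and high-exposure indicators for each HAP, and γ a latent inclusion indicator (the usual way of writing an SSVS mixture). The second argument of each normal distribution is a variance. The overall intercept α is an assumption, as the text does not state a prior for it.

```latex
\begin{align*}
y_{ij} &\sim \mathrm{Bernoulli}(p_{ij}), \\
\operatorname{logit}(p_{ij}) &= \alpha + u_j + \sum_{m} \delta_m z_{ijm} + \sum_{k} \beta_k x_{jk}, \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad \sigma_u \sim \mathrm{Uniform}(0, 3), \\
\delta_m &\sim \mathcal{N}(0, 10), \\
\beta_k \mid \gamma_k &\sim (1 - \gamma_k)\,\mathcal{N}(0, 0.001) + \gamma_k\,\mathcal{N}(0, 10), \\
\gamma_k &\sim \mathrm{Bernoulli}(0.25).
\end{align*}
```

In the sensitivity analysis the prior inclusion probability 0.25 is replaced by 0.50; in the single-pollutant models the mixture prior is not used.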

Model Estimation and Selection: We estimated the posterior distributions of the hierarchical Bayesian models using Markov chain Monte Carlo (MCMC) methods. For the single pollutant analyses, we computed an OR for each pollutant using the posterior mean, and a 95% credible interval (CI) using the 2.5th and 97.5th quantiles of the posterior samples [39]. For the multi-pollutant model using SSVS, we computed the marginal Bayes factor for each pollutant [40]. In brief, the marginal Bayes factor is the ratio of the posterior odds to the prior odds of inclusion and summarizes the evidence for selection of each variable, given the data. Therefore, any Bayes factor greater than 1 implies some evidence for inclusion in the model, with values much greater than 1 representing stronger evidence [41, 42]. We included those HAPs with a marginal Bayes factor greater than 1 in a joint model, computed the joint posterior distribution of the selected model through MCMC, and computed the OR and 95% CI for the joint model using the posterior mean and quantiles of the beta coefficients as described. For categorical covariates, if either high or medium were selected, we considered both as selected for final model estimation. For all MCMC computations, we simulated two chains with separate initial values, each consisting of 150,000 iterations. We discarded the first 50% of each chain as burn-in to allow the chain to converge to the posterior distribution. We assessed convergence through how well the posterior means for all parameters correlated between the two chains. We considered a correlation higher than 0.95 to indicate that the chains had sufficiently converged. Once convergence was determined to be adequate, we pooled the retained iterations to compute our estimates of the OR and 95% CI. All MCMC computations were performed using WinBUGS 1.4 [43], and posterior inference was performed using R (64 bit v. 3.0.2).
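A minimal sketch of these posterior summaries is given below. It is not the WinBUGS/R code used in the study. It assumes the Bayes factor is taken as posterior odds of inclusion over prior odds of inclusion, that the OR is the exponentiated posterior mean of the log-odds coefficient, and that quantiles use linear interpolation; all function names are hypothetical.

```python
import math


def marginal_bayes_factor(posterior_inclusion, prior_inclusion=0.25):
    """Posterior odds of inclusion divided by prior odds of inclusion."""
    if not 0 < prior_inclusion < 1:
        raise ValueError("prior_inclusion must be strictly between 0 and 1")
    if not 0 <= posterior_inclusion < 1:
        raise ValueError("posterior_inclusion must be in [0, 1)")
    posterior_odds = posterior_inclusion / (1 - posterior_inclusion)
    prior_odds = prior_inclusion / (1 - prior_inclusion)
    return posterior_odds / prior_odds


def pooled_after_burn_in(chains, burn_in_fraction=0.5):
    """Drop the first part of each chain and pool the retained draws."""
    pooled = []
    for chain in chains:
        pooled.extend(chain[int(len(chain) * burn_in_fraction):])
    return pooled


def quantile(sorted_values, q):
    """Linear-interpolation quantile (q in [0, 1]) of a pre-sorted list."""
    pos = (len(sorted_values) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def summarize_log_odds(draws):
    """OR from the posterior mean; 95% CI from the 2.5th/97.5th quantiles."""
    ordered = sorted(draws)
    mean_beta = sum(ordered) / len(ordered)
    return {
        "or": math.exp(mean_beta),
        "ci": (
            math.exp(quantile(ordered, 0.025)),
            math.exp(quantile(ordered, 0.975)),
        ),
    }
```

In this formulation the posterior inclusion probability would be estimated as the share of retained MCMC draws in which the pollutant's inclusion indicator equals 1. A posterior probability equal to the prior (0.25) gives a Bayes factor of exactly 1, which is why 1 serves as the selection threshold.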


The distributional characteristics of 32 of the 33 U.S. EPA-designated HAPs based on the 1999 ASPEN model are presented in Table 1. Coke oven emissions, which were included in the list of 33, were not estimated for Texas in the 1999 ASPEN model. There were four groups of highly correlated HAPs (Table 2). As noted, we identified one or two HAPs from each group to represent that “bin” of HAPs based on selection criteria used in genetic association studies for highly correlated single nucleotide polymorphisms [33]. Specifically, benzene and methylene chloride were selected to represent the highly correlated group consisting of acetaldehyde, acrolein, formaldehyde, benzene, and methylene chloride. 1,1,2,2-Tetrachloroethane was selected to represent the highly correlated group including ethylene dibromide, propylene dichloride, and 1,1,2,2-tetrachloroethane. Vinyl chloride was selected to represent the highly correlated group including ethylene dichloride and vinyl chloride. Diesel particulate matter was selected to represent the highly correlated group consisting of diesel particulate matter and nickel compounds. After applying these criteria, 25 HAPs remained in our analysis.

Table 1 Distributional characteristics of hazardous air pollutants (μg/m³) based on the 1999 U.S. EPA ASPEN Model, Texas
Table 2 Hazardous air pollutants with correlations ρ > 0.80, Texas, 1999

To minimize etiologic heterogeneity within the case group, cases with an associated chromosomal abnormality or other syndrome (n = 75), those with a closed defect (i.e., lipomyelomeningocele, n = 88), and those with anencephaly (n = 351) were excluded. Cases with missing geocoded maternal address were excluded (n = 61). After these exclusions, 533 spina bifida cases were available for analysis. Of the 4,132 controls, 437 were excluded due to missing geocoded maternal address. The final control group consisted of 3,695 unaffected births for analysis. The proportion of case and control mothers missing address information was similar (11.4% and 10.5%, respectively), and there were no significant differences on demographic factors between those with and without a maternal address at delivery. The characteristics of cases and controls are presented in Table 3. Mothers of spina bifida cases were more likely to be Hispanic and to have been born in Mexico compared to mothers of controls (p = 0.003 and p = 0.05, respectively). Additionally, mothers of cases were more likely to live in census tracts with higher poverty levels (p = 0.02). Cases and controls did not significantly differ on other demographic characteristics.

Table 3 Characteristics of spina bifida cases and controls, Texas, 1999-2004

When evaluating the association between the 32 HAPs and spina bifida in single-pollutant models, 14 of the 32 (44%) had 95% CIs excluding 1.0 for either the medium or high exposure categories (Additional file 1: Table S1). Based on the multi-pollutant analysis among the 25 HAPs, when computing the marginal Bayes factors from the SSVS posterior, we identified two with a Bayes factor greater than 1 (Table 4): quinoline (ORhigh = 2.06, 95% CI: 1.11-3.87, Bayes factor = 1.01); and trichloroethylene (ORmedium = 2.00, 95% CI: 1.14-3.61, Bayes factor = 3.79). These associations are stronger than those of the covariates (Additional file 1: Table S2), while the 95% CIs are of similar width. The unadjusted ORs overestimate these effects due to uncontrolled confounding (Additional file 1: Table S3). For comparison, the joint model without SSVS only identified the medium level of trichloroethylene as associated with spina bifida (OR = 5.72, 95% CI: 1.44-24.16, Additional file 1: Table S4). The sensitivity analysis using a prior probability of 0.50 yielded similar results (data not shown). Our analysis using HAPs on the continuous scale selected eight HAPs with a Bayes factor greater than 1; however, all of the 95% CIs included 1.0 (data not shown).

Table 4 Hazardous air pollutants associated with spina bifida identified using Stochastic Search Variable Selection (SSVS) with a Bayes factor greater than 1.00


To our knowledge, this is the first application of a Bayesian variable selection strategy to evaluate the role of multiple HAPs simultaneously on the risk of birth defects. Overall there is evidence that HAPs may be a significant contributor to the risk of spina bifida. Specifically, in single-pollutant models, a large proportion of HAPs (44%) were positively associated with spina bifida. Additionally, using a Bayesian hierarchical approach with SSVS as a multi-pollutant model, we found two HAPs that were associated with spina bifida: quinoline and trichloroethylene. Mothers who lived in census tracts with high quinoline levels or medium trichloroethylene levels were approximately two times as likely to have a child with spina bifida compared to mothers who lived in census tracts with relatively low levels. The effect estimate for mothers living in census tracts with high levels of trichloroethylene was smaller (OR = 1.32) in comparison to the effect estimate for medium levels (OR = 2.00). This inverted U-shaped dose–response relationship is common among toxicants that act as endocrine disruptors such as trichloroethylene [44–47].

The mechanism by which HAPs may lead to teratogenesis is unknown. However, certain HAPs (e.g., benzene, polycyclic aromatic hydrocarbons) are known to cross the placenta and have been found in cord blood at levels equal to or higher than maternal blood [48]. Potential mechanisms by which HAPs may influence the risk of spina bifida include genetic toxicity and oxidative stress. In fact, these mechanisms may interact to contribute to teratogenesis. Specifically, certain HAPs (e.g., polycyclic aromatic hydrocarbons) can lead to genetic toxicity by covalently binding to DNA. The resulting DNA adducts, if not repaired, are mutagenic, resulting in the disruption of the cell’s microenvironment, which leads to inhibition of important enzymes, cell death, and alteration of other cells [49]. If this occurs during the critical window of embryonic development, the complex cellular processes involved in development may be disturbed, leading to spina bifida. Several HAPs (e.g., benzene, toluene) can also form free radicals known as reactive oxygen species (ROS) [9], which may lead to oxidative stress. These ROS can cause DNA strand breakage or fragmentation leading to cell mutation [49]. The importance of oxidative stress as a mechanism of teratogenesis is suggested by several animal studies [50–55].

Quinoline is a coal tar constituent and is the major tar base in creosote [56]. In mouse models, maternal exposure to quinoline has been shown to induce skeletal and visceral malformations in offspring [57]. Other studies indicate quinoline may cross the placenta into the tissue of the developing fetus [56]. However, to our knowledge, there have been no studies evaluating the association between human maternal exposure to quinoline and the risk of spina bifida or other birth defects, suggesting more work is needed on the potential teratogenicity of this agent.

Most of the trichloroethylene used in the U.S. is released into the atmosphere from industrial degreasing operations [58]. While there is evidence from both animal and human studies that trichloroethylene is associated with birth defects, specifically congenital heart defects [59–61], there is ongoing debate over the teratogenicity of this pollutant [62]. An evaluation using data from Camp Lejeune, North Carolina indicated that mothers exposed to higher levels of trichloroethylene were 2.4 times (95% CI: 0.6-9.6) as likely to have offspring with NTDs compared to those exposed to lower levels [63]. While this association was not statistically significant, the strength of the association was similar to that in our assessment.

While maternal exposure to benzene was associated with spina bifida in the single-pollutant model, it was not selected as a final variable in the multi-pollutant model. The effect estimate for benzene from the single-pollutant Bayesian model for the highly exposed group (OR = 1.99) was similar to that from the previous assessment (OR = 2.30) [14]. The absence of benzene from the final model may be due to multiple factors including: 1) high correlation (ρ > 0.80) with several other HAPs and 2) the estimate of effect was not as strong as other HAPs in the final multi-pollutant model.

Our study must be considered in the light of certain limitations. One potential limitation is the use of modeled predictions of ambient air concentrations of HAPs (i.e., the ASPEN model), which may have resulted in exposure misclassification. However, there is no data source that sufficiently addresses this issue. For instance, personal monitoring is not available on a population scale, and outdoor monitoring in Texas is restricted to certain communities. Therefore, the use of ASPEN data is a cost-effective approach in assessing this important question. An additional potential limitation is the use of ASPEN data from 1999 for the entire study period. It is not recommended by the EPA to include ASPEN data from multiple years simultaneously in one assessment. However, HAP estimates from 1999 alone may be a suitable surrogate for other years because, while levels of HAPs are likely to change over time, the relative ranking of census tracts based on ambient levels of HAPs was unlikely to change during the study period [10, 64, 65]. Additionally, ASPEN data have been used in several population-based assessments of adverse health outcomes, including birth defects [14, 65–68]. Lastly, information on address at conception was unavailable, and, therefore, we were limited to basing the exposure assignment on maternal address at delivery. However, our previous work suggests that census tract-level exposure assessment is not significantly different when assessing HAP exposure using ASPEN data between the time of conception and delivery [69].

Another potential limitation is the use of an area-based (census tract-level) measure of HAP exposure. Using area-based measures of exposure always assumes some level of increased exposure misclassification, especially compared to individual-level measures. However, using census tract-level exposure information, as was used in this assessment, lessens the amount of potential exposure misclassification compared to using county-level information, which is commonly used in epidemiologic studies of environmental exposures [47, 70–72]. In addition, it is possible that the amount of exposure misclassification could vary by each HAP included in this assessment, potentially introducing additional exposure measurement error with a complex correlation structure, all of which could bias the effect estimates towards the null.

Another point to discuss is the correlation among HAPs in our data. We chose a correlation binning technique commonly employed in statistical genetics to reduce the correlation of HAPs to be below 0.80 while representing as much of the association information as possible in the HAPs for the model [33]. Other methods could have been used, such as binning pollutants on chemical properties, or perhaps targeted source. However, since the statistical model assesses association through correlations, defining bins of HAPs based on correlation seems to be most congruent with the statistical modeling approach. We added a scientific component in our dimension reduction. Instead of purely using statistics to define the representative of each bin, we chose the representative HAPs for a bin based on 1) previously reported associations as well as 2) the level of correlation of each HAP with the other HAPs in the bin. This approach allowed for the combination of previous scientific evidence as well as statistics to represent each bin.

Previous evidence suggests that SSVS may be prone to favor false negative results [18, 19]. Using a Bayes factor threshold of greater than 1 usually reduces the number of false negatives; however, even in the case of increased false negatives, on average, SSVS methods are more likely to generate correct associations compared to standard selection methods [18].

Strengths of this study include the use of a population-based birth defects registry that employs an active surveillance system to ascertain cases throughout the state of Texas. This should limit the potential for selection bias. Furthermore, the Texas Birth Defects Registry includes information on pregnancy terminations reducing potential bias due to the exclusion of these cases. An additional strength was the use of a relatively small (census tract-level) measure of exposure. Using larger geographic units to estimate exposure (e.g., counties) may not capture the spatial variability of HAPs [73].

An important aspect of this study was the Bayesian hierarchical approach for evaluating multiple pollutants while also accounting for the within-group correlation resulting from the use of a census tract-level exposure assignment through the random intercept [34]. Traditional models based on variable selection in a stepwise approach can lead to biased estimates [15]. Bayesian variable selection techniques (e.g., SSVS) offer an attractive alternative to multi-pollutant modeling. Specifically, SSVS includes model selection uncertainty in the model building process to provide more comprehensive information regarding important predictors [16–18]. In our assessment, the Bayesian hierarchical approach resulted in the selection of two HAPs in the final multivariable model; however, when modeling the association with spina bifida in the single-pollutant models, we detected 14 HAPs with statistically significant associations with spina bifida, some of which may be false positives. When compared with the traditional single pollutant models, the multivariable model reduces the number of detected pollutants from 14 to two.

In conclusion, we believe the use of Bayesian hierarchical models with SSVS provides a robust alternative in the evaluation of multiple environmental pollutants on disease risk as this approach allows the joint assessment of multiple factors while including estimates of uncertainty to balance power and false discovery control [18]. Bayesian methods have been reported to outperform conventional maximum-likelihood-estimation techniques for prediction and are useful in settings where multiple exposures are evaluated [36, 37]. Additionally, concerns about multiple comparisons can be eliminated in the simultaneous assessment of multiple HAPs within a Bayesian framework [36, 37]. Specifically, SSVS-type methods may be prone to favoring false negatives [18, 19], meaning that false positives due to multiple comparisons are not an issue. This approach has been used successfully when assessing the role of multiple genetic variants on complex diseases [18, 21, 22, 74], and can be easily extended to environmental exposures, where novel approaches are needed in the context of multi-pollutant modeling.



ASPEN: Assessment System for Population Exposure Nationwide

CI: Credible interval

HAPs: Hazardous air pollutants

OR: Odds ratio

NTDs: Neural tube defects

SSVS: Stochastic Search Variable Selection

U.S. EPA: United States Environmental Protection Agency


  1. Christianson A, Howson CP, Modell B: Global report on birth defects. March of Dimes 2006, 98.
  2. Khoury AJ, Summers L, Weisman CS: Characteristics of current hospital-sponsored and nonhospital birth centers. Matern Child Health J 1997, 1(2):89–99. 10.1023/A:1026270306793
  3. Waitzman NJ, Romano PS, Scheffler RM: Estimates of the economic costs of birth defects. Inquiry 1994, 31(2):188–205.
  4. Detrait ER, George TM, Etchevers HC, Gilbert JR, Vekemans M, Speer MC: Human neural tube defects: developmental biology, epidemiology, and genetics. Neurotoxicol Teratol 2005, 27(3):515–24. 10.1016/
  5. Wong LY, Paulozzi LJ: Survival of infants with spina bifida: a population study, 1979–94. Paediatr Perinat Epidemiol 2001, 15(4):374–8. 10.1046/j.1365-3016.2001.00371.x
  6. Mitchell LE, Adzick NS, Melchionne J, Pasquariello PS, Sutton LN, Whitehead AS: Spina bifida. Lancet 2004, 364(9448):1885–95. 10.1016/S0140-6736(04)17445-X
  7. Lupo PJ, Etheredge AJ, Agopian AJ, Mitchell LE: Epidemiology of neural tube defects. In Perinatal Epidemiology. Edited by: Sheiner E. Hauppauge, NY: Nova Science Publishing; 2010:411–38.
  8. List of the 33 Urban Air Toxics.
  9. About Air Toxics.
  10. Sexton K, Linder SH, Marko D, Bethel H, Lupo PJ: Comparative assessment of air pollution-related health risks in Houston. Environ Health Perspect 2007, 115(10):1388–93.
  11. Lupo PJ, Symanski E: A comparative analysis of modeled and monitored ambient hazardous air pollutants in Texas: a novel approach using concordance correlation. J Air Waste Manag Assoc 2009, 59(11):1278–86. 10.3155/1047-3289.59.11.1278
  12. Ozkaynak H, Palma T, Touma JS, Thurman J: Modeling population exposures to outdoor sources of hazardous air pollutants. J Expo Sci Environ Epidemiol 2008, 18(1):45–58. 10.1038/sj.jes.7500612
  13. 2002 National-Scale Air Toxics Assessment.
  14. Lupo PJ, Symanski E, Waller DK, Chan W, Langlois PH, Canfield MA, et al.: Maternal exposure to ambient levels of benzene and neural tube defects among offspring: Texas, 1999–2004. Environ Health Perspect 2011, 119(3):397–402.
  15. Rothman KJ, Greenland S, Lash TL: Modern epidemiology. 3rd edition. Philadelphia: Wolters Kluwer Health/Lippincott Williams & Wilkins; 2008.
  16. George EI: The variable selection problem. J Am Stat Assoc 2000, 95(452):1304–8. 10.1080/01621459.2000.10474336
  17. George EI, McCulloch RE: Variable selection via Gibbs sampling. J Am Stat Assoc 1993, 88(423):881–9. 10.1080/01621459.1993.10476353
  18. Swartz MD, Yu RK, Shete S: Finding factors influencing risk: Comparing Bayesian stochastic search and standard variable selection methods applied to logistic regression models of cases and controls. Stat Med 2008, 27(29):6158–74. 10.1002/sim.3434
  19. de Vocht F, Cherry N, Wakefield J: A Bayesian mixture modeling approach for assessing the effects of correlated exposures in case–control studies. J Expo Sci Environ Epidemiol 2012, 22(4):352–60. 10.1038/jes.2012.22
  20. Swartz MD, Kimmel M, Mueller P, Amos CI: Stochastic search gene suggestion: a Bayesian Hierarchical model for gene mapping. Biometrics 2006, 62(2):495–503. 10.1111/j.1541-0420.2005.00451.x
  21. Fridley BL: Bayesian variable and model selection methods for genetic association studies. Genet Epidemiol 2009, 33(1):27–37. 10.1002/gepi.20353
  22. Swartz MD, Thomas DC, Daw EW, Albers K, Charlesworth JC, Dyer TC, et al.: Model selection and Bayesian methods in statistical genetics: summary of group 11 contributions to Genetic Analysis Workshop 15. Genet Epidemiol 2007, 31(Suppl 1):S96–102.
  23. Canfield MA, Marengo L, Ramadhani TA, Suarez L, Brender JD, Scheuerle A: The prevalence and predictors of anencephaly and spina bifida in Texas. Paediatr Perinat Epidemiol 2009, 23(1):41–50. 10.1111/j.1365-3016.2008.00975.x
  24. U.S. Environmental Protection Agency: 1999 National-Scale Air Toxics Assessment: 1999 Data Tables.
  25. Rosenbaum AS, Axelrad DA, Woodruff TJ, Wei YH, Ligocki MP, Cohen JP: National estimates of outdoor air toxics concentrations. J Air Waste Manag Assoc 1999, 49(10):1138–52. 10.1080/10473289.1999.10463919
  26. The ASPEN Model.
  27. Hertz-Picciotto I, Jusko TA, Willman EJ, Baker RJ, Keller JA, Teplin SW, et al.: A cohort study of in utero polychlorinated biphenyl (PCB) exposures in relation to secondary sex ratio. Environ Health 2008, 7:37. 10.1186/1476-069X-7-37
  28. Linder SH, Marko D, Sexton K: Cumulative cancer risk from air pollution in Houston: disparities in risk burden and social disadvantage. Environ Sci Technol 2008, 42(12):4312–22. 10.1021/es072042u
  29. Woodruff TJ, Parker JD, Kyle AD, Schoendorf KC: Disparities in exposure to air pollution during pregnancy. Environ Health Perspect 2003, 111(7):942–6. 10.1289/ehp.5317
  30. Zhou Y, Levy JI: Factors influencing the spatial extent of mobile source air pollution impacts: a meta-analysis. BMC Public Health 2007, 7:89. 10.1186/1471-2458-7-89
  31. Zou B, Peng F, Wan N, Mamady K, Wilson GJ: Spatial cluster detection of air pollution exposure inequities across the United States. PLoS One 2014, 9(3):e91917. 10.1371/journal.pone.0091917
  32. Kutner M, Nachtsheim C, Neter J, Li W: Applied Linear Statistical Models. 5th edition. Boston, MA: McGraw-Hill; 2004.
  33. Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA: Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet 2004, 74(1):106–20. 10.1086/381000
  34. Gelman A, Hill J: Data analysis using regression and multilevel/hierarchical models. Cambridge; New York: Cambridge University Press; 2007.
  35. George EI, McCulloch RE: Approaches for Bayesian variable selection. Stat Sin 1997, 7(2):339–73.
  36. Kalkbrenner AE, Daniels JL, Chen JC, Poole C, Emch M, Morrissey J: Perinatal exposure to hazardous air pollutants and autism spectrum disorders at age 8. Epidemiology 2010, 21(5):631–41. 10.1097/EDE.0b013e3181e65d76
  37. Greenland S: Methods for epidemiologic analyses of multiple exposures: a review and comparative study of maximum-likelihood, preliminary-testing, and empirical-Bayes regression. Stat Med 1993, 12(8):717–36. 10.1002/sim.4780120802
  38. Lupo PJ, Lee LJ, Okcu MF, Bondy ML, Scheurer ME: An exploratory case-only analysis of gene-hazardous air pollutant interactions and the risk of childhood medulloblastoma. Pediatr Blood Cancer 2012, 59(4):605–10. 10.1002/pbc.24105
  39. Gelman A, Carlin JB, Stern HS, Rubin DB: Bayesian data analysis. 2nd edition. Boca Raton, Fla: Chapman & Hall/CRC; 2009.
  40. Swartz MD, Peterson CB, Lupo PJ, Wu X, Forman MR, Spitz MR, et al.: Investigating multiple candidate genes and nutrients in the folate metabolism pathway to detect genetic and nutritional risk factors for lung cancer. PLoS One 2013, 8(1):e53475. 10.1371/journal.pone.0053475
  41. Kass RE, Raftery AE: Bayes factors. J Am Stat Assoc 1995, 90(430):773–95. 10.1080/01621459.1995.10476572
  42. Dey T, Ishwaran H, Rao JS: An in-depth look at highest posterior model selection. Econometric Theory 2008, 24:377–403.
  43. Lunn DJ, Thomas A, Best N, Spiegelhalter D: WinBUGS – a Bayesian modelling framework: concepts, structure, and extensibility. Stat Comput 2000, 10:325–37. 10.1023/A:1008929526011
  44. Diamanti-Kandarakis E, Bourguignon JP, Giudice LC, Hauser R, Prins GS, Soto AM, et al.: Endocrine-disrupting chemicals: an Endocrine Society scientific statement. Endocr Rev 2009, 30(4):293–342. 10.1210/er.2009-0002
  45. vom Saal FS, Akingbemi BT, Belcher SM, Birnbaum LS, Crain DA, Eriksen M, et al.: Chapel Hill bisphenol A expert panel consensus statement: integration of mechanisms, effects in animals and potential to impact human health at current levels of exposure. Reprod Toxicol 2007, 24(2):131–8. 10.1016/j.reprotox.2007.07.005
  46. Endocrine Disruptor Screening Program (EDSP): Revised Second List of Chemicals for Tier 1 Screening.
  47. Agopian AJ, Langlois PH, Cai Y, Canfield MA, Lupo PJ: Maternal residential atrazine exposure and gastroschisis by maternal age. Matern Child Health J 2013, 17(10):1768–75. 10.1007/s10995-012-1196-3
  48. Toxicological Profile of Benzene.
  49. Casarett L, Doull J: Casarett & Doull's Toxicology: The Basic Science of Poisons. 6th edition. New York, NY: McGraw-Hill; 2001.
  50. Liu L, Wells PG: DNA oxidation as a potential molecular mechanism mediating drug-induced birth defects: phenytoin and structurally related teratogens initiate the formation of 8-hydroxy-2’-deoxyguanosine in vitro and in vivo in murine maternal hepatic and embryonic tissues. Free Radic Biol Med 1995, 19(5):639–48. 10.1016/0891-5849(95)00082-9
  51. Parman T, Wiley MJ, Wells PG: Free radical-mediated oxidative DNA damage in the mechanism of thalidomide teratogenicity. Nat Med 1999, 5(5):582–5. 10.1038/8466
  52. Wells PG, Kim PM, Laposa RR, Nicol CJ, Parman T, Winn LM: Oxidative damage in chemical teratogenesis. Mutat Res 1997, 396(1–2):65–78.
  53. Fantel AG: Reactive oxygen species in developmental toxicity: review and hypothesis. Teratology 1996, 53(3):196–217. 10.1002/(SICI)1096-9926(199603)53:3<196::AID-TERA7>3.0.CO;2-2
  54. Morriss GM, New DA: Effect of oxygen concentration on morphogenesis of cranial neural folds and neural crest in cultured rat embryos. J Embryol Exp Morphol 1979, 54:17–35.
  55. Hansen JM: Oxidative stress as a mechanism of teratogenesis. Birth Defects Res C Embryol Today 2006, 78(4):293–307. 10.1002/bdrc.20085
  56. Toxicological Profile for Wood Creosote, Coal Tar Creosote, Coal Tar, Coal Tar Pitch, and Coal Tar Pitch Volatiles.
  57. Iyer PR, Irvin TR, Martin JE: Developmental effects of petroleum creosote on mice following oral exposure. Res Commun Chem Pathol Pharmacol 1993, 82(3):371–4.
  58. Trichloroethylene Hazard Summary.
  59. Carney EW, Thorsrud BA, Dugard PH, Zablotny CL: Developmental toxicity studies in Crl:CD (SD) rats following inhalation exposure to trichloroethylene and perchloroethylene. Birth Defects Res B Dev Reprod Toxicol 2006, 77(5):405–12. 10.1002/bdrb.20091
  60. Rufer ES, Hacker TA, Flentke GR, Drake VJ, Brody MJ, Lough J, et al.: Altered cardiac function and ventricular septal defect in avian embryos exposed to low-dose trichloroethylene. Toxicol Sci 2010, 113(2):444–52. 10.1093/toxsci/kfp269
  61. Yauck JS, Malloy ME, Blair K, Simpson PM, McCarver DG: Proximity of residence to trichloroethylene-emitting sites and increased risk of offspring congenital heart defects among older women. Birth Defects Res A Clin Mol Teratol 2004, 70(10):808–14. 10.1002/bdra.20060
  62. Bukowski J: Critical review of the epidemiologic literature regarding the association between congenital heart defects and exposure to trichloroethylene. Crit Rev Toxicol 2014, 44(7):581–9. 10.3109/10408444.2014.910755
  63. Ruckart PZ, Bove FJ, Maslia M: Evaluation of exposure to contaminated drinking water and specific birth defects and childhood cancers at Marine Corps Base Camp Lejeune, North Carolina: a case–control study. Environ Health 2013, 12:104. 10.1186/1476-069X-12-104
  64. Grant RL, Leopold V, McCant D, Honeycutt M: Spatial and temporal trend evaluation of ambient concentrations of 1,3-butadiene and chloroprene in Texas. Chem Biol Interact 2007, 166(1–3):44–51.
  65. Whitworth KW, Symanski E, Coker AL: Childhood lymphohematopoietic cancer incidence and hazardous air pollutants in southeast Texas, 1995–2004. Environ Health Perspect 2008, 116(11):1576–80. 10.1289/ehp.11593
  66. Reynolds P, Von Behren J, Gunier RB, Goldberg DE, Hertz A, Smith DF: Childhood cancer incidence rates and hazardous air pollutants in California: an exploratory analysis. Environ Health Perspect 2003, 111(4):663–8.
  67. Windham GC, Zhang L, Gunier R, Croen LA, Grether JK: Autism spectrum disorders in relation to distribution of hazardous air pollutants in the San Francisco Bay Area. Environ Health Perspect 2006, 114(9):1438–44. 10.1289/ehp.9120
  68. Ramakrishnan A, Lupo PJ, Agopian AJ, Linder SH, Stock TH, Langlois PH, et al.: Evaluating the effects of maternal exposure to benzene, toluene, ethyl benzene, and xylene on oral clefts among offspring in Texas: 1999–2008. Birth Defects Res A Clin Mol Teratol 2013, 97(8):532–7. 10.1002/bdra.23139
  69. Lupo PJ, Symanski E, Chan W, Mitchell LE, Waller DK, Canfield MA, et al.: Differences in exposure assignment between conception and delivery: the impact of maternal mobility. Paediatr Perinat Epidemiol 2010, 24(2):200–8. 10.1111/j.1365-3016.2010.01096.x
  70. Agopian AJ, Lupo PJ, Canfield MA, Langlois PH: Case–control study of maternal residential atrazine exposure and male genital malformations. Am J Med Genet A 2013, 161A(5):977–82.
  71. Basu R, Ostro BD: A multicounty analysis identifying the populations vulnerable to mortality associated with high ambient temperature in California. Am J Epidemiol 2008, 168(6):632–7. 10.1093/aje/kwn170
  72. Pagaoa MA, Okcu MF, Bondy ML, Scheurer ME: Associations between vaccination and childhood cancers in Texas regions. J Pediatr 2011, 158(6):996–1002. 10.1016/j.jpeds.2010.11.054
  73. Pratt GC, Wu CY, Bock D, Adgate JL, Ramachandran G, Stock TH, et al.: Comparing air dispersion model predictions with measured concentrations of VOCs in urban communities. Environ Sci Technol 2004, 38(7):1949–59. 10.1021/es030638l
  74. Cao Y, Lupo PJ, Swartz MD, Nousome D, Scheurer ME: Using a Bayesian hierarchical model for identifying single nucleotide polymorphisms associated with childhood acute lymphoblastic leukemia risk in case-parent triads. PLoS One 2013, 8(12):e84658. 10.1371/journal.pone.0084658


This work was supported by the National Institute of Dental and Craniofacial Research (R03 DE02173901 awarded to PJL). This work was also supported by cooperative agreement U01DD000494 from the Centers for Disease Control to the Texas Department of State Health Services.

Author information



Corresponding author

Correspondence to Philip J Lupo.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

PJL, ES, and LEM conceived and designed the study. MDS, YC, and WC carried out the primary data analysis with the assistance of PJL. MDS, YC, and PJL drafted the initial manuscript. All authors contributed to revisions of the manuscript and approved the final manuscript.

Electronic supplementary material

Additional file 1: Table S1. Associations between estimated hazardous air pollutants and spina bifida using single-pollutant Bayesian hierarchical models. Table S2. Associations between each covariate included in the final joint model and spina bifida. Table S3. Associations of hazardous air pollutants selected in the final joint model using Stochastic Search Variable Selection (SSVS) and spina bifida: Unadjusted results. Table S4. Associations of hazardous air pollutants and spina bifida: Multivariable results without SSVS. (DOCX 30 KB)

About this article

Cite this article

Swartz, M.D., Cai, Y., Chan, W. et al. Air toxics and birth defects: a Bayesian hierarchical approach to evaluate multiple pollutants and spina bifida. Environ Health 14, 16 (2015).


  • Bayesian hierarchical models
  • Birth defects
  • Hazardous air pollutants
  • Maternal exposure
  • Multi-pollutant
  • Spina bifida