A Method to screen U.S. environmental biomonitoring data for race/ethnicity and income-related disparity

Background Environmental biomonitoring data provide one way to examine race/ethnicity and income-related exposure disparity and identify potential environmental justice concerns. Methods We screened U.S. National Health and Nutrition Examination Survey (NHANES) 2001–2008 biomonitoring data for 228 chemicals for race/ethnicity and income-related disparity. We defined six subgroups by race/ethnicity—Mexican American, non-Hispanic black, non-Hispanic white—and income—Low Income: poverty income ratio (PIR) <2, High Income: PIR ≥ 2. We assessed disparity by comparing the central tendency (geometric mean [GM]) of the biomonitoring concentrations of each subgroup to that of the reference subgroup (non-Hispanic white/High Income), adjusting for multiple comparisons using the Holm-Bonferroni procedure. Results There were sufficient data to estimate at least one geometric mean ratio (GMR) for 108 chemicals; 37 had at least one GMR statistically different from one. There was evidence of potential environmental justice concern (GMR significantly >1) for 12 chemicals: cotinine; antimony; lead; thallium; 2,4- and 2,5-dichlorophenol; p,p’-dichlorodiphenyldichloroethylene; methyl and propyl paraben; and mono-ethyl, mono-isobutyl, and mono-n-butyl phthalate. There was also evidence of GMR significantly <1 for 25 chemicals (of which 17 were polychlorinated biphenyls). Conclusions Although many of our results were consistent with the U.S. literature, findings relevant to environmental justice were novel for dichlorophenols and some metals.


Background
Environmental justice (EJ) concerns can arise when racial/ethnic minorities or those with lower socioeconomic status (SES) experience greater exposures to environmental pollutants than the rest of the population. Demographic variables used to characterize SES can include income, education, or occupation. Many EJ studies have focused on disparities in exposure to ambient air pollutant levels. Studies on hazardous air pollutants have found higher cancer risks associated with lower SES, higher proportion of African Americans, and lower proportion of whites in a census tract [1]; higher level of racial segregation [2]; and higher proportion of Hispanics [3]. U.S. counties with the highest fine particulate matter (PM2.5) and ozone levels had higher percentages of people living in poverty and non-Hispanic black residents [4], and Hispanics and non-Hispanic blacks had higher exposures to PM2.5 components than whites [5].
Environmental biomonitoring (assessing exposure to pollutants/chemicals by measuring them or their metabolites in blood, urine, or other specimens) provides a complementary approach to examining potential disparities and identifying EJ concerns. Biomonitoring concentrations (i.e., biomarkers) reflect the amount of chemical entering the body from all sources (air, water, food, soil, dust, consumer products) via all exposure routes (ingestion, inhalation, dermal absorption) [6]. One chemical may be assessed in the body using several biomarkers (e.g., lead in blood and urine). Biomarkers are particularly informative when source- and route-specific data are limited. However, detailed studies are required to link biomarker concentrations back to environmental exposures for policy-setting purposes. Biomarkers also reflect how a given individual absorbs, distributes, metabolizes, and excretes the chemical (i.e., toxicokinetics), all of which may be influenced by genetic and epigenetic characteristics that could vary by race/ethnicity or SES [7,8]. Furthermore, the presence of an environmental chemical in an individual's blood or urine does not imply that this chemical causes disease [6].
To date, most detailed studies of race/ethnicity or income-related disparities using biomarker data have been hypothesis-driven, focusing on a few chemicals selected based on known or suspected exposure disparities and controlling for relevant covariates. This approach may miss important disparities in exposure to less studied chemicals. A screening-level analysis of a large number of biomarkers for differential exposure could identify additional candidates for detailed study of the potential magnitude, drivers, and public health relevance of any race/ethnicity or income-related disparities.
The U.S. Centers for Disease Control and Prevention (CDC) collects and tracks environmental biomonitoring data through the National Health and Nutrition Examination Survey (NHANES). The Fourth National Report on Human Exposure to Environmental Chemicals [6] examines concentrations of 212 chemicals in NHANES 1999-2004, providing means and select percentiles stratified by survey years, age group, sex, and race/ethnicity. (Tables were recently updated for 117 chemicals, and they incorporate 34 new chemicals from NHANES 2005-2010 [9].) However, the report does not statistically compare biomarkers across racial/ethnic subgroups, or consider income, an important EJ dimension.
One example of an exploratory assessment for a large number of chemicals is a recent study by Tyrrell et al. that investigated associations between income and levels of 179 chemicals in NHANES 2001-2010 [10]. The authors used linear regression modeling to test for significant associations between the poverty income ratio (PIR) and log-transformed biomarker concentrations, controlling for age, sex, race, and waist circumference. For chemicals with significant negative PIR associations in at least two NHANES cycles, Tyrrell et al. used structural equation modeling to explore the pathways through which income impacts the biomarker concentrations. However, Tyrrell et al. did not use a formal procedure to adjust for multiple testing, implying that some of their significant findings could be spurious.
To demonstrate a formal screening method, we analyzed all biomarkers in the NHANES 2001-2008 datasets for differences in concentration across U.S. population subgroups defined by race/ethnicity and income. We build upon the Fourth Exposure Report and on Tyrrell et al. [10] by: (1) modeling joint impacts of race/ethnicity and income; (2) testing for statistically significant evidence of disparity with proper adjustments for multiple comparisons; and (3) addressing measurements below the limit of detection (LOD) using variable-threshold censored regression. This screening method focuses on differences in mean biomarker concentrations among subgroups.

Data
NHANES collects nationally representative environmental biomonitoring data from approximately 2,500 participants in each two-year cycle [6]. Ethical approval for use of NHANES data that is freely available on the web is not required as it is anonymized. We analyzed data from 19 [11][12][13][14]. We aggregated chemicals into 10 groups: cotinine, halogenated aromatics, metals, polycyclic aromatic hydrocarbons (PAHs), polyfluoralkyl chemicals (PFCs), perchlorate, pesticides, phenols, phthalates, and volatile organic compounds (VOCs) [see Table A1 in Additional file 1]. Each chemical could be measured in different media and/or using different corrections; we defined these different measures as separate biomarkers. For example, the chemical lead (Pb) was measured in blood and urine, reported as both unadjusted and creatinine-corrected. Thus, there were three biomarkers associated with the chemical Pb. We analyzed a total of 410 biomarkers corresponding to 228 chemicals. We parallel the presentation of units in the Fourth Exposure Report [6]. We present urinary concentrations per volume of urine and per gram of creatinine. While creatinine correction should account for urine dilution in spot urine samples, creatinine levels can vary by age, sex, race, renal function, lean muscle mass, and red meat consumption [15]. Lipophilic chemicals (such as dioxins, furans, and polychlorinated biphenyls [PCBs]) are presented per gram of total lipid (reflecting the amount stored in body fat) as well as per whole weight of serum. Other chemicals measured in serum are presented per liter of serum. For each biomarker, we calculated the LOD by multiplying reported concentrations by √2 for observations flagged "below LOD." 
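The LOD back-calculation described above can be sketched as follows. This is a minimal illustration, assuming that each observation flagged "below LOD" carries a fill value equal to LOD/√2; the function name and arguments are hypothetical and not part of any NHANES tool.

```python
import math
from typing import Optional


def recover_lod(reported_value: float, below_lod: bool) -> Optional[float]:
    """Return the LOD implied by a below-LOD fill value.

    Assumption: observations flagged "below LOD" store LOD / sqrt(2),
    so multiplying by sqrt(2) recovers the LOD. For detected values the
    record itself does not reveal the LOD, so None is returned.
    """
    if not below_lod:
        return None
    return reported_value * math.sqrt(2)
```

Because the LOD can differ between individuals for some biomarkers, the recovery is applied record by record rather than once per biomarker.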
We computed the relevant summary measures for Mexican American, non-Hispanic black, and non-Hispanic white race/ethnicity categories available in NHANES, but not the other Hispanic or "other race" categories because their small sample sizes do not permit generating reliable estimates [16] and because of potential heterogeneity of exposure patterns in these subpopulations [17]. To categorize participants by income, we used the PIR reported by NHANES. PIR is a family's total income divided by the family size-specific poverty threshold income, which is published in the Federal Register by the U.S. Department of Health and Human Services. While some NHANES studies used a three-way PIR-based classification, e.g., poor (PIR < 1), near poor (1 ≤ PIR < 2), and not poor (PIR ≥ 2) [18,19], we found that a three-way PIR-based categorization often results in small subgroup sample sizes when combined with a three-way race/ethnicity-based categorization. Instead, we employed a pseudo-balanced two-way categorization (since the unweighted median PIR in our dataset was close to 2), defining "Low" Income (PIR < 2) and "High" Income (PIR ≥ 2) subgroups. A PIR threshold of 2 is used by some U.S. agencies as a qualifier for government assistance [20] and was also used to explore Vitamin D deficiency using NHANES data [21].
Thus, we classified individuals into six race/ethnicity and income subgroups, with non-Hispanic white/High Income serving as the reference subgroup. For each biomarker, we analyzed data for all participants with non-missing biomarker measurements and PIR. There were no individuals with missing race/ethnicity status in the NHANES datasets we examined. Depending on the biomarker, the final analytic sample included between 90% and 95% of all participants with non-missing biomarker measurements.
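The six-subgroup classification can be written down compactly. The sketch below is illustrative only: the category labels follow the text, while the function name and the use of None for excluded participants are assumptions made for the example.

```python
from typing import Optional

# Race/ethnicity categories retained in the analysis (labels as in the text).
RACE_CATEGORIES = {"Mexican American", "non-Hispanic black", "non-Hispanic white"}
REFERENCE_SUBGROUP = "non-Hispanic white/High Income"


def classify_subgroup(race_ethnicity: str, pir: Optional[float]) -> Optional[str]:
    """Assign a participant to one of the six race/ethnicity-income subgroups.

    Returns None for participants outside the three retained race/ethnicity
    categories or with a missing poverty income ratio (PIR).
    """
    if race_ethnicity not in RACE_CATEGORIES or pir is None:
        return None
    income = "Low Income" if pir < 2 else "High Income"  # PIR < 2 is Low Income
    return f"{race_ethnicity}/{income}"
```

A participant with PIR exactly 2 falls in the High Income subgroup, matching the PIR ≥ 2 definition.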

Analysis
Following CDC [6], we assumed that biomarker concentrations could be treated as lognormally distributed, and used the geometric mean (GM) as the measure of central tendency. Biomarker concentrations were censored by the LOD, which could be individual-specific for some biomarkers. While replacement of concentrations below the LOD by LOD/√2 has been employed [6], this type of substitution has been shown to generate biased estimates [22,23]. In our analysis, we accommodated the LOD censoring by estimating variable-threshold censored regression models [22,24].
Specifically, for each biomarker $b$ we evaluated the following pseudo-log-likelihood function $\ln L_b$:

$$\ln L_b = \sum_{i:\, c_{bi} \ge LOD_{bi}} w_{bi} \ln\left[\frac{1}{\sigma_{bi}}\,\phi\left(\frac{\ln c_{bi} - \mu_{bi}}{\sigma_{bi}}\right)\right] + \sum_{i:\, c_{bi} < LOD_{bi}} w_{bi} \ln \Phi\left(\frac{\ln LOD_{bi} - \mu_{bi}}{\sigma_{bi}}\right) \qquad (1)$$

where $c_{bi}$ is the concentration of biomarker $b$ measured in the $i$th individual; $LOD_{bi}$ is the LOD for that biomarker for the $i$th individual; $w_{bi}$ is the individual-specific survey weight; $\Phi(\cdot)$ is the standard normal cumulative distribution function and $\phi(\cdot)$ is the standard normal probability density function; and $\mu_{bi}$ and $\sigma_{bi}$ are the arithmetic mean and the arithmetic standard deviation of $\ln c_{bi}$ for the $i$th individual, respectively. We constrained $\mu_{bi}$ (and $\sigma_{bi}$) to be the same for participants in the same subgroup $s$, permitting estimation of subgroup-specific GMs and geometric standard deviations (GSDs). In other words, we sought to maximize the pseudo-log-likelihood function in equation (1) under simple linear equality constraints. If $\hat{\mu}_{bs}$ and $\hat{\sigma}_{bs}$ denote the estimates of $\mu_{bs}$ and $\sigma_{bs}$ for biomarker $b$ in subgroup $s$, then the estimated GM and GSD for this biomarker in this subgroup are $\exp(\hat{\mu}_{bs})$ and $\exp(\hat{\sigma}_{bs})$, respectively. We followed CDC's convention of not reporting GM estimates for subsamples with >40% of results below the LOD [6]. We estimated sampling variances of parameters in the constrained version of the model in equation (1) using the Taylor series method [25], which relies on results in Binder [26]. Methods described in [27] were used to generate point and range estimates for subpopulations of interest. Specifically, to accommodate the complex design and laboratory subsample weights in our subpopulation analysis, we employed Stata/SE 11.2 [28,29] <svy, subpop(if …): intreg> programming statements. When laboratory weights were not provided, we followed CDC/NCHS [25] and used the two-year examination weights. Weighted NHANES estimates are representative of the U.S. civilian, non-institutionalized population.
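To make the censored-regression idea concrete, the sketch below evaluates a weighted pseudo-log-likelihood for a single subgroup with common parameters mu and sigma. It assumes the standard censored-normal form on the log scale (a density term for detected values, a cumulative-probability term at the individual LOD for censored values); the function name and record layout are hypothetical, and it does not reproduce the survey-design variance estimation performed in Stata.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the censored term


def pseudo_log_likelihood(records, mu, sigma):
    """Weighted censored-lognormal pseudo-log-likelihood for one subgroup.

    records: iterable of (concentration, lod, weight) tuples.
    mu, sigma: mean and standard deviation of ln(concentration).
    A record with concentration < lod is treated as censored at its own LOD;
    its stored concentration is ignored.
    """
    total = 0.0
    for conc, lod, weight in records:
        if conc >= lod:
            # Detected: log of the normal density of ln(conc), scaled by 1/sigma.
            z = (math.log(conc) - mu) / sigma
            log_density = -0.5 * z * z - 0.5 * math.log(2 * math.pi) - math.log(sigma)
            total += weight * log_density
        else:
            # Censored: log-probability that ln(conc) lies below ln(LOD).
            z = (math.log(lod) - mu) / sigma
            total += weight * math.log(_STD_NORMAL.cdf(z))
    return total
```

Maximizing this function over mu and sigma (for example with a numerical optimizer) would give the subgroup estimates, and exp(mu) would then be the estimated GM. The sketch makes no attempt to guard against a censored term whose cumulative probability underflows to zero.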

Testing for disparity
We assessed potential race/ethnicity and income-related disparity at the center of each biomarker distribution using the following metric:

$$GMR_{bs} = \frac{GM_{bs}}{GM_{br}} \qquad (2)$$

where $GMR_{bs}$ is the ratio of the GM of biomarker $b$ in subgroup $s$ ($GM_{bs}$) with respect to the GM of biomarker $b$ in the reference subgroup $r$ ($GM_{br}$). Non-Hispanic white/High Income was the reference subgroup. Up to five GM comparisons could be made for each biomarker. A particular subgroup-specific $GM_{bs}$ is not different from the reference subgroup $GM_{br}$ when $GMR_{bs} = 1$. For each biomarker $b$ and subgroup $s$, we tested the null hypothesis that $GMR_{bs} = 1$ using two-sided tests because we had no a priori beliefs about directionality.
Because GMR involves a non-linear transformation of equation (1) parameters, whose estimators are t-distributed, the sampling distribution of the GMR estimator $\exp(\hat{\mu}_{bs} - \hat{\mu}_{br})$ is not known. Therefore, the tests were carried out in log-space, by evaluating the hypothesis $\mu_{bs} - \mu_{br} = 0$ rather than $GMR_{bs} = 1$. Along with the estimates $\hat{\mu}_{bs}$ and $\hat{\mu}_{br}$, Stata/SE 11.2 reports survey design-adjusted estimates of the relevant variances, $\hat{V}(\hat{\mu}_{bs})$ and $\hat{V}(\hat{\mu}_{br})$, and covariances, $\hat{C}(\hat{\mu}_{bs}, \hat{\mu}_{br})$. These were combined into the test statistic

$$q = \frac{\hat{\mu}_{bs} - \hat{\mu}_{br}}{\sqrt{\hat{V}(\hat{\mu}_{bs}) + \hat{V}(\hat{\mu}_{br}) - 2\,\hat{C}(\hat{\mu}_{bs}, \hat{\mu}_{br})}} \qquad (3)$$

Under the null hypothesis, the sampling distribution of $q$ is a central t-distribution with degrees of freedom determined by the NHANES survey design features. This distribution was used to derive p-values for each test. The confidence intervals for the GMRs were obtained by exponentiating the confidence intervals for $\mu_{bs} - \mu_{br}$.
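The log-space test can be illustrated with a short sketch. It assumes the test statistic is the estimated difference in log-scale means divided by its design-adjusted standard error, and it takes the t critical value as an input, because the appropriate degrees of freedom depend on the survey design. The function and argument names are hypothetical.

```python
import math


def gmr_test(mu_s, mu_r, var_s, var_r, cov_sr, t_crit):
    """Return (q, gmr, (ci_low, ci_high)) for one subgroup-vs-reference comparison.

    mu_s, mu_r: estimated log-scale means for the subgroup and the reference.
    var_s, var_r, cov_sr: design-adjusted variances and covariance of those estimates.
    t_crit: two-sided critical value of the t-distribution, supplied by the caller
            (assumption: obtained elsewhere from the design degrees of freedom).
    """
    diff = mu_s - mu_r
    se = math.sqrt(var_s + var_r - 2.0 * cov_sr)  # standard error of the difference
    q = diff / se                                  # test statistic for diff = 0
    gmr = math.exp(diff)                           # GM ratio on the original scale
    ci = (math.exp(diff - t_crit * se), math.exp(diff + t_crit * se))
    return q, gmr, ci
```

Exponentiating the interval endpoints, rather than building an interval around the GMR directly, keeps the interval positive and mirrors the log-space formulation of the test. Converting q to a p-value requires a t-distribution function, which the Python standard library does not provide, so that step is left out here.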
Our screening analysis involved multiple testing of the hypothesis $GMR_{bs} = 1$ for several subgroup-specific GMRs and a large number of biomarkers. A large number of false positives is expected with this many comparisons. Therefore, we capped the probability of encountering at least one false positive among all tests at 0.05 using the Holm-Bonferroni procedure [30]. That is, we controlled the family-wise error rate (FWER) at 5%, where the family of tests was the entire collection of comparisons. This allowed us to summarize the screening results for all biomarkers and subgroups together [31]. This approach follows best practices in biomedical research and conforms to the guidelines of the U.S. Food and Drug Administration, which recommends controlling the FWER in clinical trials [32]. Last, we qualitatively validated our statistically significant results by reviewing the published literature on those biomarkers for evidence of disparity.
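For readers unfamiliar with the step-down logic, the Holm-Bonferroni procedure can be sketched in a few lines. This is a generic illustration of the procedure, not the code used for the analysis; the function name is hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    The p-values are visited from smallest to largest. The k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k); testing stops at the
    first p-value that exceeds its threshold, and all remaining hypotheses
    are retained.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: no larger p-value can be rejected
    return reject
```

With four p-values 0.01, 0.04, 0.03, and 0.005 at alpha = 0.05, the thresholds are 0.0125, 0.0167, 0.025, and 0.05 in sorted order; the two smallest p-values pass and the third (0.03 > 0.025) stops the procedure, so only two hypotheses are rejected even though all four p-values are below 0.05.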

Results
Although we examined 228 chemicals, there were only 108 chemicals for which at least one GMR could be estimated. Among the 795 GM comparisons across subgroups and biomarkers, there were 37 chemicals with significant evidence of disparity: 12 chemicals with at least one GMR significantly >1, indicating potential EJ concerns, and 25 chemicals with at least one GMR significantly <1, indicating higher exposures in the reference subgroup (non-Hispanic white/High Income). Additional information on the overall GMR screening results at the comparison, biomarker, and chemical level is provided in Table A2 [see Additional file 1]. Figure 1 provides a visual overview of the results. A few broad patterns can be discerned. First, the predominance of grey indicates that many GMRs could not be calculated because of the large number of nondetectable concentrations. Second, the relatively small number of red and blue cells indicates that the GM concentrations in the subgroups were rarely significantly different from those of the reference subgroup for the biomarkers with computable GMRs. This could be due to the fact that the differences were not large or there was insufficient power to detect these differences. Third, there were instances where Mexican Americans, particularly low income, had significantly lower levels of biomarkers than the reference subgroup. Fourth, biomarker levels for low-income, non-Hispanic whites were generally similar to those for high-income, non-Hispanic whites (the reference subgroup). Finally, evidence of significant EJ disparity is generally seen in non-Hispanic blacks (low and high income) and low-income Mexican Americans.
Examining results at the chemical level, additional observations can be made. Pesticides, phthalates, and cotinine contained biomarkers for which all GMRs significantly different from one were also >1, indicating potential EJ concern. Conversely, halogenated aromatics (PCBs in this case), PFCs, and perchlorate included biomarkers for which GMRs significantly different from one were exclusively <1, indicating higher exposures in the reference subgroup. Mixed results (GMRs both significantly >1 and <1) were encountered among phenols and metals. No evidence of significant disparity was found for PAHs or VOCs. However, a large fraction of GMRs could not be estimated for VOCs, pesticides, or halogenated aromatics.
Table 1 presents information on the 12 chemicals corresponding to 31 GMRs significantly >1, indicating potential EJ concerns for the following chemical groups: cotinine, metals, pesticides, phenols, and phthalates. Of the 31 GMRs >1, there were 14 for the non-Hispanic black/Low Income, 10 for the non-Hispanic black/High Income, 5 for the Mexican American/Low Income, 1 for the Mexican American/High Income, and 1 for the non-Hispanic white/Low Income subgroups. Sample sizes were consistently smallest for the Mexican American/High Income subgroup. The GMRs in Table 1 range from 1.3 to 12, but should not be compared across biomarkers except with great caution because of the differences in variability of the concentration levels across biomarkers.
Table 2 presents information on the 25 chemicals corresponding to 55 GMRs significantly <1, indicating higher GMs in the non-Hispanic white/High Income reference subgroup. PCBs accounted for 17 of these 25. Of the 55 GMRs <1, most (41, all for PCB congeners) were among the halogenated aromatics, with others found among metals (7), perchlorate (2), PFCs (3), and phenols (2). For PCBs, many instances of GMR < 1 occurred for the Mexican American/Low Income subgroup.
Of the 12 chemicals our screening method identified as having higher concentrations in low-income or minority groups (Table 1), we found published evidence of EJ concern for cotinine, lead, p,p'-dichlorodiphenyldichloroethylene (DDE), methyl and propyl paraben, phthalates, and antimony (Sb), and no published evidence for thallium (Tl) or dichlorophenols.

Cotinine
We found income-related disparity in cotinine and other tobacco smoke biomarkers (e.g., Pb and Sb). There is an established literature on higher smoking rates in low-income U.S. subpopulations [33].

Lead
We found significantly higher blood and urine Pb among low-income, non-Hispanic blacks, despite the fact that blood and urine Pb have been found to be weakly correlated and that blood Pb is considered a more reliable biomarker than urine Pb [34,35]. Our finding agrees with Pirkle et al. [36], who found the covariates non-Hispanic black race and low income to be significantly positively associated with blood Pb across all age groups in a multiple regression analysis, using NHANES 1991-1994 data, and with results reported in Tyrrell et al. [10].

DDE
We found elevated serum p,p'-DDE-a ubiquitous, neurotoxic dichlorodiphenyltrichloroethane (DDT) metabolite-

Figure 1 Visual overview of GMR for NHANES environmental biomonitoring data for all subgroups. Each cell of the matrix summarizes the outcome of the geometric mean ratio (GMR) test performed. The key to the color codes is located under the matrix. The columns correspond to the five race/ethnicity and income subgroups (Mexican American/High Income; Mexican American/Low Income; non-Hispanic black/High Income; non-Hispanic black/Low Income; non-Hispanic white/Low Income) that are being compared to the reference subgroup (non-Hispanic white/High Income). The rows of the matrix correspond to the 410 studied biomarkers. The chemical groups to which these biomarkers belong (cotinine; halogenated aromatics; metals; polycyclic aromatic hydrocarbons, PAHs; polyfluoralkyl chemicals, PFCs; perchlorate; pesticides; phenols; phthalates; volatile organic compounds, VOCs) are indicated along the right edge of the matrix.

We also found methyl paraben to be elevated among low-income, non-Hispanic blacks. The methyl paraben result for high-income blacks was not sensitive to whether the measurements were creatinine-corrected, consistent with descriptive statistics reported by CDC for non-Hispanic blacks [9].

Phthalates
We found higher diethyl phthalate (urinary mono-ethyl phthalate) and dibutyl phthalate (urinary mono-isobutyl and mono-n-butyl phthalate) metabolites in low-income minority subgroups, with higher mono-ethyl phthalate also in high-income, non-Hispanic blacks. Race/ethnicity differences in exposure to these phthalate metabolites were previously documented [40], and evidence regarding income-related differences is conflicting. Higher exposures to summed urinary metabolites of low-molecular-weight phthalates were reported for minority and for lower-income children (ages 6-19) [41]; inverse associations between dibutyl phthalate metabolites and income (controlling for race) were also found in NHANES 2001-2010 [10]. Controlling for SES (an index including income, education, and food security), elevated urinary mono-ethyl phthalate and dibutyl phthalate metabolites were found in minority reproductive-age women, but SES itself was insignificant in the presence of minority status controls [42].

Sb and Tl
We found higher urinary Sb (uncorrected) in non-Hispanic blacks (low and high income) and higher urinary Tl (uncorrected) in low-income non-Hispanic blacks. Richter et al. [43] found higher urinary Sb in NHANES 1999-2004 non-smokers with environmental tobacco smoke exposure compared to non-smokers with no such exposure, but no difference by race/ethnicity. In contrast, they found lower urinary Tl in smokers versus non-smokers. Tyrrell et al. reported negative associations between Sb and income (controlling for race) [10], but no race-related differences.

Dichlorophenols
We found elevated 2,4- and 2,5-dichlorophenol (DCP) in non-Hispanic blacks (low and high income). Evidence of EJ concerns exists for the 2,5-DCP parent compound (1,4-dichlorobenzene), with elevated blood levels found in Mexican Americans and non-Hispanic blacks [44,45]. Urinary 2,5-DCP was found to be significantly lower in non-Hispanic white girls compared to non-Hispanic black girls participating in a breast cancer study [46].

Chemicals with higher concentrations in the reference subgroup
The screening method identified 25 chemicals with significantly higher biomarker levels in high-income, non-Hispanic whites (Table 2), with previously published evidence of disparity for most. We found lower serum levels of 17 PCB congeners in Mexican Americans than in high-income, non-Hispanic whites, consistent with other NHANES studies [47,48] and regional U.S. studies [49]. All of the congeners with significant differences, except PCB118 and PCB156, were non-dioxin like, and results were not sensitive to the lipid adjustment. We found lower total blood mercury (Hg) in low-income, non-Hispanic whites and Mexican Americans, consistent with studies reporting lower Hg levels among Mexican Americans [50,51] and inverse associations between Hg and income [10]. We found lower urinary perchlorate in non-Hispanic blacks (low-and high-income) versus high-income, non-Hispanic whites in , we did not locate any published evidence of race/ethnicity-related disparities in these metals.

Discussion
Utility and performance of the screening method
This analysis provides a formal method to screen for exposure disparities in NHANES environmental biomonitoring data across race/ethnicity and income. The screening method identified differential exposure at the mean for 59 of the 204 (29%) biomarkers examined, with some instances of potential EJ concern and others where the reference subgroup (non-Hispanic white/High Income) had higher exposures. Using the published literature as a qualitative validation tool, the method correctly identified five chemicals/chemical classes with published evidence of higher biomarker levels in low-income or minority groups (cotinine, lead, DDE, parabens, and phthalates), and five chemicals/chemical classes with higher levels in high-income whites (PCBs, Hg, perchlorate, PFOA/PFOS, and benzophenone-3). It also found differential exposures for seven chemicals (2,4- and 2,5-DCP, Tl, Sb, Ba, Co, Cs) for which no published evidence of differences by race/ethnicity or income exists. The screening method is an approach that users of NHANES biomonitoring data could employ to obtain new and robust insights into the nexus between chemical exposures and diverse populations.

Public health relevance of initial screening results
The main objective of this work was to develop an EJ screening method for the NHANES biomarker data. Because we used only one cycle of NHANES data to develop the method (the most recently available per chemical), our actual screening results should be viewed as preliminary. Furthermore, there are many other potentially important race/ethnicity disparities that we were unable to evaluate because the NHANES dataset contains sufficient sample sizes to reliably analyze only Mexican American, non-Hispanic white, and non-Hispanic black subgroups [16]. Nonetheless, we did find evidence, supported by the published literature, of EJ concern in biomarker levels of cotinine, Pb, DDE, parabens, and phthalates. While smoking is not typically viewed as an EJ issue, higher cotinine levels in low-income, non-Hispanic whites and blacks indicate higher smoking-related health burdens. This merits further investigation into factors driving higher smoking rates and secondhand smoke exposures, so that effective prevention strategies can be developed. We also found higher Pb in low-income blacks. An extensive literature points to indoor/housing-related factors (e.g., house dust, tobacco smoke, housing age/condition/geographic location) as important drivers of Pb exposure in the United States, with dietary, toxicokinetic, and genetic factors influencing biomarker differences [56]. With research demonstrating adverse effects at ever-decreasing Pb levels, including associations with cardiovascular outcomes [35,57], the public health impacts of Pb disparities are potentially large. We found higher DDE levels among Mexican Americans. Since prenatal p,p'-DDE exposure is associated with adverse neurodevelopmental outcomes [37], the public health impacts of this disparity could be significant. We found higher paraben levels among high-income, non-Hispanic blacks. Parabens are antimicrobial preservatives with weak estrogenic properties used in cosmetics, pharmaceuticals, and some processed foods [58]. Exposure differences are likely due to product use or diet, although indoor air and house dust may be important [59]. We found certain phthalate metabolites higher in low-income minority subgroups and high-income, non-Hispanic blacks. Phthalates are ubiquitous plasticizers, with diet and consumer products considered important exposure sources [6]. The human health implications of phthalate exposure are an active research area, with some suggestion of endocrine-disrupting effects [41,60].
We also found previously undocumented evidence of disparities in biomarker levels of Sb, Tl, and 2,4- and 2,5-DCP for non-Hispanic blacks. Sb and Tl are toxic metals used in a range of industrial processes. Anthropogenic sources include power plants (both), traffic emissions and brake dust (Sb), tobacco smoke (both), mining operations (Sb), cement factories and smelters (Tl), and waste sites (both) [61]. People are exposed to Sb primarily through food and to Tl through industrial processes [6]. Human health effects from Sb or Tl at low environmental doses are unknown [6]. 2,4-DCP is a metabolite of several herbicides, organophosphate, and organochlorine pesticides (including other chlorophenols), while 2,5-DCP is a metabolite of several organochlorine pesticides (including 1,4-dichlorobenzene, a deodorizer and moth repellent) [6,62,63]. They can also be used in water chlorination [64]. In terms of potential health effects, food allergy sensitization was more common in NHANES 2005-2006 participants with levels of urinary 2,4- and 2,5-DCP above the 75th percentile [64], and lower age of menarche was associated with 2,5-DCP and aggregated 2,4- and 2,5-DCP in NHANES 2003-2008 female participants 12-16 years of age [65].
We also found evidence, supported by the literature, of lower levels of certain chemicals in low-income and minority subgroups versus high-income non-Hispanic whites. PCB levels were lower in Mexican Americans, most likely due to differences in diet, the younger average age of Mexican Americans (34 years; 95% CI 31-36) versus whites (44 years; 95% CI 43-45), and the large fraction (0.56; 95% CI 0.42-0.66) of Mexican-born participants (who have been shown to have lower levels of PCB153 than U.S.-born Mexican Americans [48]) in the Mexican American subgroups in the NHANES 2003-2004 data. Hg levels were lower in low-income non-Hispanic whites and Mexican Americans, consistent with studies linking higher income with higher Hg intake through fish consumption [6,10,50,51]. We found lower perchlorate in non-Hispanic blacks, and lower PFOA and PFOS in the low-income minority subgroups. Perchlorate is a thyrotoxic natural and anthropogenic contaminant found in food (vegetables, milk) and drinking water, depending on location. PFOA and PFOS (phased out of U.S. production in 2002) are persistent manmade chemicals with a range of applications (e.g., waterproofing, protective coatings) and suspected health effects. Levels of benzophenone-3, a suspected endocrine-disrupting sunscreen agent used in cosmetics, sunscreens, and food packaging, were lower in high-income non-Hispanic blacks.
Last, we found previously undocumented evidence of lower Ba, Cs, and Co in non-Hispanic blacks compared to high-income non-Hispanic whites. Ba is a naturally occurring metal in food and drinking water, with industrial and medical applications [66]; disparities could indicate differences in diet, drinking water, or possibly access to colorectal screening. Americans are exposed to both stable (naturally occurring, and from forest fires, coal, and waste combustion; not considered a public health concern) and radioactive (from nuclear power plants, accidents, or weapon explosions) Cs isotopes through food, drinking water, and air; thus, disparities are likely due to differences in diet and geography [67]. Americans are also exposed to stable and radioactive Co isotopes through food, water, and air. Since Co is an essential micronutrient, exposure to typical environmental levels of stable Co is not considered harmful [68]. Urinary Cs and Co measurement methods do not distinguish between stable and radioactive species.

Limitations
We were unable to analyze 50% of the available NHANES biomarkers for disparity because the LOD censoring was often too high to yield a valid GM estimate. In the VOC chemical group this was true for 33 out of 39 biomarkers. However, the lack of information about race/ethnicity and income differences for these biomarkers should not be interpreted as the absence of such differences. With improvements in the sensitivity of analytical methods, LOD censoring should become less of a limitation.
For 71% of biomarkers, none of the estimated GMRs was significantly different from 1. This was the case for all 20 PAH biomarkers. The lack of significant findings for these biomarkers may be a consequence of insufficient statistical power; in other words, race/ethnicity and income differences may exist, but we were unable to detect them. Pooling biomarker data across several NHANES cycles would have increased sample sizes and, potentially, the precision of our estimates, but would not have altered our conclusions about the validity of the screening method itself.
To control the FWER in the family of 795 screening tests performed, we used the Holm-Bonferroni procedure. While this procedure was shown to be more powerful than the Bonferroni correction [30], it does not permit construction of simultaneous confidence intervals. Therefore, the confidence intervals for the GMRs reported in Table 1 and Table 2 are not adjusted for multiple comparisons. There are several other FWER control methods that have higher statistical power compared to the Holm-Bonferroni procedure. Adaptive Bonferroni methods require knowledge of (or assumptions about) data dependencies [69,70]. However, we could not infer the correlation structure across all NHANES biomarkers, because not all measurements were collected from the same individuals. Permutation-based methods, such as the MaxT test procedure [71], accommodate arbitrary dependency structures. However, they rely on the assumption that individual-level observations are exchangeable, which is difficult to justify for complex survey data. Further, the MaxT procedure did not considerably outperform the Holm-Bonferroni procedure in terms of statistical power in some simulation experiments [72,73]. Thus, we judged the Holm-Bonferroni procedure to be the most appropriate FWER control method for these data.
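The step-down logic of the Holm-Bonferroni procedure, and why it is at least as powerful as the plain Bonferroni correction, can be sketched in a few lines of Python. This is a minimal illustration only, not the code used in this analysis: the function names and the four p-values are hypothetical, and the sketch ignores the survey design used to obtain the p-values themselves.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for every p-value at or below alpha / m (single-step rule)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_bonferroni(p_values, alpha=0.05):
    """Holm step-down rule: compare the k-th smallest p-value (k = 0, 1, ...)
    with alpha / (m - k) and stop at the first test that fails."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Hypothetical p-values for a small family of four screening tests.
example_p = [0.001, 0.015, 0.02, 0.04]
bonferroni_decisions = bonferroni(example_p)
holm_decisions = holm_bonferroni(example_p)
```

In this hypothetical family, the single-step rule compares every p-value with 0.05/4, whereas the step-down rule relaxes the threshold after each rejection (0.05/4, 0.05/3, 0.05/2, 0.05/1), so it can reject hypotheses that the Bonferroni correction retains while still controlling the FWER.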
There are other types of disparity that we were unable to capture by screening at the means. Higher variability in a given biomarker concentration in a target subgroup (versus the reference subgroup) implies that, even with similar GMs, extreme values may be more frequent in that subgroup. We explored this in a complementary, upper-tail-oriented screening that defined extreme concentrations (as ≥95th percentile) and found few significant results. This was likely a consequence of the additional sampling uncertainty in the test statistic estimator used for this upper-tail screening, because the 95th percentile value had to be estimated from the data. When juxtaposed with results from screening at the mean, fewer significant findings at the upper tail could be misinterpreted as a relative absence of the upper-tail disparity. Therefore, we did not report the results of this analysis.
Ideally, an upper-tail screening analysis would be based on externally defined, non-occupational health-based thresholds, such as biomonitoring equivalents (BEs). A BE is a biological concentration of a chemical (or its metabolites) reflecting an existing health-based exposure guidance value, such as a reference dose [74]. BEs have been established for approximately 80 chemicals [74], but many NHANES chemicals still lack them. Additionally, grouping biomarkers of chemicals with shared toxicity pathways could help capture toxicity-relevant race/ethnicity and income differences in cumulative exposures.
Education and occupation are other important SES dimensions we did not examine explicitly because we analyzed biomarker data for all available ages, where these are not always applicable. For adults, education is typically correlated with income; thus, our income-related findings could be viewed more broadly as representing income- and education-related patterns. However, another study found that, while education and income were correlated, they were not associated with bisphenol A and PFC biomonitoring levels in the same way [54]. Some of the NHANES biomarker concentrations may have reflected occupational exposures, which may occur more frequently in low-income subpopulations (and for some race/ethnicity subpopulations). However, if one assumes that worker exposures are higher than those of the general population, then our focus on disparities at the mean likely helped dampen the influence of occupation on our screening findings. Future detailed studies should consider occupation as an important potential source of variability in biomarker data and possible explanation for observed disparities. Unfortunately, the NHANES occupational codes typically do not contain the detail needed to identify specific high-exposure industries or job tasks.
Our screening analysis did not capture possible "hotspot" effects, such as elevated biomarker levels in communities near contaminant sources. Community-level occurrences of elevated concentrations are either diluted or missed altogether if these communities are not included in the NHANES sample. Although the NHANES geographic identifiers are available through special arrangement, accessing them requires additional time and resources; only a few researchers to date have attempted this [75]. Further, having the geographic identifiers alone cannot help elucidate whether biomarker disparities are due to environmental contamination without the corresponding local environmental data (e.g., air, drinking water, soil, house dust, food measurements), which NHANES generally lacks. For this, we need more detailed studies, such as those described in the introduction [3,4], matching environmental, geographic, and SES data.
Interpreting urinary biomarker levels when results differ by creatinine correction can be challenging. Because non-Hispanic blacks have higher creatinine excretion [15], GMRs that were not significant for creatinine-corrected concentration (but significant for uncorrected concentration) may reflect creatinine excretion rather than exposure differences. However, creatinine also varies by several other factors (e.g., age, sex, renal function, lean muscle mass, red meat consumption [15]). We did not account for these factors in our analysis, clouding the interpretation of different results for urinary concentrations expressed in different units. Other approaches to account for urine dilution (e.g., by specific gravity [76]) may be preferable when 24-hour samples are unavailable.
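The arithmetic behind this ambiguity can be illustrated with a small Python sketch. It assumes urinary concentrations in µg/L and urinary creatinine in mg/dL; the helper names and all numeric values are hypothetical and are not NHANES data, and the sketch omits survey weights and significance testing.

```python
import math


def creatinine_corrected(conc_ug_per_l, creatinine_mg_per_dl):
    """Convert a urinary concentration (ug/L) to ug per g creatinine."""
    creatinine_g_per_l = creatinine_mg_per_dl * 0.01  # 1 mg/dL = 0.01 g/L
    return conc_ug_per_l / creatinine_g_per_l


def geometric_mean(values):
    """Geometric mean computed on the log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def gmr(target, reference):
    """Geometric mean ratio of a target subgroup to the reference subgroup."""
    return geometric_mean(target) / geometric_mean(reference)


# Hypothetical subgroups: the target subgroup has 1.5 times the uncorrected
# concentration AND 1.5 times the creatinine level of the reference subgroup.
target_conc, target_creat = [3.0, 6.0, 12.0], [150.0, 150.0, 150.0]
reference_conc, reference_creat = [2.0, 4.0, 8.0], [100.0, 100.0, 100.0]

uncorrected_gmr = gmr(target_conc, reference_conc)
corrected_gmr = gmr(
    [creatinine_corrected(c, k) for c, k in zip(target_conc, target_creat)],
    [creatinine_corrected(c, k) for c, k in zip(reference_conc, reference_creat)],
)
```

In this constructed case the uncorrected GMR is about 1.5 while the creatinine-corrected GMR is about 1. The calculation alone cannot show whether the corrected or the uncorrected ratio better reflects exposure, which is why the other determinants of creatinine excretion matter for interpretation.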
Other limitations of our analysis relate to the inherent characteristics of a screening-level exploration versus a detailed EJ analysis of the NHANES biomonitoring data. Several studies focusing on clusters of a few chemicals have demonstrated the value of individual-level covariates (such as age, sex, education, occupation, smoking, diet, and body mass index) in explaining biomarker differences across EJ subgroups. However, the set of important covariates could also include genetic/epigenetic characteristics that influence toxicokinetics, resulting in different internal doses for individuals with the same external exposures [7,8]. Although NHANES is a rich source of individual-level information, it does not provide genetic/epigenetic data.
This screening analysis focused on identifying race/ethnicity and income differences in mean concentrations for a large number of the NHANES biomarkers, rather than on interpreting these differences. Making inferences about factors that can account for these observed differences should be assisted by a correctly specified model of individual-level internal exposures that includes all relevant covariates. It was not feasible to build a comprehensive model for each biomarker in our study. Further, including just a few covariates (e.g., age and sex) was likely to produce models subject to omitted-variable bias and, consequently, faulty inferences about the relative importance of these covariates in explaining the mean differences in exposure across subgroups. Therefore, we focused on a simpler screening that could potentially be useful for identifying candidate chemicals for more detailed EJ-oriented assessments.

Conclusions
This analysis explored differences in exposure to environmental chemicals (using biomarkers) across the dimensions of race/ethnicity and income in the United States. Many findings were consistent with previous studies, while some findings were new. Screening analyses of this type can be useful in identifying chemicals for focused study. Researchers wishing to extend our analyses might consider upper-tail screening using BE-based thresholds, exploring patterns in cumulative exposure (by grouping biomarkers with shared toxicity), or examining effects of creatinine correction and lipid adjustment on findings for certain chemical groups. Incorporating additional years of NHANES data as they become available could help identify persistent disparities requiring public health attention.

Additional file
Additional file 1: A Method to screen U.S. environmental biomonitoring data for race/ethnicity and income-related disparity. A description of chemical groups and their corresponding NHANES laboratory files and an overview of significant GMR findings.