This study was approved by the University of Pittsburgh Institutional Review Board (IRB number PRO10010240).
Ascertainment of cases
Cases of ASD for this study were children born between January 1, 2005 and December 31, 2009 in Allegheny, Armstrong, Beaver, Butler, Washington, or Westmoreland County in southwestern Pennsylvania who were currently residing in the six-county area. Our goal was to enroll approximately half of the prevalent cases among this birth cohort. Based on 23,399 births in 2007 in the six-county area [22] and a prevalence of ASD of 6 per 1,000 (one in 166), we estimated that 140–141 children per year would be diagnosed with ASD in the study area. We anticipated enrolling half (70–71) of these children each year, resulting in approximately 250 cases during the 3½-year period of the study.
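The enrollment target above follows from simple arithmetic, which can be reproduced as a quick check. This is only an illustrative sketch: the variable names are hypothetical, and the inputs are the figures quoted in the text.

```python
# Figures quoted in the text; variable names are illustrative only.
births_per_year = 23_399   # births in 2007 in the six-county area
prevalence = 6 / 1000      # assumed ASD prevalence (6 per 1,000)
study_years = 3.5          # length of the study period

expected_cases_per_year = births_per_year * prevalence
target_per_year = expected_cases_per_year / 2   # goal: enroll about half
target_total = target_per_year * study_years

print(round(expected_cases_per_year, 1))  # 140.4
print(round(target_per_year, 1))          # 70.2
print(round(target_total))                # 246
```

The computed total of about 246 is consistent with the rounded target of 250 cases.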
There is no autism registry in Pennsylvania, and therefore no centralized agency that could be accessed for permission to contact parents of children with ASD for the purposes of conducting a study. Our investigation used an extensive outreach campaign to recruit ASD cases from a combination of 1) ASD specialty diagnostic and treatment centers, 2) private pediatric and psychiatry practices, 3) school-based special needs programs (starting at age 5), and 4) autism support groups. Per our IRB guidelines, we were not allowed to directly contact parents of children with ASD. Therefore, we provided these agencies and organizations with informational packets containing a letter, a contact sheet, and a pre-addressed envelope to be returned to our office, and the agencies mailed the packets to families of children with ASD. When a contact sheet was returned, we were permitted to contact the mother to describe the study and request consent to participate.
A case of ASD was defined as any child 1) who scored a 15 or above on the Social Communication Questionnaire (SCQ), a positive screen for the presence of autistic features, and 2) for whom there was written documentation, including ADOS or other test results, of a diagnosis of an ASD from a child psychologist or psychiatrist. Cases were not included in the study if the child was adopted, parents were not English speaking, or a parent was not available for interview. A total of 217 cases were consented and interviewed for the study.
Ascertainment of controls
The study was designed to have two different sets of controls. The first control group (interviewed controls) was recruited from a random selection of 5,007 births weighted by sex (4:1 male:female) from the Pennsylvania Department of Health (PA DOH) state birth registry files for 2005 to 2009 in the six-county area. Interviewed controls were frequency matched to the cases on year of birth, sex, and race. We recruited through a direct letter appeal signed by the Pennsylvania Secretary of Health. The rules of both the PA DOH IRB and the University of Pittsburgh IRB mandated that there would be no direct contact with potential controls, except for the opportunity to return an envelope and contact sheet either indicating refusal to be in the study or providing contact information if the parent was interested in enrolling his or her child. We requested return-to-sender service with address correction from the postal service. However, since we used addresses from birth certificates that were several years old, it is likely that many of the letters were never delivered to the intended resident or were simply ignored, making it difficult to determine a true response rate.
After we obtained informed consent, parents were screened for inclusion criteria and administered the SCQ for their child. Children with an SCQ score of 15 or above or with a reported diagnosis of ASD were not included as controls. Other exclusion criteria were the same as those for cases. The first control group consisted of 226 eligible controls who were consented and interviewed.
For each of the cases and the interviewed controls, a personal interview with the mother was conducted by trained interviewers using a structured questionnaire adapted from the CDC’s Study to Explore Early Development (SEED). The questionnaire included parental demographic and socioeconomic information, a detailed residential history, maternal and paternal occupational history, family history of ASD, smoking history, maternal reproductive and pregnancy history, and child’s medical history. Data were obtained on all residential addresses, with the corresponding start and end dates, at which the mother and child lived from three months prior to the last menstrual period (LMP) until the child’s second birthday.
The second control group (birth certificate or BC controls) consisted of a random sample of births occurring from 2005 to 2009 in the six-county study area, weighted by sex (male-to-female ratio of 4:1) and year of birth. Birth certificate information on the cases and controls, consisting of residence at birth, age of mother, smoking history, maternal education, race and other infant characteristics, was then used for the second case–control analysis. Of the total sample of 5,007 birth certificates, 16 were identified as being in our case (ASD) population and were removed from the control group.
Exposure assessment
Exposure to ambient concentrations of hazardous air pollutants was estimated using modeled data from the 2005 NATA assessment. The 2005 NATA estimates are annual averages by census tract and were downloaded from the US EPA website (http://www.epa.gov/ttn/atw/nata2005/tables.html; accessed April 16, 2014). Out of the 177 air toxics available through NATA, we examined the distribution, variability, and correlations of 37 air toxics characterized as having neurological, developmental or endocrine-disrupting effects by one of the previous studies [17–19] or the US EPA [20]. Seven chemicals (carbon tetrachloride, chloroform, ethylene dibromide, ethylene dichloride, hexachlorobenzene, methyl chloride, and PCBs) were excluded from further analysis because their concentrations varied little within the six-county area, leaving a total of 30 NATA compounds for analysis.
For the analyses of the interviewed cases and controls, the residential addresses obtained during the interview were geocoded to an X, Y coordinate using ArcGIS (version 10.1; ESRI Inc., Redlands, CA) and verified manually. When an address could not be successfully geocoded in ArcGIS, other methods were used, including MapQuest Latitude/Longitude Finder (http://developer.mapquest.com/web/tools/lat-long-finder). Year 2000 census tracts (11-digit FIPS codes) for each address were assigned using ArcGIS 10.1, linking to 2009 TIGER/Line files for the 2000 United States census. We calculated person-specific exposure estimates for each of the air toxic compounds, taking into account the locations of and changes in residence and the time spent at each residence. For each child, average exposure estimates were computed for the time periods of pregnancy, first year of life, and second year of life. Two participants who lived at a residence outside of the United States, for which no NATA data were available, were excluded from analysis, leaving an analytic group of 217 cases and 224 controls.
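One plausible way to compute a person-specific estimate that accounts for residential moves is a time-weighted average of the census-tract concentrations over the exposure window. The sketch below illustrates that idea under stated assumptions; it is not the study's actual procedure. The function name, the half-open date convention, and the example dates and concentrations are all hypothetical.

```python
from datetime import date

def time_weighted_exposure(residences, window_start, window_end):
    """Average concentration over [window_start, window_end).

    residences: list of (move_in, move_out, tract_concentration) tuples,
    with each stay treated as the half-open interval [move_in, move_out).
    Returns None if no residence overlaps the window.
    """
    total_days = 0
    weighted_sum = 0.0
    for move_in, move_out, concentration in residences:
        # Days of this stay that fall inside the exposure window.
        overlap_start = max(move_in, window_start)
        overlap_end = min(move_out, window_end)
        days = (overlap_end - overlap_start).days
        if days > 0:
            total_days += days
            weighted_sum += days * concentration
    if total_days == 0:
        return None
    return weighted_sum / total_days

# Hypothetical example: a 280-day window with one move after 100 days.
residences = [
    (date(2006, 6, 1), date(2007, 4, 11), 2.0),   # first tract
    (date(2007, 4, 11), date(2009, 1, 1), 1.0),   # second tract
]
pregnancy_avg = time_weighted_exposure(
    residences, date(2007, 1, 1), date(2007, 10, 8)
)
print(pregnancy_avg)  # (100 * 2.0 + 180 * 1.0) / 280
```

The same function could be applied with different window dates for the first and second years of life.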
For the birth certificate data analysis, NATA concentrations were linked to census tract of residence at birth. All births that could be linked to a PA DOH birth certificate contained either the census tract of birth or the zip code of birth. When only the zip code was provided, the 2010 ZCTA shapefile was used to calculate the geographical center of each zip code in ArcGIS 10.2. Then, each ZCTA centroid was spatially linked to the 2000 census tract that contains it. Of the 217 cases, one birth could not be linked to its birth certificate, 187 had a census tract on the birth certificate, and 29 had only a zip code of birth. Of the 5,007 potential controls, 16 births were actually in our case population, 4,194 had a census tract on the birth certificate, and 797 had only a zip code of birth. However, 20 control births could not be assigned a NATA exposure: eighteen could not be linked to a census tract because the documented zip code was not in the 2010 ZCTA shapefile, and two had census tracts documented on the birth certificate that did not match a census tract in the 2005 NATA database. Therefore, the final population in the analysis of BC controls was 216 cases and 4,971 controls.
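The counts reported in this paragraph can be reconciled with a short bookkeeping check. The sketch below uses only the numbers stated in the text; the dictionary keys and variable names are illustrative.

```python
# Cases: how each of the 217 births was (or was not) linked to a tract.
cases_total = 217
cases = {"not_linked_to_certificate": 1, "census_tract": 187, "zip_only": 29}
assert sum(cases.values()) == cases_total

# Controls: breakdown of the 5,007 sampled birth certificates.
controls_sampled = 5007
controls = {"in_case_population": 16, "census_tract": 4194, "zip_only": 797}
assert sum(controls.values()) == controls_sampled

# Control births that could not be assigned a NATA exposure.
unassigned = {"zip_not_in_zcta_shapefile": 18, "tract_not_in_nata": 2}

final_cases = cases_total - cases["not_linked_to_certificate"]
final_controls = (
    controls_sampled
    - controls["in_case_population"]
    - sum(unassigned.values())
)
print(final_cases, final_controls)  # 216 4971
```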
Statistical analysis
We used logistic regression to investigate the association between exposure to NATA air pollutants and the risk of autism spectrum disorder. To calculate individual odds ratios, quartile cut points were calculated for each of the 30 NATA pollutants. These were based on the distribution among the interviewed controls for use in each respective case–control comparison. The three highest quartiles were individually compared to the lowest quartile. For the interviewed cases and controls, separate logistic regression models were fitted for each pollutant during the pregnancy period and secondarily for the first and second year of life. For the birth certificate control comparison, only residence at the time of birth was available. All analyses were adjusted for maternal age, education, race, smoking, child’s birth year and child’s sex.
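The quartile-based comparison can be illustrated with a small sketch. It derives cut points from a control distribution, assigns each subject to a quartile, and computes a crude (unadjusted) odds ratio for one quartile against the lowest. This is a simplification: the study's odds ratios came from covariate-adjusted logistic regression in SPSS, which is not reproduced here. The function names, the quantile method, the tie-handling convention, and the data are assumptions made for illustration.

```python
from bisect import bisect_left
from collections import Counter
from statistics import quantiles

def quartile_cut_points(control_values):
    # Three cut points (25th, 50th, 75th percentiles) of the controls.
    return quantiles(control_values, n=4)

def assign_quartile(value, cut_points):
    # 1 = lowest quartile, 4 = highest. A value equal to a cut point
    # is placed in the lower quartile (an arbitrary convention here).
    return bisect_left(cut_points, value) + 1

def crude_odds_ratio(case_quartiles, control_quartiles, level):
    # Unadjusted odds ratio for `level` versus quartile 1.
    # Raises ZeroDivisionError if a required cell is empty.
    cases = Counter(case_quartiles)
    controls = Counter(control_quartiles)
    return (cases[level] / cases[1]) / (controls[level] / controls[1])

# Made-up concentrations, for illustration only.
control_conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
case_conc = [1.5, 2.0, 7.0, 7.5, 8.0, 9.0]

cuts = quartile_cut_points(control_conc)
control_q = [assign_quartile(v, cuts) for v in control_conc]
case_q = [assign_quartile(v, cuts) for v in case_conc]
print(crude_odds_ratio(case_q, control_q, level=4))  # 2.0
```

Deriving the cut points from the controls alone means the same thresholds are applied to cases and controls within a given comparison.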
In addition to examining compounds individually, we also grouped compounds by structural properties into three classifications: metals excluding selenium (arsenic, cadmium, chromium, lead, manganese, mercury, and nickel), aromatic solvents (benzene, ethyl benzene, styrene, toluene and xylenes), and chlorinated solvents (methylene chloride, perchloroethylene, trichloroethylene, trichloroethane, and vinyl chloride). Index scores were computed for each of the three structural groups by summing the quartiles for the compounds in each group. As for the individual compounds, quartile cut points of these scores were calculated based on the distribution of the index scores among the interviewed controls. Logistic regression models comparing the upper quartiles to the lowest quartile were fitted for each of the indices for both the interviewed and BC comparisons, controlling for mother’s age, education, race, smoking, child’s year of birth and sex. IBM SPSS Statistics 20 and 22 were used for all analyses. No formal adjustment was made for multiple comparisons.
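The index scores described above are sums of per-compound quartiles within each structural group. A minimal sketch of that calculation follows; the group membership is taken from the text, while the function name and data layout are hypothetical.

```python
# Group membership as listed in the text.
STRUCTURAL_GROUPS = {
    "metals": [
        "arsenic", "cadmium", "chromium", "lead",
        "manganese", "mercury", "nickel",
    ],
    "aromatic_solvents": [
        "benzene", "ethyl benzene", "styrene", "toluene", "xylenes",
    ],
    "chlorinated_solvents": [
        "methylene chloride", "perchloroethylene", "trichloroethylene",
        "trichloroethane", "vinyl chloride",
    ],
}

def index_scores(quartile_by_compound):
    """Sum each subject's quartiles (1-4) over the compounds in a group."""
    return {
        group: sum(quartile_by_compound[c] for c in compounds)
        for group, compounds in STRUCTURAL_GROUPS.items()
    }

# Hypothetical subject in the lowest quartile for every compound.
all_compounds = [c for cs in STRUCTURAL_GROUPS.values() for c in cs]
lowest = index_scores({c: 1 for c in all_compounds})
print(lowest)
```

With seven metals and five compounds in each solvent group, the metals index ranges from 7 to 28 and each solvent index from 5 to 20.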
Additionally, we noted a significantly higher proportion of multiple births reported among cases than among controls (8.4 % among the cases; 4.0 % and 3.8 % among the interviewed and birth certificate control groups, respectively). As there is a high rate of prematurity and other problems associated with multiple births, we conducted a sensitivity analysis with and without the inclusion of multiple births for both case–control comparisons.
As one of the final steps, we performed a backward multiple logistic regression analysis of all agents identified as significant in either case–control comparison, adjusting for mother’s age, race, education, smoking, child’s birth year, and child’s sex. This was done to identify the most significant effects of NATA compounds while controlling for the same covariates that were used in the previous logistic regression models for individual pollutants.
Finally, air toxics are often correlated with each other, and people are often simultaneously exposed to a complex mixture of air pollutants. In our study, the Spearman correlation matrix revealed that many of the air toxics were highly correlated (p < 0.01). Similar to the methodology detailed by von Ehrenstein et al. [21], we conducted a factor analysis to further examine the correlation structure of our set of 30 air toxics. Factors were extracted using Principal Component Analysis (PCA) and rotated using varimax rotation. The eigenvalue >1 rule was used to determine which factors to retain [21].
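For readers unfamiliar with the Spearman coefficient referred to above, it is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below illustrates that definition for a single pair of pollutants. It does not compute p-values and does not cover the PCA or varimax steps, which were run in SPSS. The function names and the data are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))

# Made-up tract-level concentrations for two pollutants.
pollutant_a = [0.1, 0.4, 0.2, 0.9, 0.7]
pollutant_b = [1.0, 3.5, 2.0, 9.0, 4.0]
rho = spearman(pollutant_a, pollutant_b)
print(rho)
```

Because the coefficient depends only on ranks, any monotonic relationship between two pollutants yields a value of 1 or -1 regardless of scale.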