Effects of air pollution on neonatal prematurity in Guangzhou, China: a time-series study

Background Over the last decade, a few studies have investigated the possible adverse effects of ambient air pollution on preterm birth. However, the correlation between the two remains unclear because the evidence is insufficient.
Methods The correlation between air pollution and preterm birth in Guangzhou was examined using a Generalized Additive Model (GAM) extended Poisson regression model that controlled for confounding factors such as meteorological factors, time trends and day of the week (DOW). We also adjusted for collinearity among the air pollutants using Principal Component Analysis. Meteorological and air pollution data were obtained from the Meteorological Bureau and the Environmental Monitoring Centre, and the medical records of newborns were collected from the perinatal health database of all obstetric institutions in Guangzhou, China, in 2007.
Results In 2007, the average daily concentrations of NO2, PM10 and SO2 in Guangzhou were 61.04, 82.51 and 51.67 μg/m3 respectively, and an average of 21.47 preterm babies were delivered each day. Pearson correlation analysis suggested negative correlations between the concentrations of NO2, PM10 and SO2 and both temperature and relative humidity. In the time-series GAM analysis, the single air pollutant models suggested that the cumulative effects of NO2, PM10 and SO2 peaked on day 3, day 4 and day 3 respectively. An increase of 100 μg/m3 in these pollutants corresponded to relative risks (RRs) of 1.0542 (95%CI: 1.0080~1.1003), 1.0688 (95%CI: 1.0074~1.1301) and 1.1298 (95%CI: 1.0480~1.2116) respectively. After adjusting for collinearity using Principal Component Analysis, the GAM model of the three air pollutants suggested that an increase of 100 μg/m3 corresponded to RRs of 1.0185 (95%CI: 1.0056~1.0313), 1.0215 (95%CI: 1.0066~1.0365) and 1.0326 (95%CI: 1.0101~1.0552) on day 0; at their strongest cumulative effects, the RRs of the three air pollutants were 1.0219 (95%CI: 1.0053~1.0386), 1.0274 (95%CI: 1.0066~1.0482) and 1.0388 (95%CI: 1.0096~1.0681) respectively.
Conclusions This study indicates that the daily concentrations of air pollutants such as NO2, PM10 and SO2 are positively correlated with preterm births in Guangzhou, China.


Background
Air pollution affects the health of children as well as the elderly, and in recent years it has drawn increasing attention and study as a new public health challenge. Some studies have already shown that air pollution is associated with an increased risk of adverse pregnancy outcomes [1,2]. International survey data showed a preterm birth rate of 7-10% in industrialized countries [3] and 9-12% in the United States in recent years, with an upward trend [4]. A survey in China indicated a preterm rate of 5-15%, also increasing [5].
Fifteen percent of preterm babies die in the neonatal period. Apart from fatal malformations, 70% of neonatal deaths and 75% of neonatal complications are associated with premature births [6]. Complications include respiratory diseases, intracerebral hemorrhage, infections and dysplasia. Compared with full-term babies, premature babies are at greater risk of cerebral palsy, amblyopia, deafness and mental retardation [7]. Prematurity has been shown to be not only a major cause of neonatal death, but also a substantial contributor to diabetes mellitus, coronary heart disease and hypertension in adulthood [8,9]. Hence, identifying the causes and risk factors of preterm birth is of vital importance to public health.
Until the end of the twentieth century, research on possible risk factors for premature birth focused mainly on socio-economic level, educational attainment, smoking and drinking during pregnancy, intrauterine infections, multiple births, obstetric history (including abortions, stillbirths and preterm births), genital abnormality, pregnancy-induced hypertension, risky sexual behavior, etc. [10][11][12][13][14][15]. During the last decade, an increasing number of researchers have noticed a possible causal link between air pollutants and the occurrence of prematurity, which led to a large number of studies on this topic in America, Canada, Australia, Lithuania and China. The results suggested that exposure to air pollutants such as NO2, PM10 and SO2 during pregnancy was possibly related to premature birth [16][17][18][19]. Studies conducted in cities such as Beijing, Taiyuan and Taipei have likewise found that increased concentrations of air pollutants such as NO2, PM10 and SO2 are a risk factor for premature birth [20][21][22]. In this study we applied the Generalized Additive Model (GAM) extended Poisson regression model to quantitatively evaluate the effects of the ambient air pollutants NO2, PM10 and SO2 on preterm birth by analyzing time-series data on air pollution, meteorological factors and preterm births in Guangzhou in 2007.

Data collection
Guangzhou, located in southern China and composed of ten districts and two satellite cities, has an urban area of over 7,434,400,000 m2 and a metropolitan population of 9.8 million. We obtained information on all live births in Guangzhou in 2007 from the birth registry database, which covers all obstetric clinics in the city. There were a total of 142,312 births in Guangzhou from January 1 to December 31, 2007, including 9,083 (6.38%) preterm births. Gestational age was computed as the number of weeks between the date of the last menstrual period (LMP) and the date of birth [23]. Eligible births with gestational ages <37 weeks were considered preterm. Twin and other multiple pregnancies were excluded from this study. After exclusions, 7,836 of the 9,083 preterm births (86.27%) remained as the analytic units of the study. The number of preterm births was tallied for each day in 2007.
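The case definition above (gestational age from LMP to birth, preterm below 37 weeks, multiple pregnancies excluded, daily tally) can be sketched in a few lines. This is only an illustration: the record layout `(lmp, birth_date, n_fetuses)` and the function names are hypothetical, and the sample records are made up.

```python
from collections import Counter
from datetime import date

PRETERM_WEEKS = 37  # threshold stated in the text: gestational age < 37 weeks


def gestational_weeks(lmp: date, birth: date) -> float:
    """Weeks elapsed between the last menstrual period and the birth date."""
    return (birth - lmp).days / 7


def daily_preterm_counts(records):
    """Count singleton preterm births per calendar day.

    `records` is a hypothetical iterable of (lmp, birth_date, n_fetuses) tuples.
    """
    counts = Counter()
    for lmp, birth, n_fetuses in records:
        if n_fetuses != 1:  # twin and multiple pregnancies are excluded
            continue
        if gestational_weeks(lmp, birth) < PRETERM_WEEKS:
            counts[birth] += 1
    return counts


# Made-up records, for illustration only
records = [
    (date(2006, 5, 1), date(2007, 1, 1), 1),   # 245 days = 35 weeks, preterm
    (date(2006, 3, 27), date(2007, 1, 1), 1),  # 280 days = 40 weeks, term
    (date(2006, 5, 8), date(2007, 1, 1), 2),   # twins, excluded
    (date(2006, 5, 20), date(2007, 1, 2), 1),  # 227 days, preterm
]
counts = daily_preterm_counts(records)

# Percentages reported in the text, recomputed from the stated counts
preterm_share = round(9083 / 142312 * 100, 2)
analytic_share = round(7836 / 9083 * 100, 2)
```

The two recomputed shares agree with the 6.38% and 86.27% reported above.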
Daily mean concentrations of the air pollutants nitrogen dioxide (NO2), particulate matter less than or equal to 10 microns (PM10) and sulfur dioxide (SO2) in 2007 were collected from the Environmental Monitoring Center of Guangzhou city. The daily concentration of each pollutant was averaged from the available monitoring results of nine fixed-site stations located in the urban areas of Guangzhou, which were monitored by the China National Quality Control. We calculated 24-hour average concentrations of PM10, SO2 and NO2 only for days on which at least 75% of the one-hour values were available.
To allow adjustment for the possible influence of weather on preterm birth, daily average temperature (°C) and relative humidity (%) data were collected from the Guangzhou Meteorological Bureau. The weather data were measured at a fixed-site station located in Yuexiu District of Guangzhou.
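The 75% completeness rule and the averaging across stations can be illustrated as follows. This is a minimal sketch under assumptions: the function names and the use of `None` for a missing hourly value are hypothetical, the readings are made up, and whether the rule was applied per station before averaging is not stated in the source.

```python
from statistics import mean

MIN_COMPLETENESS = 0.75  # at least 75% of the 24 hourly values must be present


def station_daily_mean(hourly):
    """24-hour mean for one station, or None if too few hourly values exist.

    `hourly` is a list of 24 readings in ug/m3, with None for missing hours.
    """
    valid = [v for v in hourly if v is not None]
    if len(valid) < MIN_COMPLETENESS * len(hourly):
        return None
    return mean(valid)


def city_daily_mean(stations):
    """Average the valid station-level daily means into one citywide value."""
    daily = [m for m in map(station_daily_mean, stations) if m is not None]
    return mean(daily) if daily else None


# Made-up readings for three stations on one day
complete = [60.0] * 24
partial = [80.0] * 18 + [None] * 6   # 18/24 = 75%, accepted
sparse = [100.0] * 17 + [None] * 7   # 17/24 < 75%, rejected

citywide = city_daily_mean([complete, partial, sparse])
```

Here the sparse station is dropped, so the citywide value is the mean of the two remaining station means.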

Statistical analysis
Relative to the total population, a preterm birth on any given day is a low-probability event. As is typical for time-series count data of this kind, the daily number of preterm births approximately follows a Poisson distribution [16]. To determine the influence of air pollution on premature birth, we therefore used a time-series Generalized Additive Model (GAM) extended Poisson regression [24], which extends the traditional generalized log-linear model. In addition to ordinary linear terms, the model accommodates complicated non-linear relationships by adding smooth functions of the explanatory variables. The non-parametric flexibility of GAMs has resulted in their widespread use in time-series studies to adjust for the nonlinear confounding effects of seasonality and trend [25][26][27][28][29][30][31][32]. Since its introduction by Schwartz J in 1996 [33], time-series GAM extended Poisson regression has become a standard method for air pollution research in environmental epidemiology. The model is specified as follows:

$$\log[E(Y_t)] = \alpha + \beta Z_t + S(\text{time}, df) + S(\text{temperature}, df) + S(\text{relative humidity}, df) + \text{DOW}$$

In the formula, $Y_t$ is the number of preterm babies on day $t$; $E(Y_t)$ is the expected daily number of preterm babies; $\alpha$ is the intercept; $\beta$ is the regression coefficient; $Z_t$ is the concentration of the air pollutant on day $t$ or its accumulated average concentration over several days; $S(\text{time}, df)$, $S(\text{temperature}, df)$ and $S(\text{relative humidity}, df)$ are smoothing spline functions of calendar time, temperature and relative humidity; and DOW (day of week) is a dummy variable.
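Because the model is linear in the pollutant term on the log scale, the regression coefficient β translates directly into a relative risk for a fixed increase in concentration. As a brief sketch (the increment symbol Δ and the Wald-type interval are assumptions introduced here; the source does not state how its confidence intervals were computed):

```latex
\begin{align*}
\log E(Y_t \mid Z_t = z + \Delta) - \log E(Y_t \mid Z_t = z) &= \beta \Delta, \\
\mathrm{RR}(\Delta) &= \exp(\beta \Delta), \\
95\%\ \mathrm{CI} &= \exp\bigl((\hat{\beta} \pm 1.96\,\mathrm{SE}(\hat{\beta}))\,\Delta\bigr).
\end{align*}
```

With Δ = 100 μg/m3, this corresponds to the increment used for the relative risks reported in the Results.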
In this study, we first built basic models of the daily number of preterm births without the air pollution variables. We used smoothing spline functions of time-independent variables, including calendar time, temperature and relative humidity, to control for the nonlinear confounding effects of trend, seasonality and weather [34]. This approach can accommodate non-linear and non-monotonic patterns between preterm birth and time or weather conditions, and thus provides a flexible modeling tool [35]. A dummy variable was also used to control for the effect of day of the week (DOW). The residuals of each model were examined for discernible patterns and autocorrelation by means of residual plots and partial autocorrelation function plots, respectively [23].
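The residual check can be illustrated with a simple sample autocorrelation. Note that this sketch computes the ordinary autocorrelation rather than the partial autocorrelation plotted in the study, as a simpler stand-in; the function name and the residual values are made up for illustration.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of a residual series at a given positive lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    num = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    return num / denom


# Made-up residuals that alternate in sign, so the lag-1 value is negative
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
r1 = autocorrelation(residuals, 1)
r2 = autocorrelation(residuals, 2)
```

Values close to zero at all lags would suggest that the smooth terms have removed most of the serial dependence; clearly non-zero values, as in this artificial series, would suggest remaining structure.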
After establishing the basic models, we added the pollutant variables and analyzed their effects on preterm births. The number of gestations at risk for preterm birth was used as an offset. Generalized cross-validation (GCV) scores were used to compare the relative quality of preterm birth predictions across these non-nested models and to verify how well the models fit the data [35]. To account for the delayed effects of air pollutants, the models also included lag effects. Because the purpose of the study was to investigate the short-term health effects of air pollutants, we considered lags of up to 7 days, based on a literature review [36,37]. Lag days are specific to time-series analysis: the health outcome on day 0 is related to pollutant concentrations on that day and on previous days, and the resulting model can also be used to predict future health impacts [38]. We introduced the pollutant concentration on day 0, one day earlier, and so on up to seven days earlier (Lag0-Lag7), or the corresponding lag moving averages (Avg0-Avg7), into the model one at a time, and calculated the relative risk and CI from the regression coefficient β of each pollutant in order to quantify its influence on premature birth. Sensitivity analysis was also conducted within the established models: multiple air pollutant models were fitted to evaluate the stability of the single air pollutant models, and lag and cumulative effects were compared to assess the stability of the pollutants' effects.
We also introduced principal components into the study and established a dose-response model of the health effects of multiple ambient air pollutants in order to exclude the impact of collinearity [39]. A composite latent variable (the principal component), derived by principal component analysis to summarize the information in the original air pollutant variables, was substituted into the time-series GAM extended Poisson regression model and fitted as a linear term. We then transformed the regression coefficient β of the principal component into the regression coefficients b of the original air pollutants, and finally calculated the relative risk and CI to quantify the influence of each air pollutant on preterm birth in the multiple air pollutant model.
All statistical analyses were conducted in R 2.9.0.
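The lagged (Lag0-Lag7) and moving-average (Avg0-Avg7) exposure series can be built as shown below. This is a sketch under an assumption: AvgK is taken here to mean the average of the concentrations from the current day back to K days earlier, which the source does not spell out. The function names and concentrations are hypothetical.

```python
def lagged(series, lag):
    """Concentration `lag` days earlier; None where history is unavailable."""
    return [series[t - lag] if t >= lag else None for t in range(len(series))]


def moving_average(series, lag):
    """Mean concentration over the current day and the previous `lag` days."""
    out = []
    for t in range(len(series)):
        if t < lag:
            out.append(None)  # not enough preceding days
        else:
            window = series[t - lag:t + 1]
            out.append(sum(window) / len(window))
    return out


# Made-up daily NO2 concentrations (ug/m3)
no2 = [50.0, 60.0, 70.0, 80.0, 90.0]
lag1 = lagged(no2, 1)          # [None, 50.0, 60.0, 70.0, 80.0]
avg2 = moving_average(no2, 2)  # [None, None, 60.0, 70.0, 80.0]
```

Each such series would then enter the model in turn as the pollutant term, giving one coefficient per lag or averaging window.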
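The back-transformation from the principal component coefficient to coefficients for the original pollutants can be sketched as follows. The source does not give its formula, so this assumes the common principal component regression form in which the component is a weighted sum of standardized pollutant concentrations; the loadings, standard deviations and coefficient below are hypothetical and not taken from the study.

```python
import math


def back_transform(beta_pc, loadings, sds):
    """Convert a coefficient on the first principal component into
    coefficients on the original (unstandardized) pollutant scale.

    Assumes PC1 = sum_j loadings[j] * (x_j - mean_j) / sds[j], so the
    coefficient per 1 ug/m3 of pollutant j is beta_pc * loadings[j] / sds[j].
    """
    return [beta_pc * a / s for a, s in zip(loadings, sds)]


def relative_risk(beta, increment=100.0):
    """Relative risk for an `increment` ug/m3 rise under a log-linear model."""
    return math.exp(beta * increment)


# Hypothetical inputs, not taken from the study
loadings = [0.58, 0.58, 0.57]  # PC1 loadings for NO2, PM10, SO2
sds = [30.0, 40.0, 35.0]       # standard deviations in ug/m3
beta_pc = 0.01                 # fitted coefficient on PC1

betas = back_transform(beta_pc, loadings, sds)
rrs = [relative_risk(b) for b in betas]
```

Because a single component carries all three pollutants, the back-transformed coefficients share the sign of the component coefficient, scaled by each pollutant's loading and variability.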

Results
Descriptive statistical results of premature births
Figure 1 shows the variations in preterm births over the entire study period. The average number of preterm births was 21.47 per day in 2007, with an interquartile range of 8; P0, P25, P50, P75 and P100 were 7.00, 17.00, 21.00, 25.00 and 39.00 respectively.

Correlation analysis of ambient air pollution and meteorological factors
Pearson correlation analysis (Table 2) indicated that temperature and relative humidity were negatively associated with all three air pollutants. The absolute r values of NO2, PM10 and SO2 with temperature were 0.2799, 0.1868 and 0.1650 respectively, and with relative humidity 0.2126, 0.2002 and 0.0577. The results suggested that the concentrations of air pollutants increased as temperature and relative humidity decreased. The three air pollutants were strongly and significantly correlated with one another (P < 0.01), most strongly NO2 and PM10 (r = 0.8533), followed by NO2 and SO2 (r = 0.8440). Temperature and relative humidity were also significantly correlated (r = 0.1924, P < 0.01). These correlations indicated possible collinearity among the independent variables.
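For readers unfamiliar with the statistic, the Pearson correlation coefficient used here can be computed as in the sketch below. The function name and the data are made up for illustration and do not reproduce the values in Table 2.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up daily values: the pollutant falls as temperature rises
temperature = [12.0, 15.0, 18.0, 21.0, 24.0]
no2 = [90.0, 80.0, 75.0, 60.0, 55.0]
r = pearson_r(temperature, no2)
```

A negative r, as in this artificial example, corresponds to the pattern described above, in which pollutant concentrations tend to be higher on cooler days.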

Lag effects and cumulative effects using a single air pollutant model
We started with the single air pollutant models, aiming to identify the timing of the strongest health effect. The lag effect results (Tables 3, 4 and 5) showed that NO2, PM10 and SO2 were statistically significant (P < 0.05) only on day 0. The cumulative effect results (Tables 3, 4 and 5) showed that NO2 and SO2 had their strongest cumulative effects on day 3 and PM10 on day 4, all statistically significant (P < 0.05).
Comparing lag effects and cumulative effects, we found that lag effects did not last long, with only day 0 reaching statistical significance (P < 0.05). For cumulative effects, the time spans of NO2 and PM10 were consistent with each other, with effects appearing on day 0, day 3 and day 4; the strongest effect of NO2 was on day 3 and the strongest effect of PM10 was on day 4, both statistically significant (P < 0.05). The cumulative effects of SO2 were maintained from day 0 to day 4, with the maximum effect on day 3 (P < 0.05).

Model sensitivity analysis
Compared with the single air pollutant models, the point estimates of the pollutants' effects in the multiple air pollutant models generally decreased and were not statistically significant. This was probably related to the strong collinearity between the air pollutants, or to the fact that a multiple air pollutant model increases the standard errors of the fitted coefficients and thus lowers statistical significance.
Comparison of lag effects and cumulative effects showed that lag effects lasted only a short time, being confined to day 0, whereas cumulative effects lasted longer, with the point estimate of risk increasing and reaching its maximum at approximately day 3. This indicated that the cumulative effects model was more sensitive than the lag effects model.

Principal component time-series GAM analysis of multiple air pollutants
In order to address collinearity between air pollutants in the multiple pollutant models, the pollutant concentrations on day 0 and on the day with the strongest cumulative effects were analyzed with the GAM model after principal component adjustment, and the results before and after adjustment were compared.
Principal component time-series GAM analysis of effects on day 0 using multiple air pollutants model
Table 6 shows that the eigenvalue of the first principal component was 2.6317 (> 1), accounting for 87.72% of the composite information; the coefficients of the different air pollutants were all positive and close to each other. The first principal component was therefore taken as a composite index of air pollution and was fitted as a linear term in the GAM model. Table 7 displays the GAM results for the effects on day 0 of different air pollutant combinations. After adjusting for collinearity among the air pollutant indexes by principal component analysis, the estimates for NO2 and PM10 in the double and triple air pollutant models changed from non-significant to statistically significant (P < 0.05). In the triple model, after the adjustment by principal component analysis, the relative risk of NO2 influencing
Table 8 shows that the eigenvalue of the first principal component was 2.6592 (> 1), accounting for 88.64% of the composite information, and the coefficients of the different air pollutants were all positive and close to each other. The first principal component was therefore taken as a composite index of air pollution and was fitted as a linear term in the GAM model.
Table 9 shows that, for the strongest cumulative effects of different air pollutant combinations in the GAM model, the estimates in both the double and triple models changed from non-significant to statistically significant only after collinearity among the air pollutant indexes was adjusted by principal component analysis.
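The reported shares of composite information follow from the eigenvalues if the principal component analysis was run on the three standardized pollutant variables, so that the eigenvalues sum to 3. That is an assumption not stated in the source, although the reported figures are consistent with it. A small check, with a hypothetical helper name:

```python
def variance_explained(eigenvalue, n_variables):
    """Share of total variance captured by one component of a correlation matrix."""
    return eigenvalue / n_variables


N_POLLUTANTS = 3  # NO2, PM10 and SO2

day0 = variance_explained(2.6317, N_POLLUTANTS)        # eigenvalue from Table 6
cumulative = variance_explained(2.6592, N_POLLUTANTS)  # eigenvalue from Table 8
summary = f"{day0:.2%}, {cumulative:.2%}"
print(summary)  # prints "87.72%, 88.64%"
```

Both values match the percentages quoted in the text.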

Discussion
The results of the study showed that the concentrations of the main ambient air pollutants, NO2, PM10 and SO2, were associated with preterm births. Analysis based on the single air pollutant models indicated that the cumulative effects of NO2, PM10 and SO2 peaked on the 3rd, 4th and 3rd day respectively. An increase of 100 μg/m3 in NO2, PM10 and SO2 on day 0 corresponded to RRs of 1.0542 (95%CI: 1.0080~1.1003), 1.0688 (95%CI: 1.0074~1.1301) and 1.1298 (95%CI: 1.0480~1.2116) respectively. In the three-pollutant model, after adjusting for collinearity by principal component analysis, an increase of 100 μg/m3 corresponded to day 0 RRs of 1.0185 (95%CI: 1.0056~1.0313), 1.0215 (95%CI: 1.0066~1.0365) and 1.0326 (95%CI: 1.0101~1.0552) for NO2, PM10 and SO2 respectively; the RRs for the strongest cumulative effects were 1.0219 (95%CI: 1.0053~1.0386), 1.0274 (95%CI: 1.0066~1.0482) and 1.0388 (95%CI: 1.0096~1.0681). The risk estimates in this study were lower than those reported by Sagiv [16], Liu [17], Hansen [18], Maroziene [19], Xu [20], Zhang [21] and Tsai [22]. There are several possible reasons, such as differences in levels of ambient air pollution, population susceptibility, particular components and research methods, and further studies are required.
At present, single air pollutant models are the most widely used. They involve only one index and do not consider the links between different indexes, which limits their usefulness. If multiple air pollutant indexes are fitted directly into the model, however, collinearity inevitably confounds the model because the indexes are not independent of one another, making the model unstable. To solve this problem, principal component analysis was adopted to adjust for collinearity among the air pollutant indexes in the multiple air pollutant model. Principal component analysis is a multivariate statistical method that here combined the three air pollutant indexes in an appropriate linear combination to generate an independent composite latent variable (the principal component) carrying the variation information extracted from the original indexes. A linear regression of the log-transformed dependent variable on this latent variable was then established, and the latent variable was finally converted back into the original independent variables. In this way, evaluation of the composite effects of different air pollutants became easier, and the issue of collinearity among multiple air pollutants was resolved.
The mechanisms by which air pollution affects prematurity remain unclear. Recent studies have indicated that preterm delivery may be caused by inflammatory reactions, immune reactions and endocrine adjustments. Some have suggested that air pollution could activate the fetal hypothalamic-pituitary-adrenal axis (HPAA), which can stimulate uterine contractions and premature rupture of fetal membranes and consequently cause preterm birth [40]. A study conducted by Peters [41] and other researchers in 1997 suggested that exposure to PM10 and SO2 during late pregnancy could cause inflammation, thus changing blood viscosity and causing preterm birth. Knotternus [42] and others found that inflammation could also lead to placental hypoperfusion and as a result induce preterm birth.
In order to quantitatively evaluate the relationship between air pollution and preterm birth, a time-series Generalized Additive Model (GAM) extended Poisson regression was applied, a method widely used in evaluating the health effects of air pollution. Because the relationship between air pollution and preterm birth could be confounded by time-related variables, it is difficult to evaluate with a simple linear model. This study used nonparametric smooth functions to control for confounding factors such as time trends, season and weather, which provides a more powerful way of evaluating the relationship than traditional methods.
Sensitivity testing of the model suggested that, owing to collinearity, the point estimates for each pollutant index in the multiple air pollutant models were generally lower, unstable and not statistically significant. Comparison of lag effects and cumulative effects indicated that the former were short-lived, so the cumulative effects models were more sensitive than the lag effects models. Given these observations, time-series GAM extended Poisson regression combined with principal component regression analysis was used in this study. This model could not only control for confounding factors such as long-term trends, season and meteorological factors, but also resolve the collinearity among different air pollutants. In addition, models of the day 0 effects and of the strongest cumulative effects of multiple air pollutants were incorporated so as to evaluate the health effects of various air pollutants on preterm birth comprehensively. One limitation of this study was that outdoor air pollution data came from fixed monitoring locations; using such monitoring data to represent individual exposure levels might underestimate the impact of air pollution [43]. In addition, the study spanned only one year, which might not be long enough to observe all effects. The data were not analyzed by season or age group, so the study did not fully take into account seasonal variation or age differences in susceptibility to air pollution. Another limitation was that we studied only NO2, PM10 and SO2, but not other pollutants such as CO and O3. Because of the correlations between pollutants, we cannot conclude that the preterm births were caused by the three pollutants in our study, nor can we rule out a role for other deleterious air pollutants.
Despite the limitations of the research data, the results of this study indicated positive associations between NO2, PM10 and SO2 and the risk of preterm birth. Although the absolute increase in risk is relatively small, air pollution is a long-term public health challenge to which everyone is exposed every day, and thousands of pregnant women in particular could have been exposed to high levels of air pollution over long periods. The public health significance therefore cannot be ignored. Studies of the impact of air pollution on preterm delivery are still rare in China. This study explored the potential exposure-response relationship between preterm birth and air pollutants such as NO2, PM10 and SO2, and aimed to provide scientific tools and evidence to help the relevant departments in their decision-making on air pollution control. We also hope that it will inspire more studies in the coming years, which in the long run could help reduce adverse maternal outcomes.

Conclusions
In summary, by analyzing single air pollutant models and a multiple air pollutant GAM model, this paper has shown that the concentrations of NO2, PM10 and SO2 were associated with the occurrence of preterm birth in Guangzhou city, and that the three air pollutants showed dose-response relationships with neonatal prematurity. Despite its limitations, the study indicates that air pollution plays a non-negligible role in prematurity. It thus highlights the importance of policy decisions to control air pollution and reduce the rate of preterm birth.