  • Hypothesis
  • Open Access
  • Open Peer Review

Selection of genes for gene-environment interaction studies: a candidate pathway-based strategy using asthma as an example

  • 1, 2,
  • 2, 3,
  • 4, 5,
  • 1, 2,
  • 2, 3 and
  • 1, 2
Contributed equally
Environmental Health 2013, 12:56

  • Received: 3 April 2013
  • Accepted: 2 July 2013
  • Published:



The identification of gene by environment (GxE) interactions has emerged as a challenging but essential task for fully understanding the complex mechanisms underlying multifactorial diseases. Until now, GxE interactions have been investigated either by candidate approaches examining a small number of genes, or agnostically at the genome-wide level.

Presentation of the hypothesis

In this paper, we propose a gene selection strategy for investigation of gene-environment interactions. This strategy integrates the information on biological processes shared by genes, the canonical pathways to which they belong and the biological knowledge related to the environment in the gene selection process. It relies on both bioinformatics resources and biological expertise.

Testing the hypothesis

We illustrate our strategy by considering asthma as the disease, tobacco smoke as the environmental exposure, and genes sharing the same biological function of “response to oxidative stress”. Our filtering strategy leads to a list of 28 pathways involving 182 genes for further GxE investigation.

Implications of the hypothesis

By integrating the environment into the gene selection process, we expect that our strategy will improve the ability to identify the joint effects and interactions of environmental and genetic factors in disease.


  • Gene by environment interactions
  • Oxidative stress
  • Smoking
  • Pathway-based gene selection


Until recently, gene by environment (GxE) interaction studies were performed by means of candidate approaches including only a small number of genes. Gene selection in candidate studies relies on 1) known functions of gene sets sharing biological processes, and/or functionally interacting within biological networks; or 2) the mode of action of the environmental factors through relevant pathways in which genes are involved[1]. With the advent of high-throughput genotyping technologies, GxE interactions are starting to be explored at the genome-wide level, but this approach involves the following difficulties: 1) the heterogeneity of environmental exposures; 2) the “agnostic” nature of the genome-wide approach, which does not make use of prior knowledge on biological processes and/or pathways; and 3) the requirement of stringent thresholds to declare a GxE interaction significant because of the very large number of statistical tests conducted[2].

In this scenario, the classical candidate gene approach can be extended to the selection of large sets of genes. In this paper, we propose a strategy for obtaining a large gene set that integrates the information on biological processes shared by genes, the canonical pathways to which they belong and the biological knowledge related to the environmental exposure studied in the gene selection process.

The asthma example

Asthma is a complex, heterogeneous, multifactorial disorder that results from genetic and environmental factors[3] and whose etiology remains poorly understood. The increase in asthma prevalence in recent decades has led to extensive research regarding the environmental determinants that may have changed over the last 30 years. There have also been considerable efforts to characterize the genetic determinants of asthma, including candidate gene studies, genome-wide linkage screens followed by positional cloning studies and, more recently, genome-wide association studies (GWAS)[4]. Although these studies have been successful in identifying novel loci, the genetic factors identified explain only a small part of the genetic component of asthma. One of the reasons is that many genetic factors are likely to be involved in the development, the activity and the severity of asthma. Furthermore, they act primarily through complex mechanisms that involve interactions with environmental factors, or with other genes through pathways or networks. The effect of such genetic factors may be missed if their interactions with the environment are not taken into account, or if genes are considered alone, regardless of the biological functions they share or the pathways they are involved in[5]. Overall, understanding the mechanisms through which genes and the environment interact represents one of the major challenges for pulmonary researchers. The first Genome-Wide Environment Interaction Study (GWEIS) in asthma[6] identified no statistically significant interaction at the genome-wide level, not even for single nucleotide polymorphisms (SNPs) that had been shown to interact with the environment in previous candidate studies.

In response to environmental exposures, adaptive responses that protect against environmental toxic insults are activated through metabolic pathways. Among the several metabolic pathways that could be investigated in asthma, the response to oxidative stress is of major interest: biological evidence for the role of oxidative stress in asthma is accumulating[7], and tobacco smoke is related to oxidative stress. Tobacco smoke is also a risk factor for asthma. Active smoking has been found to be associated with the incidence of asthma during adolescence in a dose-dependent manner[8] and with asthma severity in asthmatic cases[9]. Regular smoking was associated with increased risk of new-onset asthma among adolescents in a prospective cohort study[10], and active smoking has a deleterious effect on asthma[11]. To our knowledge, only one study has focused on gene by smoking interactions in adult asthma, by considering 18 key genes involved in the same pathway: the metabolism of xenobiotics. Some of these genes were also involved in the response to oxidative stress, and SNPs in seven of them were significantly associated with the risk of asthma in adult smokers or non-smokers[12].

Presentation of the hypothesis

In this paper, we propose a strategy for selecting genes to be investigated in GxE interaction studies. This strategy integrates the information on biological processes shared by the genes, the canonical pathways to which they belong and the biological knowledge related to the environment into the gene selection process. We hypothesize that this strategy will provide an expanded and enriched biologically plausible list of candidate genes for further GxE studies.

This strategy follows three successive steps (see Figure 1):

  1. Step 1 (gene selection): selection of a set of genes sharing a biological process known to be related to the outcome or disease of interest.
  2. Step 2 (pathway enrichment): selection of physically and/or chemically related gene pathways that are enriched in genes belonging to the gene set selected in step 1. Among the pathways that constitute a biological process, we considered the signaling and/or metabolic pathways, also known as canonical pathways, which are better suited to the subsequent environment integration step.
  3. Step 3 (environment integration): selection, among the pathways selected in step 2, of the canonical pathways known to be potentially related to the environmental factor of interest.

The final set of genes includes the genes selected in step 1 that belong to the canonical pathways selected in step 3. Note that step 3 critically relies on the user’s own expertise.
Figure 1. The three-step strategy.
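The three steps can be read as successive set operations. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `select_candidate_genes`, the gene symbols `G1`…`G5` and `X9`, and the pathway names `P_A`, `P_B`, `P_C` are all hypothetical, and the enrichment result of step 2 is taken as given.

```python
from typing import Dict, Iterable, List, Set, Tuple


def select_candidate_genes(
    step1_genes: Set[str],
    pathway_members: Dict[str, Set[str]],
    enriched_pathways: Iterable[str],
    environment_pathways: Set[str],
) -> Tuple[List[str], Set[str]]:
    """Keep enriched pathways judged relevant to the exposure (step 3),
    then return the step 1 genes that belong to at least one of them."""
    kept = [p for p in enriched_pathways if p in environment_pathways]
    final_genes: Set[str] = set()
    for pathway in kept:
        # Only genes already selected in step 1 can enter the final set.
        final_genes |= step1_genes & pathway_members[pathway]
    return kept, final_genes


# Hypothetical toy data (not taken from the paper).
step1 = {"G1", "G2", "G3", "G4", "G5"}          # step 1: genes sharing a process
members = {
    "P_A": {"G1", "G2", "X9"},                   # X9 is not in the step 1 set
    "P_B": {"G2", "G3"},
    "P_C": {"G4"},
}
enriched = ["P_A", "P_B"]                        # step 2: assumed enrichment result
expert_choice = {"P_A", "P_C"}                   # step 3: expert judgement

kept, final = select_candidate_genes(step1, members, enriched, expert_choice)
print(kept, sorted(final))
```

In this toy case `P_C` is dropped because it was not enriched in step 2, and `P_B` is dropped because the expert did not relate it to the exposure, so only genes from `P_A` that were in the step 1 set remain.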

Testing the hypothesis

To illustrate our strategy, we consider asthma as the disease, exposure to tobacco smoke as the environmental factor, and the genes involved in the response to oxidative stress.

Step 1 (gene selection)

The set of genes was obtained from the Gene Ontology (GO) database (Gene Ontology Consortium[13, 14]), as described in the online tutorial [see Additional file 1]. The GO project is a bioinformatics initiative that aims at standardizing the representation of genes and gene product attributes across species and databases. The project provides a controlled vocabulary of terms for describing gene product characteristics and gene product annotation data, as well as tools to access and process these data. We used the term “response to oxidative stress” (GO:0006979), which encompasses gene products that are involved in any process that results in a change in state or activity of a cell or an organism (in terms of movement, secretion, enzyme production, gene expression, etc.) as a result of oxidative stress, a state often resulting from exposure to high levels of reactive oxygen species, e.g. superoxide anions, hydrogen peroxide, and hydroxyl radicals. We obtained a set of 387 genes, including all genes previously investigated in candidate GxE interaction studies in respiratory epidemiology, such as MPO, CAT, GCLM, GCLC, GSTP1 and NQO1[15–21], and some of the genes in the study by Polonikov et al.[12]. We further enlarged the gene set by using our own expertise, GWAS literature reviews, and biological studies[22–26]. A total of 411 genes were then considered for the next step.
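As an illustration of step 1, the sketch below extracts gene symbols annotated to a GO term from tab-delimited annotation rows and merges them with a manually curated set. It is not the procedure of the online tutorial. It assumes rows laid out as in the GO Annotation File (GAF) format, where column 3 is the gene symbol, column 4 the qualifier and column 5 the GO identifier. The rows are hypothetical and truncated to five columns, the names `GENE_A`…`GENE_D` are invented, and only direct annotations are matched (annotations to descendant terms are not followed).

```python
from typing import Iterable, Set


def genes_for_term(gaf_lines: Iterable[str], go_id: str) -> Set[str]:
    """Collect symbols directly annotated to go_id, skipping NOT-qualified rows."""
    genes: Set[str] = set()
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):
            continue  # blank line or GAF comment/header line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # malformed row
        symbol, qualifier, term = cols[2], cols[3], cols[4]
        if term == go_id and "NOT" not in qualifier.split("|"):
            genes.add(symbol)
    return genes


# Hypothetical rows, truncated to the first five GAF columns.
toy_gaf = [
    "!gaf-version: 2.2",
    "DB\tID1\tGENE_A\tinvolved_in\tGO:0006979",
    "DB\tID2\tGENE_B\tNOT|involved_in\tGO:0006979",
    "DB\tID3\tGENE_C\tinvolved_in\tGO:0008150",
]

go_genes = genes_for_term(toy_gaf, "GO:0006979")
curated = {"GENE_D"}                 # genes added from expertise and literature
step1_genes = go_genes | curated     # union, so overlapping genes count once
print(sorted(step1_genes))
```

Using a set union means that a curated gene already present in the GO-derived set is not counted twice when the final number of step 1 genes is reported.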

Step 2 (pathway enrichment)

This step consists of identifying canonical pathways that contain a statistically significant excess of genes from the set of 411 genes selected in step 1. This pathway analysis can be conducted by using several tools, such as Ingenuity Pathway Analysis (IPA[27]) or Gene Set Enrichment Analysis (GSEA[28, 29]). These software solutions differ in terms of the biological databases they rely on (KEGG, BioCarta, Reactome, PubMed, STRING…) and the methods used to assess the statistical significance of the pathways.

IPA recognized all 411 gene symbols, whereas GSEA recognized only 390 of them. IPA gave 277 canonical pathways that each contained at least 5 of the 411 genes selected in step 1 and were significantly enriched in these genes (p < 0.05). IPA P-values for pathway enrichment testing were obtained with Fisher’s exact tests, with a Benjamini–Hochberg correction for multiple testing determined by the ratio of the number of genes from the gene set to the total number of genes in the pathways from the IPA library. GSEA provided no more than the top 100 canonical pathways (p < 1.06 × 10⁻¹²). Comparing the results provided by the two software packages is difficult because the names of the pathways and the genes involved in them are not standardized. Therefore, we decided to perform the third step with the largest list of pathways and genes, i.e. the 277 pathways obtained from IPA.
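To make the enrichment test concrete, here is a minimal standard-library sketch of a one-sided Fisher’s exact test for over-representation (equivalent to the upper tail of the hypergeometric distribution), followed by a Benjamini–Hochberg adjustment. It is an illustration only and does not reproduce IPA’s internal computation or its reference gene universe; the function names and all numbers in the example are hypothetical.

```python
from math import comb
from typing import List


def enrichment_p(k: int, K: int, n: int, N: int) -> float:
    """One-sided P(X >= k) for X ~ Hypergeometric:
    N genes in the background, K of them in the pathway,
    n genes in the selected set, k genes in both."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Return BH-adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: background of 10 genes, pathway of 4, set of 3, overlap 2.
p = enrichment_p(k=2, K=4, n=3, N=10)
print(round(p, 4))

adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print([round(a, 4) for a in adj])
```

A pathway would then be retained when its adjusted p-value falls below the chosen threshold and its overlap with the gene set reaches the minimum size, which is 5 genes in the text above.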

Step 3 (environment integration)

Based on our own expertise, we selected the canonical pathways identified in step 2 that are involved in tobacco smoke metabolism, thus allowing the step 1 gene set to be filtered. Among the 277 canonical pathways identified in step 2, we selected 28 (pathway enrichment P-values ranging from 2.63 × 10⁻² to 1.58 × 10⁻³¹) [see Additional file 2: Table S1 and Table S2]. These 28 pathways included between 5 and 47 genes (15–20 genes on average), with 61% of the genes being involved in more than one pathway. Two hundred and twenty-nine genes from the initial set of 411 genes did not map to any of the selected pathways and were dropped, leading to a final set of 182 genes (Table 1).
Table 1. Distribution of the 182 genes by canonical pathways involved in tobacco smoke metabolism

Canonical pathways    N of genes

NRF2-mediated Oxidative Stress Response
Glutathione Redox Reactions I
Xenobiotic Metabolism Signaling
Aryl Hydrocarbon Receptor Signaling
Mitochondrial Dysfunction
Glutathione-mediated Detoxification
Production of Nitric Oxide and Reactive Oxygen Species in Macrophages
Acute Phase Response Signaling
Antioxidant Action of Vitamin C
IL-8 Signaling
Apoptosis Signaling
Superpathway of Citrulline Metabolism
Superoxide Radicals Degradation
IL-6 Signaling
iNOS Signaling
VEGF Signaling
fMLP Signaling in Neutrophils
Chemokine Signaling
VEGF Family Ligand-Receptor Interactions
NF-κB Signaling
CCR5 Signaling in Macrophages
IL-17A Signaling in Airway Cells
Nucleotide Excision Repair Pathway
IL-1 Signaling
Nicotine Degradation II
Nicotine Degradation III
CCR3 Signaling in Eosinophils
eNOS Signaling

*P-values for pathway enrichment testing as calculated by IPA.
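The bookkeeping behind step 3 (genes retained, genes dropped, share of genes in more than one pathway, range of pathway sizes) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `summarize_selection`, the gene symbols and the pathway names are hypothetical, and the only figures taken from the text are 411, 229 and 182.

```python
from collections import Counter
from typing import Dict, Set


def summarize_selection(
    step1_genes: Set[str], selected_pathways: Dict[str, Set[str]]
) -> Dict[str, float]:
    """Summarize how the step 1 gene set maps onto the selected pathways."""
    counts: Counter = Counter()
    for members in selected_pathways.values():
        counts.update(members & step1_genes)   # count pathways per step 1 gene
    retained = set(counts)
    multi = sum(1 for c in counts.values() if c > 1)
    sizes = [len(m & step1_genes) for m in selected_pathways.values()]
    return {
        "retained": len(retained),
        "dropped": len(step1_genes - retained),
        "share_multi": multi / len(retained) if retained else 0.0,
        "min_size": min(sizes, default=0),
        "max_size": max(sizes, default=0),
    }


# Hypothetical toy data (not taken from the paper).
toy_genes = {"G1", "G2", "G3", "G4", "G5", "G6"}
toy_pathways = {"P_A": {"G1", "G2", "G3"}, "P_B": {"G2", "G3", "G4"}}
summary = summarize_selection(toy_genes, toy_pathways)
print(summary)

# Consistency check on the figures reported in the text.
assert 411 - 229 == 182
```

Here four of the six toy genes are retained, two are dropped, and half of the retained genes belong to both pathways.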

Implications of the hypothesis

The candidate pathway-based strategy described here selected a large number of candidate genes to be tested for interaction with tobacco smoke in asthma. This filtering strategy exploits recent developments in bioinformatics resources, which are combined in an original way with the literature and our own expertise on the metabolism of compounds related to a given environmental factor. The strategy could be applied to other environmental factors related to oxidative stress and asthma, such as outdoor air pollutants or the metabolism of cleaning agents. Beyond an expanded and enriched list of candidate genes, the value of such an approach also depends on accurate assessment of environmental exposure. Interestingly, the same list of genes can be used for GxE studies of other diseases characterized by oxidative stress and related to tobacco smoke, such as lung cancer. By appropriately integrating the knowledge of the environmental factor into the gene selection, we expect that the strategy proposed here will improve the ability to identify the joint effects and interactions of environmental and genetic factors, and will contribute to a better understanding of the etiology of complex diseases.




GxE: Gene by environment
GO: Gene ontology
GSEA: Gene set enrichment analysis
GWAS: Genome-wide association studies
GWEIS: Genome-wide environment interaction study
IPA: Ingenuity pathway analysis
SNP: Single nucleotide polymorphism.



Research funded in part by Agence Nationale de la Recherche (ANR) (ANR-2010-PRSP-003) and the Large-Scale Genome-Wide Association Study of Asthma (GABRIEL), a multidisciplinary study to identify the genetic and environmental causes of asthma in the European Community (contract 018996 from the European Commission).

Authors’ Affiliations

Inserm, Centre for research in Epidemiology and Population Health (CESP), U1018, Respiratory and Environmental Epidemiology Team, F-94807 Paris, Villejuif, France
University Paris-Sud, UMRS 1018, F-94807 Paris, Villejuif, France
Inserm, Centre for research in Epidemiology and Population Health (CESP), U1018, Biostatistics Team, F-94807 Paris, Villejuif, France
Inserm, U946, F-75010 Paris, France
Institut Universitaire d’Hématologie, University Paris Diderot, Sorbonne Paris Cité, F-75007 Paris, France


  1. Kauffmann F, Nadif R: Candidate gene-environment interactions. J Epidemiol Community Health. 2010, 64: 188-189. 10.1136/jech.2008.086199.
  2. Ober C, Vercelli D: Gene-environment interactions in human disease: nuisance or opportunity? Trends Genet. 2011, 27: 107-115. 10.1016/j.tig.2010.12.004.
  3. Von Mutius E: Gene-environment interactions in asthma. J Allergy Clin Immunol. 2009, 123: 3-11. 10.1016/j.jaci.2008.10.046.
  4. Holloway JW, Yang IA, Holgate ST: Genetics of allergic disease. J Allergy Clin Immunol. 2010, 125 (2 Suppl 2): 81-94.
  5. Liu C, Maity A, Lin X, Wright RO, Christiani DC: Design and analysis issues in gene and environment studies. Environ Health. 2012, 11: 93.
  6. Ege MJ, Strachan DP, Cookson WOCM, Moffatt MF, Gut I, Lathrop M, Kabesch M, Genuneit J, Büchele G, Sozanska B, Boznanski A, Cullinan P, Horak E, Bieli C, Braun-Fahrländer C, Heederik D, Von Mutius E: Gene-environment interaction for childhood asthma and exposure to farming in Central Europe. J Allergy Clin Immunol. 2011, 127: 138-144, 144.e1-4. 10.1016/j.jaci.2010.11.027.
  7. Chung KF, Marwick JA: Molecular mechanisms of oxidative stress in airways and lungs with reference to asthma and chronic obstructive pulmonary disease. Ann N Y Acad Sci. 2010, 1203: 85-91. 10.1111/j.1749-6632.2010.05600.x.
  8. Genuneit J, Weinmayr G, Radon K, Dressel H, Windstetter D, Rzehak P, Vogelberg C, Leupold W, Nowak D, Von Mutius E, Weiland SK: Smoking and the incidence of asthma during adolescence: results of a large cohort study in Germany. Thorax. 2006, 61: 572-578. 10.1136/thx.2005.051227.
  9. Siroux V, Pin I, Oryszczyn MP, Le Moual N, Kauffmann F: Relationships of active smoking to asthma and asthma severity in the EGEA study. Epidemiological study on the Genetics and Environment of Asthma. Eur Respir J. 2000, 15: 470-477. 10.1034/j.1399-3003.2000.15.08.x.
  10. Gilliland FD, Islam T, Berhane K, Gauderman WJ, McConnell R, Avol E, Peters JM: Regular smoking and asthma incidence in adolescents. Am J Respir Crit Care Med. 2006, 174: 1094-1100. 10.1164/rccm.200605-722OC.
  11. Vignoud L, Pin I, Boudier A, Pison C, Nadif R, Le Moual N, Slama R, Makao MN, Kauffmann F, Siroux V: Smoking and asthma: disentangling their mutual influences using a longitudinal approach. Respir Med. 2011, 105: 1805-1814. 10.1016/j.rmed.2011.07.005.
  12. Polonikov AV, Ivanov VP, Solodilova MA: Genetic variation of genes for xenobiotic-metabolizing enzymes and risk of bronchial asthma: the importance of gene-gene and gene-environment interactions for disease susceptibility. J Hum Genet. 2009, 54: 440-449. 10.1038/jhg.2009.58.
  13. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25: 25-29. 10.1038/75556.
  14. The Gene Ontology database, version 1.8. Date accessed: 12/2012.
  15. Islam T, Berhane K, McConnell R, Gauderman WJ, Avol E, Peters JM, Gilliland FD: Glutathione-S-transferase (GST) P1, GSTM1, exercise, ozone and asthma incidence in school children. Thorax. 2009, 64: 197-202. 10.1136/thx.2008.099366.
  16. Islam T, McConnell R, Gauderman WJ, Avol E, Peters JM, Gilliland FD: Ozone, oxidant defense genes, and risk of asthma during adolescence. Am J Respir Crit Care Med. 2008, 177: 388-395. 10.1164/rccm.200706-863OC.
  17. Castro-Giner F, Künzli N, Jacquemin B, Forsberg B, De Cid R, Sunyer J, Jarvis D, Briggs D, Vienneau D, Norback D, González JR, Guerra S, Janson C, Antó JM, Wjst M, Heinrich J, Estivill X, Kogevinas M: Traffic-related air pollution, oxidative stress genes, and asthma (ECHRS). Environ Health Perspect. 2009, 117: 1919-1924.
  18. Rogers AJ, Brasch-Andersen C, Ionita-Laza I, Murphy A, Sharma S, Klanderman BJ, Raby BA: The Interaction of Glutathione S-transferase M1-null Variants with Tobacco Smoke Exposure and the Development of Childhood Asthma. Clin Exp Allergy. 2009, 39: 1721-1729. 10.1111/j.1365-2222.2009.03372.x.
  19. Salam MT, Islam T, Gauderman WJ, Gilliland FD: Roles of arginase variants, atopy, and ozone in childhood asthma. J Allergy Clin Immunol. 2009, 123: 596-602, 602.e1-8. 10.1016/j.jaci.2008.11.030.
  20. Wenten M, Gauderman WJ, Berhane K, Lin PC, Peters J, Gilliland FD: Functional variants in the catalase and myeloperoxidase genes, ambient air pollution, and respiratory-related school absences: an example of epistasis in gene-environment interactions. Am J Epidemiol. 2009, 170: 1494-1501. 10.1093/aje/kwp310.
  21. Breton CV, Salam MT, Vora H, Gauderman WJ, Gilliland FD: Genetic variation in the glutathione synthesis pathway, air pollution, and children’s lung function growth. Am J Respir Crit Care Med. 2011, 183: 243-248. 10.1164/rccm.201006-0849OC.
  22. Elliott NA, Volkert MR: Stress induction and mitochondrial localization of Oxr1 proteins in yeast and humans. Mol Cell Biol. 2004, 24: 3180-3187. 10.1128/MCB.24.8.3180-3187.2004.
  23. Kaimul Ahsan M, Nakamura H, Tanito M, Yamada K, Utsumi H, Yodoi J: Thioredoxin-1 suppresses lung injury and apoptosis induced by diesel exhaust particles (DEP) by scavenging reactive oxygen species and by inhibiting DEP-induced downregulation of Akt. Free Radic Biol Med. 2005, 39: 1549-1559. 10.1016/j.freeradbiomed.2005.07.016.
  24. Nickel C, Trujillo M, Rahlfs S, Deponte M, Radi R, Becker K: Plasmodium falciparum 2-Cys peroxiredoxin reacts with plasmoredoxin and peroxynitrite. Biol Chem. 2005, 386: 1129-1136.
  25. Tomita M, Okuyama T, Katsuyama H, Hidaka K, Otsuki T, Ishikawa T: Gene expression in rat lungs during early response to paraquat-induced oxidative stress. Int J Mol Med. 2006, 17: 37-44.
  26. Tseng CF, Huang HY, Yang YT, Mao SJT: Purification of human haptoglobin 1–1, 2–1, and 2–2 using monoclonal antibody affinity chromatography. Protein Expr Purif. 2004, 33: 265-273. 10.1016/j.pep.2003.09.006.
  27. IPA: Ingenuity® Systems. Date accessed: 01/2013.
  28. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005, 102: 15545-15550. 10.1073/pnas.0506580102.
  29. Molecular Signatures Database v3.1, updated Sep 27, 2012. Date accessed: 05/2013.


© Rava et al.; licensee BioMed Central Ltd. 2013

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

