ABSTRACT
MOTIVATION: Genome-wide association studies are beginning to elucidate how our genetic differences contribute to susceptibility and severity of disease. While computational tools have previously been developed to support various aspects of genome-wide association studies, there is currently a need for informatics solutions that facilitate the integration of data from multiple sources. RESULTS: Here we present GWAS Analyzer, a database-driven, web-based tool that integrates genotype and phenotype data, association analysis results and genomic annotations from multiple public resources. GWAS Analyzer contains features for browsing these interrelated data, exploring phenotypic values by family or genotype, and filtering association results based on multiple criteria. The utility of the tool has been demonstrated by a genome-wide association study of human in vitro susceptibility to bacterial infection. GWAS Analyzer facilitated management of large sets of phenotype and genotype data, analysis of phenotypic variation and heritability, and, most importantly, generation of a refined set of candidate single nucleotide polymorphisms (SNPs). The tool revealed a SNP that was experimentally validated to be associated with increased cell death among Salmonella-infected HapMap cell lines.
Subjects
Computational Biology/methods , Genome , Genotype , Phenotype , Software , Genetic Databases , Genome-Wide Association Study , Single Nucleotide Polymorphism
ABSTRACT
Recent progress in cataloguing common genetic variation has made possible genome-wide studies that are beginning to elucidate the causes and consequences of our genetic differences. Approaches that provide a mechanistic understanding of how genetic variants function to alter disease susceptibility and why they were substrates of natural selection would complement other approaches to human-genome analysis. Here we use a novel cell-based screen of bacterial infection to identify human variation in Salmonella-induced cell death. A loss-of-function allele of CARD8, a reported inhibitor of the proinflammatory protease caspase-1, was associated with increased cell death in vitro (p = 0.013). The validity of this association was demonstrated through overexpression of alternative alleles and RNA interference in cells of varying genotype. Comparison of mammalian CARD8 orthologs and examination of variation among different human populations suggest that the increase in infectious-disease burden associated with larger animal groups (i.e., herds and colonies), and possibly human population expansion, may have naturally selected for loss of CARD8. We also find that the loss-of-function CARD8 allele shows a modest association with an increased risk of systemic inflammatory response syndrome in a small study (p = 0.05). Therefore, a by-product of the selected benefit of loss of CARD8 could be increased inflammatory diseases. These results demonstrate the utility of genome-wide cell-based association screens with microbes in the identification of naturally selected variants that can impact human health.
Subjects
Bacterial Infections/genetics , Genetic Variation , Human Genome , Genome-Wide Association Study , Immune System Phenomena , Alleles , CARD Signaling Adaptor Proteins/genetics , Population Genetics , Genotype , Humans , Neoplasm Proteins/genetics , Single Nucleotide Polymorphism , Salmonella typhimurium/genetics , Salmonella typhimurium/metabolism
ABSTRACT
BACKGROUND: The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. RESULTS: PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. CONCLUSION: PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. 
A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser, with no client-side software setup or installation required. Source code is freely available to researchers interested in setting up a local version of PSAT for analysis of genomes not available through the public server. Access to the public web server and instructions for obtaining source code can be found at http://www.nwrce.org/psat.
Subjects
Algorithms , Chromosome Mapping/methods , Archaeal Genome/genetics , Bacterial Genome/genetics , Internet , Sequence Alignment/methods , DNA Sequence Analysis/methods , Software , Base Sequence , Molecular Sequence Data
ABSTRACT
BACKGROUND: Francisella tularensis subspecies tularensis and holarctica are pathogenic to humans, whereas the two other subspecies, novicida and mediasiatica, rarely cause disease. To uncover the factors that allow subspecies tularensis and holarctica to be pathogenic to humans, we compared their genome sequences with the genome sequence of Francisella tularensis subspecies novicida U112, which is nonpathogenic to humans. RESULTS: Comparison of the genomes of human pathogenic Francisella strains with the genome of U112 identifies genes specific to the human pathogenic strains and reveals pseudogenes that previously were unidentified. In addition, this analysis provides a coarse chronology of the evolutionary events that took place during the emergence of the human pathogenic strains. Genomic rearrangements at the level of insertion sequences (IS elements), point mutations, and small indels took place in the human pathogenic strains during and after differentiation from the nonpathogenic strain, resulting in gene inactivation. CONCLUSION: The chronology of events suggests a substantial role for genetic drift in the formation of pseudogenes in Francisella genomes. Mutations that occurred early in the evolution, however, might have been fixed in the population either because of evolutionary bottlenecks or because they were pathoadaptive (beneficial in the context of infection). Because the structure of Francisella genomes is similar to that of the genomes of other emerging or highly pathogenic bacteria, this evolutionary scenario may be shared by pathogens from other species.