Search | VHL Regional Portal

1.

Updates of the HbVar database of human hemoglobin variants and thalassemia mutations.

Giardine, Belinda; Borg, Joseph; Viennas, Emmanouil; Pavlidis, Cristiana; Moradkhani, Kamran; Joly, Philippe; Bartsakoulia, Marina; Riemer, Cathy; Miller, Webb; Tzimas, Giannis; Wajcman, Henri; Hardison, Ross C; Patrinos, George P.

Nucleic Acids Res ; 42(Database issue): D1063-9, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24137000

ABSTRACT

HbVar (http://globin.bx.psu.edu/hbvar) is one of the oldest and most appreciated locus-specific databases launched in 2001 by a multi-center academic effort to provide timely information on the genomic alterations leading to hemoglobin variants and all types of thalassemia and hemoglobinopathies. Database records include extensive phenotypic descriptions, biochemical and hematological effects, associated pathology and ethnic occurrence, accompanied by mutation frequencies and references. Here, we report updates to >600 HbVar entries, inclusion of population-specific data for 28 populations and 27 ethnic groups for α-, and β-thalassemias and additional querying options in the HbVar query page. HbVar content was also inter-connected with two other established genetic databases, namely FINDbase (http://www.findbase.org) and Leiden Open-Access Variation database (http://www.lovd.nl), which allows comparative data querying and analysis. HbVar data content has contributed to the realization of two collaborative projects to identify genomic variants that lie on different globin paralogs. Most importantly, HbVar data content has contributed to demonstrate the microattribution concept in practice. These updates significantly enriched the database content and querying potential, enhanced the database profile and data quality and broadened the inter-relation of HbVar with other databases, which should increase the already high impact of this resource to the globin and genetic database community.

Subjects

Databases, Nucleic Acid , Genetic Variation , Hemoglobins/genetics , Mutation , Thalassemia/genetics , Genotype , Humans , Internet , Phenotype , Thalassemia/ethnology

2.

Galaxy tools to study genome diversity.

Bedoya-Reina, Oscar C; Ratan, Aakrosh; Burhans, Richard; Kim, Hie Lim; Giardine, Belinda; Riemer, Cathy; Li, Qunhua; Olson, Thomas L; Loughran, Thomas P; Vonholdt, Bridgett M; Perry, George H; Schuster, Stephan C; Miller, Webb.

Gigascience ; 2(1): 17, 2013 Dec 30.

Article in English | MEDLINE | ID: mdl-24377391

ABSTRACT

BACKGROUND: Intra-species genetic variation can be used to investigate population structure, selection, and gene flow in non-model vertebrates; and due to the plummeting costs for genome sequencing, it is now possible for small labs to obtain full-genome variation data from their species of interest. However, those labs may not have easy access to, and familiarity with, computational tools to analyze those data. RESULTS: We have created a suite of tools for the Galaxy web server aimed at handling nucleotide and amino-acid polymorphisms discovered by full-genome sequencing of several individuals of the same species, or using a SNP genotyping microarray. In addition to providing user-friendly tools, a main goal is to make published analyses reproducible. While most of the examples discussed in this paper deal with nuclear-genome diversity in non-human vertebrates, we also illustrate the application of the tools to fungal genomes, human biomedical data, and mitochondrial sequences. CONCLUSIONS: This project illustrates that a small group can design, implement, test, document, and distribute a Galaxy tool collection to meet the needs of a particular community of biologists.

3.

Some phenotype association tools in Galaxy: looking for disease SNPs in a full genome.

Giardine, Belinda M; Riemer, Cathy; Burhans, Richard; Ratan, Aakrosh; Miller, Webb.

Curr Protoc Bioinformatics ; Chapter 15: 15.2.1-15.2.27, 2012 Sep.

Article in English | MEDLINE | ID: mdl-22948727

ABSTRACT

This unit focuses on some of the tools available on the public Galaxy server that are useful for exploring possible associations between human genetic variants and phenotypes. We trace step-by-step through an example illustrating several methods for examining a single full-coverage genome to look for single-nucleotide polymorphisms (SNPs) that are either known to be associated with disease or suspected to have impact for other reasons. It makes use of public genomic data, tools designed specifically for working with variants, and also some general tools for text manipulation and operations on genomic coordinates.

Subjects

Phenotype , Polymorphism, Single Nucleotide , Software , Genetic Variation , Genome, Human , Humans

4.

Revealing mammalian evolutionary relationships by comparative analysis of gene clusters.

Song, Giltae; Riemer, Cathy; Dickins, Benjamin; Kim, Hie Lim; Zhang, Louxin; Zhang, Yu; Hsu, Chih-Hao; Hardison, Ross C; Green, Eric D; Miller, Webb.

Genome Biol Evol ; 4(4): 586-601, 2012.

Article in English | MEDLINE | ID: mdl-22454131

ABSTRACT

Many software tools for comparative analysis of genomic sequence data have been released in recent decades. Despite this, it remains challenging to determine evolutionary relationships in gene clusters due to their complex histories involving duplications, deletions, inversions, and conversions. One concept describing these relationships is orthology. Orthologs derive from a common ancestor by speciation, in contrast to paralogs, which derive from duplication. Discriminating orthologs from paralogs is a necessary step in most multispecies sequence analyses, but doing so accurately is impeded by the occurrence of gene conversion events. We propose a refined method of orthology assignment based on two paradigms for interpreting its definition: by genomic context or by sequence content. X-orthology (based on context) traces orthology resulting from speciation and duplication only, while N-orthology (based on content) includes the influence of conversion events. We developed a computational method for automatically mapping both types of orthology on a per-nucleotide basis in gene cluster regions studied by comparative sequencing, and we make this mapping accessible by visualizing the output. All of these steps are incorporated into our newly extended CHAP 2 package. We evaluate our method using both simulated data and real gene clusters (including the well-characterized α-globin and β-globin clusters). We also illustrate use of CHAP 2 by analyzing four more loci: CCL (chemokine ligand), IFN (interferon), CYP2abf (part of cytochrome P450 family 2), and KIR (killer cell immunoglobulin-like receptors). These new methods facilitate and extend our understanding of evolution at these and other loci by adding automated accurate evolutionary inference to the biologist's toolkit. The CHAP 2 package is freely available from http://www.bx.psu.edu/miller_lab.

Subjects

Evolution, Molecular , Mammals/genetics , Multigene Family , Proteins/genetics , Animals , Gene Conversion , Gene Duplication , Genome , Humans , Mammals/classification , Phylogeny

5.

Conversion events in gene clusters.

Song, Giltae; Hsu, Chih-Hao; Riemer, Cathy; Zhang, Yu; Kim, Hie Lim; Hoffmann, Federico; Zhang, Louxin; Hardison, Ross C; Green, Eric D; Miller, Webb.

BMC Evol Biol ; 11: 226, 2011 Jul 28.

Article in English | MEDLINE | ID: mdl-21798034

ABSTRACT

BACKGROUND: Gene clusters containing multiple similar genomic regions in close proximity are of great interest for biomedical studies because of their associations with inherited diseases. However, such regions are difficult to analyze due to their structural complexity and their complicated evolutionary histories, reflecting a variety of large-scale mutational events. In particular, conversion events can mislead inferences about the relationships among these regions, as traced by traditional methods such as construction of phylogenetic trees or multi-species alignments. RESULTS: To correct the distorted information generated by such methods, we have developed an automated pipeline called CHAP (Cluster History Analysis Package) for detecting conversion events. We used this pipeline to analyze the conversion events that affected two well-studied gene clusters (α-globin and β-globin) and three gene clusters for which comparative sequence data were generated from seven primate species: CCL (chemokine ligand), IFN (interferon), and CYP2abf (part of cytochrome P450 family 2). CHAP is freely available at http://www.bx.psu.edu/miller_lab. CONCLUSIONS: These studies reveal the value of characterizing conversion events in the context of studying gene clusters in complex genomes.

Subjects

Gene Conversion , Multigene Family , Primates/genetics , alpha-Globins/genetics , beta-Globins/genetics , Animals , Evolution, Molecular , Genome , Humans , Molecular Sequence Data , Phylogeny , Primates/classification , Software

6.

Systematic documentation and analysis of human genetic variation in hemoglobinopathies using the microattribution approach.

Giardine, Belinda; Borg, Joseph; Higgs, Douglas R; Peterson, Kenneth R; Philipsen, Sjaak; Maglott, Donna; Singleton, Belinda K; Anstee, David J; Basak, A Nazli; Clark, Barnaby; Costa, Flavia C; Faustino, Paula; Fedosyuk, Halyna; Felice, Alex E; Francina, Alain; Galanello, Renzo; Gallivan, Monica V E; Georgitsi, Marianthi; Gibbons, Richard J; Giordano, Piero C; Harteveld, Cornelis L; Hoyer, James D; Jarvis, Martin; Joly, Philippe; Kanavakis, Emmanuel; Kollia, Panagoula; Menzel, Stephan; Miller, Webb; Moradkhani, Kamran; Old, John; Papachatzopoulou, Adamantia; Papadakis, Manoussos N; Papadopoulos, Petros; Pavlovic, Sonja; Perseu, Lucia; Radmilovic, Milena; Riemer, Cathy; Satta, Stefania; Schrijver, Iris; Stojiljkovic, Maja; Thein, Swee Lay; Traeger-Synodinos, Jan; Tully, Ray; Wada, Takahito; Waye, John S; Wiemann, Claudia; Zukic, Branka; Chui, David H K; Wajcman, Henri; Hardison, Ross C.

Nat Genet ; 43(4): 295-301, 2011 Mar 20.

Article in English | MEDLINE | ID: mdl-21423179

ABSTRACT

We developed a series of interrelated locus-specific databases to store all published and unpublished genetic variation related to hemoglobinopathies and thalassemia and implemented microattribution to encourage submission of unpublished observations of genetic variation to these public repositories. A total of 1,941 unique genetic variants in 37 genes, encoding globins and other erythroid proteins, are currently documented in these databases, with reciprocal attribution of microcitations to data contributors. Our project provides the first example of implementing microattribution to incentivise submission of all known genetic variation in a defined system. It has demonstrably increased the reporting of human variants, leading to a comprehensive online resource for systematically describing human genetic variation in the globin genes and other genes contributing to hemoglobinopathies and thalassemias. The principles established here will serve as a model for other systems and for the analysis of other common and/or complex human genetic diseases.

Subjects

Databases, Genetic , Genetic Variation , Hemoglobinopathies/genetics , Base Sequence , DNA/genetics , Data Mining , Genome, Human , Hemoglobins/genetics , Human Genome Project , Humans , Molecular Sequence Data , Mutation , Promoter Regions, Genetic , Publishing

7.

Evaluation of methods for detecting conversion events in gene clusters.

Song, Giltae; Hsu, Chih-Hao; Riemer, Cathy; Miller, Webb.

BMC Bioinformatics ; 12 Suppl 1: S45, 2011 Feb 15.

Article in English | MEDLINE | ID: mdl-21342577

ABSTRACT

BACKGROUND: Gene clusters are genetically important, but their analysis poses significant computational challenges. One of the major reasons for these difficulties is gene conversion among the duplicated regions of the cluster, which can obscure their true relationships. Many computational methods for detecting gene conversion events have been released, but their performance has not been assessed for wide deployment in evolutionary history studies due to a lack of accurate evaluation methods. RESULTS: We designed a new method that simulates gene cluster evolution, including large-scale events of duplication, deletion, and conversion as well as small mutations. We used this simulation data to evaluate several different programs for detecting gene conversion events. CONCLUSIONS: Our evaluation identifies strengths and weaknesses of several methods for detecting gene conversion, which can contribute to more accurate analysis of gene cluster evolution.
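
As a rough illustration of the kind of simulation this abstract describes, the Python sketch below duplicates an ancestral sequence, lets the two copies diverge by point substitutions, and then applies one conversion event that overwrites a tract of one copy with the other. It is not the authors' simulator: the function names, default parameters, and the uniform substitution model are all assumptions made for illustration, and deletions are not modeled.

```python
import random

BASES = "ACGT"

def point_mutate(seq, n, rng):
    """Substitute n distinct positions, each with a different base."""
    chars = list(seq)
    for pos in rng.sample(range(len(chars)), n):
        chars[pos] = rng.choice([b for b in BASES if b != chars[pos]])
    return "".join(chars)

def convert(donor, acceptor, start, end):
    """Gene conversion: overwrite acceptor[start:end] with the donor tract.

    Assumes both copies have the same length (no indels in this toy model).
    """
    return acceptor[:start] + donor[start:end] + acceptor[end:]

def identity(a, b):
    """Fraction of aligned positions that match (equal-length sequences)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def simulate(length=200, mutations=30, tract=(50, 150), seed=0):
    """Hypothetical toy history: duplication, divergence, one conversion."""
    rng = random.Random(seed)
    ancestor = "".join(rng.choice(BASES) for _ in range(length))
    copy_a = point_mutate(ancestor, mutations, rng)  # first paralog
    copy_b = point_mutate(ancestor, mutations, rng)  # second paralog
    converted_b = convert(copy_a, copy_b, *tract)    # A converts part of B
    return copy_a, copy_b, converted_b

if __name__ == "__main__":
    a, b, b_conv = simulate()
    print("identity before conversion:", identity(a, b))
    print("identity after conversion: ", identity(a, b_conv))
```

After conversion the two paralogs are identical across the converted tract, so a similarity-based comparison makes them look more closely related than their duplication history implies, which is the distortion that conversion-detection programs are meant to expose.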

Subjects

Computational Biology/methods , Gene Conversion , Multigene Family , Animals , Biological Evolution , Computer Simulation , Humans , Primates/genetics , Sequence Alignment

8.

MultiPipMaker: a comparative alignment server for multiple DNA sequences.

Elnitski, Laura; Burhans, Richard; Riemer, Cathy; Hardison, Ross; Miller, Webb.

Curr Protoc Bioinformatics ; Chapter 10: 10.4.1-10.4.14, 2010 Jun.

Article in English | MEDLINE | ID: mdl-20521245

ABSTRACT

The MultiPipMaker World Wide Web server (http://www.bx.psu.edu) provides a tool for aligning multiple DNA sequences and visualizing regions of conservation among them. This unit describes its use and gives an explanation of the resulting output files and supporting tools. Features provided by the server include alignment of up to 20 very long genomic sequences, output choices of a true, nucleotide-level multiple alignment and/or stacked, pairwise percent identity plots, and support for user-specified annotations of genomic features and arbitrary regions, with clickable links to additional information. Input sequences other than the reference can be fragmented, unordered, and unoriented.

Subjects

Computational Biology/methods , Internet , Sequence Alignment/methods , Software , Base Sequence , Genome/genetics , Guidelines as Topic

9.

Complete Khoisan and Bantu genomes from southern Africa.

Schuster, Stephan C; Miller, Webb; Ratan, Aakrosh; Tomsho, Lynn P; Giardine, Belinda; Kasson, Lindsay R; Harris, Robert S; Petersen, Desiree C; Zhao, Fangqing; Qi, Ji; Alkan, Can; Kidd, Jeffrey M; Sun, Yazhou; Drautz, Daniela I; Bouffard, Pascal; Muzny, Donna M; Reid, Jeffrey G; Nazareth, Lynne V; Wang, Qingyu; Burhans, Richard; Riemer, Cathy; Wittekindt, Nicola E; Moorjani, Priya; Tindall, Elizabeth A; Danko, Charles G; Teo, Wee Siang; Buboltz, Anne M; Zhang, Zhenhai; Ma, Qianyi; Oosthuysen, Arno; Steenkamp, Abraham W; Oostuisen, Hermann; Venter, Philippus; Gajewski, John; Zhang, Yu; Pugh, B Franklin; Makova, Kateryna D; Nekrutenko, Anton; Mardis, Elaine R; Patterson, Nick; Pringle, Tom H; Chiaromonte, Francesca; Mullikin, James C; Eichler, Evan E; Hardison, Ross C; Gibbs, Richard A; Harkins, Timothy T; Hayes, Vanessa M.

Nature ; 463(7283): 943-7, 2010 Feb 18.

Article in English | MEDLINE | ID: mdl-20164927

ABSTRACT

The genetic structure of the indigenous hunter-gatherer peoples of southern Africa, the oldest known lineage of modern human, is important for understanding human diversity. Studies based on mitochondrial and small sets of nuclear markers have shown that these hunter-gatherers, known as Khoisan, San, or Bushmen, are genetically divergent from other humans. However, until now, fully sequenced human genomes have been limited to recently diverged populations. Here we present the complete genome sequences of an indigenous hunter-gatherer from the Kalahari Desert and a Bantu from southern Africa, as well as protein-coding regions from an additional three hunter-gatherers from disparate regions of the Kalahari. We characterize the extent of whole-genome and exome diversity among the five men, reporting 1.3 million novel DNA differences genome-wide, including 13,146 novel amino acid variants. In terms of nucleotide substitutions, the Bushmen seem to be, on average, more different from each other than, for example, a European and an Asian. Observed genomic differences between the hunter-gatherers and others may help to pinpoint genetic adaptations to an agricultural lifestyle. Adding the described variants to current databases will facilitate inclusion of southern Africans in medical research efforts, particularly when family and medical histories can be correlated with genome-wide data.

Subjects

Black People/genetics , Ethnicity/genetics , Genome, Human/genetics , Asian People/genetics , Exons/genetics , Genetics, Medical , Humans , Phylogeny , Polymorphism, Single Nucleotide/genetics , South Africa/ethnology , White People/genetics

10.

PhenCode: connecting ENCODE data with mutations and phenotype.

Giardine, Belinda; Riemer, Cathy; Hefferon, Tim; Thomas, Daryl; Hsu, Fan; Zielenski, Julian; Sang, Yunhua; Elnitski, Laura; Cutting, Garry; Trumbower, Heather; Kern, Andrew; Kuhn, Robert; Patrinos, George P; Hughes, Jim; Higgs, Doug; Chui, David; Scriver, Charles; Phommarinh, Manyphong; Patnaik, Santosh K; Blumenfeld, Olga; Gottlieb, Bruce; Vihinen, Mauno; Väliaho, Jouni; Kent, Jim; Miller, Webb; Hardison, Ross C.

Hum Mutat ; 28(6): 554-62, 2007 Jun.

Article in English | MEDLINE | ID: mdl-17326095

ABSTRACT

PhenCode (Phenotypes for ENCODE; http://www.bx.psu.edu/phencode) is a collaborative, exploratory project to help understand phenotypes of human mutations in the context of sequence and functional data from genome projects. Currently, it connects human phenotype and clinical data in various locus-specific databases (LSDBs) with data on genome sequences, evolutionary history, and function from the ENCODE project and other resources in the UCSC Genome Browser. Initially, we focused on a few selected LSDBs covering genes encoding alpha- and beta-globins (HBA, HBB), phenylalanine hydroxylase (PAH), blood group antigens (various genes), androgen receptor (AR), cystic fibrosis transmembrane conductance regulator (CFTR), and Bruton's tyrosine kinase (BTK), but we plan to include additional loci of clinical importance, ultimately genomewide. We have also imported variant data and associated OMIM links from Swiss-Prot. Users can find interesting mutations in the UCSC Genome Browser (in a new Locus Variants track) and follow links back to the LSDBs for more detailed information. Alternatively, they can start with queries on mutations or phenotypes at an LSDB and then display the results at the Genome Browser to view complementary information such as functional data (e.g., chromatin modifications and protein binding from the ENCODE consortium), evolutionary constraint, regulatory potential, and/or any other tracks they choose. We present several examples illustrating the power of these connections for exploring phenotypes associated with functional elements, and for identifying genomic data that could help to explain clinical phenotypes.

Subjects

Databases, Genetic , Mutation , Phenotype , Agammaglobulinaemia Tyrosine Kinase , Blood Group Antigens/genetics , Cooperative Behavior , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Databases, Genetic/standards , Genotype , Globins/genetics , Humans , Internet , Phenylalanine Hydroxylase/genetics , Protein-Tyrosine Kinases/genetics , Receptors, Androgen/genetics , Software Design , Systems Integration

11.

HbVar database of human hemoglobin variants and thalassemia mutations: 2007 update.

Giardine, Belinda; van Baal, Sjozef; Kaimakis, Polynikis; Riemer, Cathy; Miller, Webb; Samara, Maria; Kollia, Panagoula; Anagnou, Nicholas P; Chui, David H K; Wajcman, Henri; Hardison, Ross C; Patrinos, George P.

Hum Mutat ; 28(2): 206, 2007 Feb.

Article in English | MEDLINE | ID: mdl-17221864

ABSTRACT

HbVar (http://globin.bx.psu.edu/hbvar) is a locus-specific database (LSDB) developed in 2001 by a multi-center academic effort to provide timely information on the genomic sequence changes leading to hemoglobin variants and all types of thalassemia and hemoglobinopathies. Database records include extensive phenotypic descriptions, biochemical and hematological effects, associated pathology, and ethnic occurrence, accompanied by mutation frequencies and references. In addition to the regular updates to entries, we report significant advances and updates, which can be useful not only for HbVar users but also for other LSDB development and curation in general. The query page provides more functionality but in a simpler, more user-friendly format and known single nucleotide polymorphisms in the human alpha- and beta-globin loci are provided automatically. Population-specific beta-thalassemia mutation frequencies for 31 population groups have been added and/or modified and the previously reported delta- and alpha-thalassemia mutation frequency data from 10 population groups have also been incorporated. In addition, an independent flat-file database, named XPRbase (http://www.goldenhelix.org/xprbase), has been developed and linked to the main HbVar web page to provide a succinct listing of 51 experimental protocols available for globin gene mutation screening. These updates significantly augment the database profile and quality of information provided, which should increase the already high impact of the HbVar database, while its combination with the UCSC powerful genome browser and the ITHANET web portal paves the way for drawing connections of clinical importance, that is from genome to function to phenotype.

Subjects

Databases, Nucleic Acid , Genetic Variation , Hemoglobins/genetics , Mutation , Thalassemia/genetics , DNA Mutational Analysis/methods , Genetic Testing/methods , Humans , Multigene Family

12.

Galaxy: a platform for interactive large-scale genome analysis.

Giardine, Belinda; Riemer, Cathy; Hardison, Ross C; Burhans, Richard; Elnitski, Laura; Shah, Prachi; Zhang, Yi; Blankenberg, Daniel; Albert, Istvan; Taylor, James; Miller, Webb; Kent, W James; Nekrutenko, Anton.

Genome Res ; 15(10): 1451-5, 2005 Oct.

Article in English | MEDLINE | ID: mdl-16169926

ABSTRACT

Accessing and analyzing the exponentially expanding genomic sequence and functional data pose a challenge for biomedical researchers. Here we describe an interactive system, Galaxy, that combines the power of existing genome annotation databases with a simple Web portal to enable users to search remote resources, combine data from independent queries, and visualize the results. The heart of Galaxy is a flexible history system that stores the queries from each user; performs operations such as intersections, unions, and subtractions; and links to other computational tools. Galaxy can be accessed at http://g2.bx.psu.edu.
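
The intersection, union, and subtraction operations mentioned in this abstract can be pictured as set operations on genomic intervals. The Python sketch below is only an illustration of that idea under stated assumptions: intervals are half-open `(start, end)` pairs on a single chromosome, and the function names are hypothetical rather than part of Galaxy's actual code or API.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome

def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def union(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Regions covered by a, by b, or by both."""
    return merge(a + b)

def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Regions covered by both a and b (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def subtract(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Regions covered by a but not by b."""
    b = merge(b)
    out: List[Interval] = []
    for start, end in merge(a):
        cursor = start
        for b_start, b_end in b:
            if b_end <= cursor:
                continue          # b interval lies entirely to the left
            if b_start >= end:
                break             # b interval lies entirely to the right
            if b_start > cursor:
                out.append((cursor, b_start))
            cursor = max(cursor, b_end)
        if cursor < end:
            out.append((cursor, end))
    return out

if __name__ == "__main__":
    genes = [(100, 200), (300, 400)]   # made-up example coordinates
    conserved = [(150, 350)]
    print(intersect(genes, conserved))  # [(150, 200), (300, 350)]
    print(subtract(genes, conserved))   # [(100, 150), (350, 400)]
    print(union(genes, conserved))      # [(100, 400)]
```

Merging each input first keeps the later sweeps simple, at the cost of discarding per-interval annotations; a real system would also track chromosome names and strand.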

Subjects

Databases, Genetic , Genome , Biological Evolution , Internet , Promoter Regions, Genetic

13.

MultiPipMaker: comparative alignment server for multiple DNA sequences.

Elnitski, Laura; Riemer, Cathy; Burhans, Richard; Hardison, Ross; Miller, Webb.

Curr Protoc Bioinformatics ; Chapter 10: Unit10.4, 2005 Apr.

Article in English | MEDLINE | ID: mdl-18428743

ABSTRACT

The MultiPipMaker World Wide Web server (http://www.bx.psu.edu) provides a useful tool for aligning multiple sequences and visualizing regions of conservation between them. This unit describes the use of the MultiPipMaker server and gives an explanation of the resulting output files and supporting tools. Features provided by the server include alignment of up to 20 very long genomic sequences, output choices of a true, nucleotide-level multiple alignment or stacked, pairwise percent identity plots, and user-specified annotations for genomic features and elements of choice, with clickable links to additional information. Alignments can include unordered, unoriented secondary sequences.

Subjects

Algorithms , Internet , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , User-Computer Interface , Base Sequence , Computer Graphics , Molecular Sequence Data

14.

Improvements to GALA and dbERGE II: databases featuring genomic sequence alignment, annotation and experimental results.

Elnitski, Laura; Giardine, Belinda; Shah, Prachi; Zhang, Yi; Riemer, Cathy; Weirauch, Matthew; Burhans, Richard; Miller, Webb; Hardison, Ross C.

Nucleic Acids Res ; 33(Database issue): D466-70, 2005 Jan 01.

Article in English | MEDLINE | ID: mdl-15608239

ABSTRACT

We describe improvements to two databases that give access to information on genomic sequence similarities, functional elements in DNA and experimental results that demonstrate those functions. GALA, the database of Genome ALignments and Annotations, is now a set of interlinked relational databases for five vertebrate species, human, chimpanzee, mouse, rat and chicken. For each species, GALA records pairwise and multiple sequence alignments, scores derived from those alignments that reflect the likelihood of being under purifying selection or being a regulatory element, and extensive annotations such as genes, gene expression patterns and transcription factor binding sites. The user interface supports simple and complex queries, including operations such as subtraction and intersections as well as clustering and finding elements in proximity to features. dbERGE II, the database of Experimental Results on Gene Expression, contains experimental data from a variety of functional assays. Both databases are now run on the DB2 database management system. Improved hardware and tuning has reduced response times and increased querying capacity, while simplified query interfaces will help direct new users through the querying process. Links are available at http://www.bx.psu.edu/.

Subjects

Databases, Nucleic Acid , Genomics , Sequence Alignment , Animals , Chickens/genetics , Database Management Systems , Humans , Mice , Pan troglodytes/genetics , Rats , User-Computer Interface

15.

Aligning multiple genomic sequences with the threaded blockset aligner.

Blanchette, Mathieu; Kent, W James; Riemer, Cathy; Elnitski, Laura; Smit, Arian F A; Roskin, Krishna M; Baertsch, Robert; Rosenbloom, Kate; Clawson, Hiram; Green, Eric D; Haussler, David; Miller, Webb.

Genome Res ; 14(4): 708-15, 2004 Apr.

Article in English | MEDLINE | ID: mdl-15060014

ABSTRACT

We define a "threaded blockset," which is a novel generalization of the classic notion of a multiple alignment. A new computer program called TBA (for "threaded blockset aligner") builds a threaded blockset under the assumption that all matching segments occur in the same order and orientation in the given sequences; inversions and duplications are not addressed. TBA is designed to be appropriate for aligning many, but by no means all, megabase-sized regions of multiple mammalian genomes. The output of TBA can be projected onto any genome chosen as a reference, thus guaranteeing that different projections present consistent predictions of which genomic positions are orthologous. This capability is illustrated using a new visualization tool to view TBA-generated alignments of vertebrate Hox clusters from both the mammalian and fish perspectives. Experimental evaluation of alignment quality, using a program that simulates evolutionary change in genomic sequences, indicates that TBA is more accurate than earlier programs. To perform the dynamic-programming alignment step, TBA runs a stand-alone program called MULTIZ, which can be used to align highly rearranged or incompletely sequenced genomes. We describe our use of MULTIZ to produce the whole-genome multiple alignments at the Santa Cruz Genome Browser.

Subjects

Sequence Alignment/methods , Sequence Alignment/trends , Software/trends , Animals , Base Sequence , Cats , Cattle , Computational Biology/methods , Computational Biology/standards , Computational Biology/trends , Computer Simulation , Dogs , Evaluation Studies as Topic , Evolution, Molecular , Genes, Homeobox/genetics , Genes, fos/genetics , Genome , Genome, Human , Humans , Mice , Molecular Sequence Data , Multigene Family/genetics , Rats , Ribosomal Proteins/genetics , Sequence Alignment/standards

16.

Improvements in the HbVar database of human hemoglobin variants and thalassemia mutations for population and sequence variation studies.

Patrinos, George P; Giardine, Belinda; Riemer, Cathy; Miller, Webb; Chui, David H K; Anagnou, Nicholas P; Wajcman, Henri; Hardison, Ross C.

Nucleic Acids Res ; 32(Database issue): D537-41, 2004 Jan 01.

Article in English | MEDLINE | ID: mdl-14681476

ABSTRACT

HbVar (http://globin.cse.psu.edu/globin/hbvar/) is a relational database developed by a multi-center academic effort to provide up-to-date and high quality information on the genomic sequence changes leading to hemoglobin variants and all types of thalassemia and hemoglobinopathies. Extensive information is recorded for each variant and mutation, including sequence alterations, biochemical and hematological effects, associated pathology, ethnic occurrence and references. In addition to the regular updates to entries, we report two significant advances: (i) The frequencies for a large number of mutations causing beta-thalassemia in at-risk populations have been extracted from the published literature and made available for the user to query upon. (ii) HbVar has been linked with the GALA (Genome Alignment and Annotation database, available at http://globin.cse.psu.edu/gala/) so that users can combine information on hemoglobin variants and thalassemia mutations with a wide spectrum of genomic data. It also expands the capacity to view and analyze the data, using tools within GALA and the University of California at Santa Cruz (UCSC) Genome Browser.

Subjects

Databases, Genetic , Genetic Variation/genetics , Hemoglobins/genetics , Mutation/genetics , Thalassemia/genetics , Gene Frequency , Genetics, Medical , Genetics, Population , Genome, Human , Genomics , Humans , Information Storage and Retrieval , Internet , Racial Groups/genetics

17.

MultiPipMaker and supporting tools: Alignments and analysis of multiple genomic DNA sequences.

Schwartz, Scott; Elnitski, Laura; Li, Mei; Weirauch, Matt; Riemer, Cathy; Smit, Arian; Green, Eric D; Hardison, Ross C; Miller, Webb.

Nucleic Acids Res ; 31(13): 3518-24, 2003 Jul 01.

Article in English | MEDLINE | ID: mdl-12824357

ABSTRACT

Analysis of multiple sequence alignments can generate important, testable hypotheses about the phylogenetic history and cellular function of genomic sequences. We describe the MultiPipMaker server, which aligns multiple, long genomic DNA sequences quickly and with good sensitivity (available at http://bio.cse.psu.edu/ since May 2001). Alignments are computed between a contiguous reference sequence and one or more secondary sequences, which can be finished or draft sequence. The outputs include a stacked set of percent identity plots, called a MultiPip, comparing the reference sequence with subsequent sequences, and a nucleotide-level multiple alignment. New tools are provided to search MultiPipMaker output for conserved matches to a user-specified pattern and for conserved matches to position weight matrices that describe transcription factor binding sites (singly and in clusters). We illustrate the use of MultiPipMaker to identify candidate regulatory regions in WNT2 and then demonstrate by transfection assays that they are functional. Analysis of the alignments also confirms the phylogenetic inference that horses are more closely related to cats than to cows.
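
A percent identity plot, as described in this abstract, summarizes each gap-free aligned segment by its position in the reference sequence and the percentage of matching nucleotides. The Python sketch below shows one way to compute those values from a gapped pairwise alignment. It is an illustration only: the function name and the input convention (two equal-length strings with `-` for gaps) are assumptions, not MultiPipMaker's actual implementation or file format.

```python
from typing import List, Tuple

Segment = Tuple[int, int, float]  # (ref_start, ref_end, percent identity)

def percent_identity_segments(ref_aln: str, other_aln: str) -> List[Segment]:
    """Summarize each gap-free run of an alignment for a percent identity plot.

    Coordinates are 0-based, half-open positions in the ungapped reference.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have the same length")

    segments: List[Segment] = []
    ref_pos = 0      # position in the ungapped reference
    seg_start = 0    # reference position where the current run began
    matches = 0
    length = 0       # columns in the current gap-free run

    def flush() -> None:
        """Record the current run, if any, then reset the counters."""
        nonlocal matches, length
        if length:
            segments.append((seg_start, seg_start + length,
                             100.0 * matches / length))
        matches = 0
        length = 0

    for r, o in zip(ref_aln.upper(), other_aln.upper()):
        if r == "-" or o == "-":
            flush()                  # a gap in either row ends the run
        else:
            if length == 0:
                seg_start = ref_pos
            length += 1
            matches += (r == o)
        if r != "-":
            ref_pos += 1             # only reference bases advance the axis
    flush()
    return segments

if __name__ == "__main__":
    # Made-up toy alignment: one mismatch, then a gap in each row.
    for seg in percent_identity_segments("ACGTA-GT", "ACGAACG-"):
        print(seg)
```

Plotting each segment as a horizontal line from `ref_start` to `ref_end` at height `percent identity`, and stacking one such panel per secondary sequence, gives the general shape of the stacked display the abstract calls a MultiPip.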

Subjects

Genomics/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Animals , Binding Sites , Cats , Horses/classification , Horses/genetics , Internet , Phylogeny , Proto-Oncogene Proteins/genetics , Regulatory Sequences, Nucleic Acid , Transcription Factors/metabolism , Wnt2 Protein

18.

EnteriX 2003: Visualization tools for genome alignments of Enterobacteriaceae.

Florea, Liliana; McClelland, Michael; Riemer, Cathy; Schwartz, Scott; Miller, Webb.

Nucleic Acids Res ; 31(13): 3527-32, 2003 Jul 01.

Article in English | MEDLINE | ID: mdl-12824359

ABSTRACT

We describe EnteriX, a suite of three web-based visualization tools for graphically portraying alignment information from comparisons among several fixed and user-supplied sequences from related enterobacterial species, anchored on a reference genome (http://bio.cse.psu.edu/). The first visualization, Enteric, displays stacked pairwise alignments between a reference genome and each of the related bacteria, represented schematically as PIPs (Percent Identity Plots). Encoded in the views are large-scale genomic rearrangement events and functional landmarks. The second visualization, Menteric, computes and displays 1 Kb views of nucleotide-level multiple alignments of the sequences, together with annotations of genes, regulatory sites and conserved regions. The third, a Java-based tool named Maj, displays alignment information in two formats, corresponding roughly to the Enteric and Menteric views, and adds zoom-in capabilities. The uses of such tools are diverse, from examining the multiple sequence alignment to infer conserved sites with potential regulatory roles, to scrutinizing the commonalities and differences between the genomes for pathogenicity or phylogenetic studies. The EnteriX suite currently includes >15 enterobacterial genomes, generates views centered on four different anchor genomes and provides support for including user sequences in the alignments.

Subjects

Enterobacteriaceae/genetics , Bacterial Genome , Sequence Alignment/methods , DNA Sequence Analysis/methods , Software , Computer Graphics , Conserved Sequence , Bacterial DNA/analysis , Escherichia coli/genetics , Gene Components , Genomics/methods , Internet , Nucleic Acid Regulatory Sequences , Salmonella/genetics

19.

GALA, a database for genomic sequence alignments and annotations.

Giardine, Belinda; Elnitski, Laura; Riemer, Cathy; Makalowska, Izabela; Schwartz, Scott; Miller, Webb; Hardison, Ross C.

Genome Res ; 13(4): 732-41, 2003 Apr.

Article in English | MEDLINE | ID: mdl-12671007

ABSTRACT

We have developed a relational database to contain whole genome sequence alignments between human and mouse with extensive annotations of the human sequence. Complex queries are supported on recorded features, both directly and on proximity among them. Searches can reveal a wide variety of relationships, such as finding all genes expressed in a designated tissue that have a highly conserved noncoding sequence 5' to the start site. Other examples are finding single nucleotide polymorphisms that occur in conserved noncoding regions upstream of genes and identifying CpG islands that overlap the 5' ends of divergently transcribed genes. The database is available online at http://globin.cse.psu.edu/ and http://bio.cse.psu.edu/.

Subjects

Computational Biology/methods , Genetic Databases , Genomics/methods , Sequence Alignment/methods , 5' Untranslated Regions/genetics , Animals , Computational Biology/trends , CpG Islands/genetics , Gene Homology/genetics , Genetic Variation/genetics , Humans , Internet , Single Nucleotide Polymorphism/genetics

20.

PipMaker: a World Wide Web server for genomic sequence alignments.

Elnitski, Laura; Riemer, Cathy; Schwartz, Scott; Hardison, Ross; Miller, Webb.

Curr Protoc Bioinformatics ; Chapter 10: Unit 10.2, 2003 Feb.

Article in English | MEDLINE | ID: mdl-18428692

ABSTRACT

PipMaker is a World-Wide Web site used to compare two long genomic sequences and identify conserved segments between them. This unit describes the use of the PipMaker server and explains the resulting output files. PipMaker provides an efficient method of aligning genomic sequences and returns a compact, but easy-to-interpret form of output, the percent identity plot (pip). For each aligning segment between two sequences the pip shows both the position relative to the first sequence and the degree of similarity. Optional annotations on the pip provide additional information to assist in the interpretation of the alignment. The default parameters of the underlying blastz alignment program are tuned for human-mouse alignments.

Subjects

Algorithms , Chromosome Mapping/methods , Internet , Programming Languages , Sequence Alignment/methods , DNA Sequence Analysis/methods , Software , Artificial Intelligence , Automated Pattern Recognition/methods
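
The PipMaker abstract above says that, for each aligning segment, a pip records the position in the first sequence and the degree of similarity. The following is a minimal sketch of that idea, not PipMaker's own code: it assumes a pairwise alignment is already available as two equal-length gapped strings, treats each gap-free run as one segment, and does not call blastz. The function name `pip_segments` and its inputs are hypothetical.

```python
from typing import List, Tuple


def pip_segments(aligned_a: str, aligned_b: str) -> List[Tuple[int, int, float]]:
    """Return (start, end, percent_identity) for each gap-free segment.

    start and end are 1-based positions in the first (reference) sequence.
    Assumption: segments are delimited wherever either row has a gap ('-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    segments: List[Tuple[int, int, float]] = []
    pos_a = 0         # residues of the first sequence consumed so far
    seg_start = None  # 1-based start of the current gap-free segment
    matches = 0       # identical columns in the current segment
    length = 0        # columns in the current segment

    def flush() -> None:
        # Record the current segment, if any, as one horizontal pip line.
        if length:
            segments.append((seg_start, seg_start + length - 1,
                             100.0 * matches / length))

    for ca, cb in zip(aligned_a, aligned_b):
        if ca != "-":
            pos_a += 1
        if ca == "-" or cb == "-":
            flush()
            seg_start, matches, length = None, 0, 0
            continue
        if seg_start is None:
            seg_start = pos_a
        length += 1
        matches += ca.upper() == cb.upper()

    flush()
    return segments


if __name__ == "__main__":
    # Toy alignment: one mismatch, one gap in each row.
    for start, end, pct in pip_segments("ACGT-ACGTA", "ACCTTAC-TA"):
        print(f"{start}-{end}: {pct:.0f}% identity")
```

Each returned tuple corresponds to one horizontal line in a percent identity plot: its extent along the first sequence on the x-axis and its percent identity on the y-axis.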
