Results 1 - 20 of 22
1.
Nature ; 463(7283): 943-7, 2010 Feb 18.
Article in English | MEDLINE | ID: mdl-20164927

ABSTRACT

The genetic structure of the indigenous hunter-gatherer peoples of southern Africa, the oldest known lineage of modern human, is important for understanding human diversity. Studies based on mitochondrial and small sets of nuclear markers have shown that these hunter-gatherers, known as Khoisan, San, or Bushmen, are genetically divergent from other humans. However, until now, fully sequenced human genomes have been limited to recently diverged populations. Here we present the complete genome sequences of an indigenous hunter-gatherer from the Kalahari Desert and a Bantu from southern Africa, as well as protein-coding regions from an additional three hunter-gatherers from disparate regions of the Kalahari. We characterize the extent of whole-genome and exome diversity among the five men, reporting 1.3 million novel DNA differences genome-wide, including 13,146 novel amino acid variants. In terms of nucleotide substitutions, the Bushmen seem to be, on average, more different from each other than, for example, a European and an Asian. Observed genomic differences between the hunter-gatherers and others may help to pinpoint genetic adaptations to an agricultural lifestyle. Adding the described variants to current databases will facilitate inclusion of southern Africans in medical research efforts, particularly when family and medical histories can be correlated with genome-wide data.


Subject(s)
Black People/genetics , Ethnicity/genetics , Genome, Human/genetics , Asian People/genetics , Exons/genetics , Genetics, Medical , Humans , Phylogeny , Polymorphism, Single Nucleotide/genetics , South Africa/ethnology , White People/genetics
2.
Nucleic Acids Res ; 42(Database issue): D1063-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24137000

ABSTRACT

HbVar (http://globin.bx.psu.edu/hbvar) is one of the oldest and most appreciated locus-specific databases launched in 2001 by a multi-center academic effort to provide timely information on the genomic alterations leading to hemoglobin variants and all types of thalassemia and hemoglobinopathies. Database records include extensive phenotypic descriptions, biochemical and hematological effects, associated pathology and ethnic occurrence, accompanied by mutation frequencies and references. Here, we report updates to >600 HbVar entries, inclusion of population-specific data for 28 populations and 27 ethnic groups for α- and β-thalassemias and additional querying options in the HbVar query page. HbVar content was also inter-connected with two other established genetic databases, namely FINDbase (http://www.findbase.org) and Leiden Open-Access Variation database (http://www.lovd.nl), which allows comparative data querying and analysis. HbVar data content has contributed to the realization of two collaborative projects to identify genomic variants that lie on different globin paralogs. Most importantly, HbVar data content has contributed to demonstrate the microattribution concept in practice. These updates significantly enriched the database content and querying potential, enhanced the database profile and data quality and broadened the inter-relation of HbVar with other databases, which should increase the already high impact of this resource to the globin and genetic database community.


Subject(s)
Databases, Nucleic Acid , Genetic Variation , Hemoglobins/genetics , Mutation , Thalassemia/genetics , Genotype , Humans , Internet , Phenotype , Thalassemia/ethnology
3.
BMC Bioinformatics ; 12 Suppl 1: S45, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21342577

ABSTRACT

BACKGROUND: Gene clusters are genetically important, but their analysis poses significant computational challenges. One of the major reasons for these difficulties is gene conversion among the duplicated regions of the cluster, which can obscure their true relationships. Many computational methods for detecting gene conversion events have been released, but their performance has not been assessed for wide deployment in evolutionary history studies due to a lack of accurate evaluation methods. RESULTS: We designed a new method that simulates gene cluster evolution, including large-scale events of duplication, deletion, and conversion as well as small mutations. We used this simulation data to evaluate several different programs for detecting gene conversion events. CONCLUSIONS: Our evaluation identifies strengths and weaknesses of several methods for detecting gene conversion, which can contribute to more accurate analysis of gene cluster evolution.
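As a rough illustration of the kind of simulation this abstract describes, the Python sketch below evolves a toy gene cluster through duplication, deletion, and conversion events plus point mutations. It is not the authors' simulator: the function names, the equal event weights, the fixed-length gene copies, and the mutation rate are all assumptions made for the example.

```python
import random

BASES = "ACGT"

def point_mutate(seq, rate, rng):
    """Substitute each base independently with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def evolve_cluster(cluster, steps, rng, mut_rate=0.01):
    """Apply at most one large-scale event per step, then point mutations.

    `cluster` is a list of equal-length gene copies (strings).
    Events are drawn with equal weight (an assumption of this sketch):
      duplication: copy one member and insert it next to the original
      deletion:    drop one member (never the last remaining one)
      conversion:  overwrite a segment of one member with the
                   corresponding segment of another member
    Returns the evolved cluster and the log of events that were applied.
    """
    cluster = list(cluster)
    log = []
    for _ in range(steps):
        event = rng.choice(["duplication", "deletion", "conversion"])
        if event == "duplication":
            i = rng.randrange(len(cluster))
            cluster.insert(i + 1, cluster[i])
            log.append(("duplication", i))
        elif event == "deletion" and len(cluster) > 1:
            i = rng.randrange(len(cluster))
            del cluster[i]
            log.append(("deletion", i))
        elif event == "conversion" and len(cluster) > 1:
            donor, acceptor = rng.sample(range(len(cluster)), 2)
            n = min(len(cluster[donor]), len(cluster[acceptor]))
            start = rng.randrange(n)
            end = rng.randrange(start, n) + 1  # half-open [start, end)
            cluster[acceptor] = (
                cluster[acceptor][:start]
                + cluster[donor][start:end]
                + cluster[acceptor][end:]
            )
            log.append(("conversion", donor, acceptor, start, end))
        cluster = [point_mutate(gene, mut_rate, rng) for gene in cluster]
    return cluster, log
```

Because the log records which conversions actually happened, the output of a detection program could in principle be scored against it, which is the general idea behind simulation-based evaluation.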


Subject(s)
Computational Biology/methods , Gene Conversion , Multigene Family , Animals , Biological Evolution , Computer Simulation , Humans , Primates/genetics , Sequence Alignment
4.
BMC Evol Biol ; 11: 226, 2011 Jul 28.
Article in English | MEDLINE | ID: mdl-21798034

ABSTRACT

BACKGROUND: Gene clusters containing multiple similar genomic regions in close proximity are of great interest for biomedical studies because of their associations with inherited diseases. However, such regions are difficult to analyze due to their structural complexity and their complicated evolutionary histories, reflecting a variety of large-scale mutational events. In particular, conversion events can mislead inferences about the relationships among these regions, as traced by traditional methods such as construction of phylogenetic trees or multi-species alignments. RESULTS: To correct the distorted information generated by such methods, we have developed an automated pipeline called CHAP (Cluster History Analysis Package) for detecting conversion events. We used this pipeline to analyze the conversion events that affected two well-studied gene clusters (α-globin and β-globin) and three gene clusters for which comparative sequence data were generated from seven primate species: CCL (chemokine ligand), IFN (interferon), and CYP2abf (part of cytochrome P450 family 2). CHAP is freely available at http://www.bx.psu.edu/miller_lab. CONCLUSIONS: These studies reveal the value of characterizing conversion events in the context of studying gene clusters in complex genomes.


Subject(s)
Gene Conversion , Multigene Family , Primates/genetics , alpha-Globins/genetics , beta-Globins/genetics , Animals , Evolution, Molecular , Genome , Humans , Molecular Sequence Data , Phylogeny , Primates/classification , Software
5.
Hum Mutat ; 28(2): 206, 2007 Feb.
Article in English | MEDLINE | ID: mdl-17221864

ABSTRACT

HbVar (http://globin.bx.psu.edu/hbvar) is a locus-specific database (LSDB) developed in 2001 by a multi-center academic effort to provide timely information on the genomic sequence changes leading to hemoglobin variants and all types of thalassemia and hemoglobinopathies. Database records include extensive phenotypic descriptions, biochemical and hematological effects, associated pathology, and ethnic occurrence, accompanied by mutation frequencies and references. In addition to the regular updates to entries, we report significant advances and updates, which can be useful not only for HbVar users but also for other LSDB development and curation in general. The query page provides more functionality but in a simpler, more user-friendly format and known single nucleotide polymorphisms in the human alpha- and beta-globin loci are provided automatically. Population-specific beta-thalassemia mutation frequencies for 31 population groups have been added and/or modified and the previously reported delta- and alpha-thalassemia mutation frequency data from 10 population groups have also been incorporated. In addition, an independent flat-file database, named XPRbase (http://www.goldenhelix.org/xprbase), has been developed and linked to the main HbVar web page to provide a succinct listing of 51 experimental protocols available for globin gene mutation screening. These updates significantly augment the database profile and quality of information provided, which should increase the already high impact of the HbVar database, while its combination with the UCSC powerful genome browser and the ITHANET web portal paves the way for drawing connections of clinical importance, that is from genome to function to phenotype.


Subject(s)
Databases, Nucleic Acid , Genetic Variation , Hemoglobins/genetics , Mutation , Thalassemia/genetics , DNA Mutational Analysis/methods , Genetic Testing/methods , Humans , Multigene Family
6.
Hum Mutat ; 28(6): 554-62, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17326095

ABSTRACT

PhenCode (Phenotypes for ENCODE; http://www.bx.psu.edu/phencode) is a collaborative, exploratory project to help understand phenotypes of human mutations in the context of sequence and functional data from genome projects. Currently, it connects human phenotype and clinical data in various locus-specific databases (LSDBs) with data on genome sequences, evolutionary history, and function from the ENCODE project and other resources in the UCSC Genome Browser. Initially, we focused on a few selected LSDBs covering genes encoding alpha- and beta-globins (HBA, HBB), phenylalanine hydroxylase (PAH), blood group antigens (various genes), androgen receptor (AR), cystic fibrosis transmembrane conductance regulator (CFTR), and Bruton's tyrosine kinase (BTK), but we plan to include additional loci of clinical importance, ultimately genomewide. We have also imported variant data and associated OMIM links from Swiss-Prot. Users can find interesting mutations in the UCSC Genome Browser (in a new Locus Variants track) and follow links back to the LSDBs for more detailed information. Alternatively, they can start with queries on mutations or phenotypes at an LSDB and then display the results at the Genome Browser to view complementary information such as functional data (e.g., chromatin modifications and protein binding from the ENCODE consortium), evolutionary constraint, regulatory potential, and/or any other tracks they choose. We present several examples illustrating the power of these connections for exploring phenotypes associated with functional elements, and for identifying genomic data that could help to explain clinical phenotypes.


Subject(s)
Databases, Genetic , Mutation , Phenotype , Agammaglobulinaemia Tyrosine Kinase , Blood Group Antigens/genetics , Cooperative Behavior , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Databases, Genetic/standards , Genotype , Globins/genetics , Humans , Internet , Phenylalanine Hydroxylase/genetics , Protein-Tyrosine Kinases/genetics , Receptors, Androgen/genetics , Software Design , Systems Integration
7.
Nucleic Acids Res ; 33(Database issue): D466-70, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608239

ABSTRACT

We describe improvements to two databases that give access to information on genomic sequence similarities, functional elements in DNA and experimental results that demonstrate those functions. GALA, the database of Genome ALignments and Annotations, is now a set of interlinked relational databases for five vertebrate species, human, chimpanzee, mouse, rat and chicken. For each species, GALA records pairwise and multiple sequence alignments, scores derived from those alignments that reflect the likelihood of being under purifying selection or being a regulatory element, and extensive annotations such as genes, gene expression patterns and transcription factor binding sites. The user interface supports simple and complex queries, including operations such as subtraction and intersections as well as clustering and finding elements in proximity to features. dbERGE II, the database of Experimental Results on Gene Expression, contains experimental data from a variety of functional assays. Both databases are now run on the DB2 database management system. Improved hardware and tuning has reduced response times and increased querying capacity, while simplified query interfaces will help direct new users through the querying process. Links are available at http://www.bx.psu.edu/.
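The clustering and proximity operations mentioned in this abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions: GALA's actual query engine runs inside a relational database, and the function names, the half-open `(start, end)` interval convention, and the single-chromosome simplification used here are hypothetical.

```python
def cluster(intervals, max_gap):
    """Group intervals separated by gaps of at most `max_gap` bases.

    Intervals are half-open (start, end) pairs on one chromosome.
    Returns a list of groups, each a list of intervals in sorted order.
    """
    groups = []
    group_end = None
    for interval in sorted(intervals):
        if groups and interval[0] - group_end <= max_gap:
            groups[-1].append(interval)
            group_end = max(group_end, interval[1])
        else:
            groups.append([interval])
            group_end = interval[1]
    return groups

def near(elements, features, max_dist):
    """Return the elements lying within `max_dist` bases of any feature.

    Overlapping intervals are treated as being at distance 0.
    """
    def distance(x, y):
        return max(0, x[0] - y[1], y[0] - x[1])

    return [
        e for e in elements
        if any(distance(e, f) <= max_dist for f in features)
    ]
```

For example, `near(binding_sites, genes, 1000)` would keep only the binding sites within 1 kb of some gene, where both arguments are lists of intervals supplied by the caller.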


Subject(s)
Databases, Nucleic Acid , Genomics , Sequence Alignment , Animals , Chickens/genetics , Database Management Systems , Humans , Mice , Pan troglodytes/genetics , Rats , User-Computer Interface
8.
Nucleic Acids Res ; 31(13): 3527-32, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824359

ABSTRACT

We describe EnteriX, a suite of three web-based visualization tools for graphically portraying alignment information from comparisons among several fixed and user-supplied sequences from related enterobacterial species, anchored on a reference genome (http://bio.cse.psu.edu/). The first visualization, Enteric, displays stacked pairwise alignments between a reference genome and each of the related bacteria, represented schematically as PIPs (Percent Identity Plots). Encoded in the views are large-scale genomic rearrangement events and functional landmarks. The second visualization, Menteric, computes and displays 1 Kb views of nucleotide-level multiple alignments of the sequences, together with annotations of genes, regulatory sites and conserved regions. The third, a Java-based tool named Maj, displays alignment information in two formats, corresponding roughly to the Enteric and Menteric views, and adds zoom-in capabilities. The uses of such tools are diverse, from examining the multiple sequence alignment to infer conserved sites with potential regulatory roles, to scrutinizing the commonalities and differences between the genomes for pathogenicity or phylogenetic studies. The EnteriX suite currently includes >15 enterobacterial genomes, generates views centered on four different anchor genomes and provides support for including user sequences in the alignments.


Subject(s)
Enterobacteriaceae/genetics , Genome, Bacterial , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Computer Graphics , Conserved Sequence , DNA, Bacterial/analysis , Escherichia coli/genetics , Gene Components , Genomics/methods , Internet , Regulatory Sequences, Nucleic Acid , Salmonella/genetics
9.
Nucleic Acids Res ; 32(Database issue): D537-41, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681476

ABSTRACT

HbVar (http://globin.cse.psu.edu/globin/hbvar/) is a relational database developed by a multi-center academic effort to provide up-to-date and high quality information on the genomic sequence changes leading to hemoglobin variants and all types of thalassemia and hemoglobinopathies. Extensive information is recorded for each variant and mutation, including sequence alterations, biochemical and hematological effects, associated pathology, ethnic occurrence and references. In addition to the regular updates to entries, we report two significant advances: (i) The frequencies for a large number of mutations causing beta-thalassemia in at-risk populations have been extracted from the published literature and made available for the user to query upon. (ii) HbVar has been linked with the GALA (Genome Alignment and Annotation database, available at http://globin.cse.psu.edu/gala/) so that users can combine information on hemoglobin variants and thalassemia mutations with a wide spectrum of genomic data. It also expands the capacity to view and analyze the data, using tools within GALA and the University of California at Santa Cruz (UCSC) Genome Browser.


Subject(s)
Databases, Genetic , Genetic Variation/genetics , Hemoglobins/genetics , Mutation/genetics , Thalassemia/genetics , Gene Frequency , Genetics, Medical , Genetics, Population , Genome, Human , Genomics , Humans , Information Storage and Retrieval , Internet , Racial Groups/genetics
10.
Nucleic Acids Res ; 31(13): 3518-24, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824357

ABSTRACT

Analysis of multiple sequence alignments can generate important, testable hypotheses about the phylogenetic history and cellular function of genomic sequences. We describe the MultiPipMaker server, which aligns multiple, long genomic DNA sequences quickly and with good sensitivity (available at http://bio.cse.psu.edu/ since May 2001). Alignments are computed between a contiguous reference sequence and one or more secondary sequences, which can be finished or draft sequence. The outputs include a stacked set of percent identity plots, called a MultiPip, comparing the reference sequence with subsequent sequences, and a nucleotide-level multiple alignment. New tools are provided to search MultiPipMaker output for conserved matches to a user-specified pattern and for conserved matches to position weight matrices that describe transcription factor binding sites (singly and in clusters). We illustrate the use of MultiPipMaker to identify candidate regulatory regions in WNT2 and then demonstrate by transfection assays that they are functional. Analysis of the alignments also confirms the phylogenetic inference that horses are more closely related to cats than to cows.
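To make the position weight matrix search mentioned in this abstract more concrete, here is a minimal Python sketch of scanning one sequence with a log-odds matrix. It is not MultiPipMaker's implementation: the function names, the pseudocount, the uniform background, and the threshold are assumptions, and the conservation filter across aligned sequences that the abstract describes is left out.

```python
import math

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores.

    `counts` is a list with one dict per motif position, mapping a base
    to the number of times it was observed there.
    """
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return matrix

def scan(seq, matrix, threshold):
    """Return (position, score) for every window scoring >= threshold.

    Positions are 0-based offsets into `seq`; windows containing
    characters other than A, C, G, T are skipped.
    """
    width = len(matrix)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if any(b not in "ACGT" for b in window):
            continue
        score = sum(matrix[j][b] for j, b in enumerate(window))
        if score >= threshold:
            hits.append((i, score))
    return hits
```

A conserved-match search would additionally require the corresponding window in each aligned secondary sequence to pass the same threshold; that step depends on the alignment format and is not shown.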


Subject(s)
Genomics/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Animals , Binding Sites , Cats , Horses/classification , Horses/genetics , Internet , Phylogeny , Proto-Oncogene Proteins/genetics , Regulatory Sequences, Nucleic Acid , Transcription Factors/metabolism , Wnt2 Protein
11.
Hum Mutat ; 19(3): 225-33, 2002 Mar.
Article in English | MEDLINE | ID: mdl-11857738

ABSTRACT

We have constructed a relational database of hemoglobin variants and thalassemia mutations, called HbVar, which can be accessed on the web at http://globin.cse.psu.edu. Extensive information is recorded for each variant and mutation, including a description of the variant and associated pathology, hematology, electrophoretic mobility, methods of isolation, stability information, ethnic occurrence, structure studies, functional studies, and references. The initial information was derived from books by Dr. Titus Huisman and colleagues [Huisman et al., 1996, 1997, 1998]. The current database is updated regularly with the addition of new data and corrections to previous data. Queries can be formulated based on fields in the database. Tables of common categories of variants, such as all those involving the alpha1-globin gene (HBA1) or all those that result in high oxygen affinity, are maintained by automated queries on the database. Users can formulate more precise queries, such as identifying "all beta-globin variants associated with instability and found in Scottish populations." This new database should be useful for clinical diagnosis as well as in fundamental studies of hemoglobin biochemistry, globin gene regulation, and human sequence variation at these loci.
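The example query quoted in this abstract ("all beta-globin variants associated with instability and found in Scottish populations") maps naturally onto a relational join. The Python sketch below uses SQLite to show the idea. The two-table schema, the column names, and the demo rows are all hypothetical; they are not HbVar's actual schema or data.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE variant (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            gene TEXT NOT NULL,
            stability TEXT
        );
        CREATE TABLE occurrence (
            variant_id INTEGER NOT NULL REFERENCES variant(id),
            population TEXT NOT NULL
        );
        """
    )
    # Placeholder rows only; these are not real HbVar records.
    conn.executemany(
        "INSERT INTO variant (id, name, gene, stability) VALUES (?, ?, ?, ?)",
        [
            (1, "demo-variant-1", "HBB", "unstable"),
            (2, "demo-variant-2", "HBB", "normal"),
            (3, "demo-variant-3", "HBA1", "unstable"),
        ],
    )
    conn.executemany(
        "INSERT INTO occurrence (variant_id, population) VALUES (?, ?)",
        [(1, "Scottish"), (2, "Scottish"), (3, "Scottish"), (1, "Irish")],
    )
    return conn

def find_variants(conn, gene, stability, population):
    """Names of variants in `gene` with the given stability and population."""
    rows = conn.execute(
        """
        SELECT DISTINCT v.name
        FROM variant AS v
        JOIN occurrence AS o ON o.variant_id = v.id
        WHERE v.gene = ? AND v.stability = ? AND o.population = ?
        ORDER BY v.name
        """,
        (gene, stability, population),
    ).fetchall()
    return [name for (name,) in rows]
```

With the demo rows, `find_variants(build_demo_db(), "HBB", "unstable", "Scottish")` selects the single placeholder variant that satisfies all three conditions. Parameter placeholders (`?`) are used so that user-supplied search terms are never spliced into the SQL text.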


Subject(s)
Databases, Genetic , Databases, Protein , Genetic Variation/genetics , Globins/genetics , Hemoglobins/genetics , Internet , Mutation/genetics , Thalassemia/genetics , Genetic Techniques , Globins/physiology , Hemoglobins/physiology , Humans , Mutation/physiology
12.
Gigascience ; 2(1): 17, 2013 Dec 30.
Article in English | MEDLINE | ID: mdl-24377391

ABSTRACT

BACKGROUND: Intra-species genetic variation can be used to investigate population structure, selection, and gene flow in non-model vertebrates; and due to the plummeting costs for genome sequencing, it is now possible for small labs to obtain full-genome variation data from their species of interest. However, those labs may not have easy access to, and familiarity with, computational tools to analyze those data. RESULTS: We have created a suite of tools for the Galaxy web server aimed at handling nucleotide and amino-acid polymorphisms discovered by full-genome sequencing of several individuals of the same species, or using a SNP genotyping microarray. In addition to providing user-friendly tools, a main goal is to make published analyses reproducible. While most of the examples discussed in this paper deal with nuclear-genome diversity in non-human vertebrates, we also illustrate the application of the tools to fungal genomes, human biomedical data, and mitochondrial sequences. CONCLUSIONS: This project illustrates that a small group can design, implement, test, document, and distribute a Galaxy tool collection to meet the needs of a particular community of biologists.

13.
Curr Protoc Bioinformatics ; Chapter 15: 15.2.1-15.2.27, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22948727

ABSTRACT

This unit focuses on some of the tools available on the public Galaxy server that are useful for exploring possible associations between human genetic variants and phenotypes. We trace step-by-step through an example illustrating several methods for examining a single full-coverage genome to look for single-nucleotide polymorphisms (SNPs) that are either known to be associated with disease or suspected to have impact for other reasons. It makes use of public genomic data, tools designed specifically for working with variants, and also some general tools for text manipulation and operations on genomic coordinates.


Subject(s)
Phenotype , Polymorphism, Single Nucleotide , Software , Genetic Variation , Genome, Human , Humans
14.
Genome Biol Evol ; 4(4): 586-601, 2012.
Article in English | MEDLINE | ID: mdl-22454131

ABSTRACT

Many software tools for comparative analysis of genomic sequence data have been released in recent decades. Despite this, it remains challenging to determine evolutionary relationships in gene clusters due to their complex histories involving duplications, deletions, inversions, and conversions. One concept describing these relationships is orthology. Orthologs derive from a common ancestor by speciation, in contrast to paralogs, which derive from duplication. Discriminating orthologs from paralogs is a necessary step in most multispecies sequence analyses, but doing so accurately is impeded by the occurrence of gene conversion events. We propose a refined method of orthology assignment based on two paradigms for interpreting its definition: by genomic context or by sequence content. X-orthology (based on context) traces orthology resulting from speciation and duplication only, while N-orthology (based on content) includes the influence of conversion events. We developed a computational method for automatically mapping both types of orthology on a per-nucleotide basis in gene cluster regions studied by comparative sequencing, and we make this mapping accessible by visualizing the output. All of these steps are incorporated into our newly extended CHAP 2 package. We evaluate our method using both simulated data and real gene clusters (including the well-characterized α-globin and β-globin clusters). We also illustrate use of CHAP 2 by analyzing four more loci: CCL (chemokine ligand), IFN (interferon), CYP2abf (part of cytochrome P450 family 2), and KIR (killer cell immunoglobulin-like receptors). These new methods facilitate and extend our understanding of evolution at these and other loci by adding automated accurate evolutionary inference to the biologist's toolkit. The CHAP 2 package is freely available from http://www.bx.psu.edu/miller_lab.


Subject(s)
Evolution, Molecular , Mammals/genetics , Multigene Family , Proteins/genetics , Animals , Gene Conversion , Gene Duplication , Genome , Humans , Mammals/classification , Phylogeny
15.
Nat Genet ; 43(4): 295-301, 2011 Mar 20.
Article in English | MEDLINE | ID: mdl-21423179

ABSTRACT

We developed a series of interrelated locus-specific databases to store all published and unpublished genetic variation related to hemoglobinopathies and thalassemia and implemented microattribution to encourage submission of unpublished observations of genetic variation to these public repositories. A total of 1,941 unique genetic variants in 37 genes, encoding globins and other erythroid proteins, are currently documented in these databases, with reciprocal attribution of microcitations to data contributors. Our project provides the first example of implementing microattribution to incentivise submission of all known genetic variation in a defined system. It has demonstrably increased the reporting of human variants, leading to a comprehensive online resource for systematically describing human genetic variation in the globin genes and other genes contributing to hemoglobinopathies and thalassemias. The principles established here will serve as a model for other systems and for the analysis of other common and/or complex human genetic diseases.


Subject(s)
Databases, Genetic , Genetic Variation , Hemoglobinopathies/genetics , Base Sequence , DNA/genetics , Data Mining , Genome, Human , Hemoglobins/genetics , Human Genome Project , Humans , Molecular Sequence Data , Mutation , Promoter Regions, Genetic , Publishing
16.
Curr Protoc Bioinformatics ; Chapter 10: 10.4.1-10.4.14, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20521245

ABSTRACT

The MultiPipMaker World Wide Web server (http://www.bx.psu.edu) provides a tool for aligning multiple DNA sequences and visualizing regions of conservation among them. This unit describes its use and gives an explanation of the resulting output files and supporting tools. Features provided by the server include alignment of up to 20 very long genomic sequences, output choices of a true, nucleotide-level multiple alignment and/or stacked, pairwise percent identity plots, and support for user-specified annotations of genomic features and arbitrary regions, with clickable links to additional information. Input sequences other than the reference can be fragmented, unordered, and unoriented.


Subject(s)
Computational Biology/methods , Internet , Sequence Alignment/methods , Software , Base Sequence , Genome/genetics , Guidelines as Topic
17.
Curr Protoc Bioinformatics ; Chapter 10: Unit10.4, 2005 Apr.
Article in English | MEDLINE | ID: mdl-18428743

ABSTRACT

The MultiPipMaker World Wide Web server (http://www.bx.psu.edu) provides a useful tool for aligning multiple sequences and visualizing regions of conservation between them. This unit describes the use of the MultiPipMaker server and gives an explanation of the resulting output files and supporting tools. Features provided by the server include alignment of up to 20 very long genomic sequences, output choices of a true, nucleotide-level multiple alignment or stacked, pairwise percent identity plots, and user-specified annotations for genomic features and elements of choice, with clickable links to additional information. Alignments can include unordered, unoriented secondary sequences.


Subject(s)
Algorithms , Internet , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , User-Computer Interface , Base Sequence , Computer Graphics , Molecular Sequence Data
18.
Genome Res ; 15(10): 1451-5, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16169926

ABSTRACT

Accessing and analyzing the exponentially expanding genomic sequence and functional data pose a challenge for biomedical researchers. Here we describe an interactive system, Galaxy, that combines the power of existing genome annotation databases with a simple Web portal to enable users to search remote resources, combine data from independent queries, and visualize the results. The heart of Galaxy is a flexible history system that stores the queries from each user; performs operations such as intersections, unions, and subtractions; and links to other computational tools. Galaxy can be accessed at http://g2.bx.psu.edu.
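The intersection, union, and subtraction operations named in this abstract act on sets of genomic intervals. The Python sketch below shows one simple way to compute them. It is an illustration only, not Galaxy's code: the function names, the half-open `(start, end)` convention, and the restriction to a single chromosome are assumptions.

```python
def merge(intervals):
    """Sort intervals and fuse any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def union(a, b):
    """Regions covered by a, by b, or by both."""
    return merge(list(a) + list(b))

def intersect(a, b):
    """Regions covered by both a and b."""
    a, b = merge(a), merge(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def subtract(a, b):
    """Regions covered by a but not by b."""
    b = merge(b)
    out = []
    for start, end in merge(a):
        cursor = start
        for b_start, b_end in b:
            if b_end <= cursor:
                continue
            if b_start >= end:
                break
            if b_start > cursor:
                out.append((cursor, b_start))
            cursor = max(cursor, b_end)
            if cursor >= end:
                break
        if cursor < end:
            out.append((cursor, end))
    return out
```

With `a = [(0, 10), (20, 30)]` and `b = [(5, 25)]`, the union is `[(0, 30)]`, the intersection is `[(5, 10), (20, 25)]`, and `subtract(a, b)` is `[(0, 5), (25, 30)]`. A history system in the abstract's sense could store each such result as a new dataset so that later operations can be chained on it.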


Subject(s)
Databases, Genetic , Genome , Biological Evolution , Internet , Promoter Regions, Genetic
19.
Genomics ; 80(6): 681-90, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12504859

ABSTRACT

Sequence conservation between species is useful both for locating coding regions of genes and for identifying functional noncoding segments. Hence interspecies alignment of genomic sequences is an important computational technique. However, its utility is limited without extensive annotation. We describe a suite of software tools, PipTools, and related programs that facilitate the annotation of genes and putative regulatory elements in pairwise alignments. The alignment server PipMaker uses the output of these tools to display detailed information needed to interpret alignments. These programs are provided in a portable format for use on common desktop computers and both the toolkit and the PipMaker server can be found at our Web site (http://bio.cse.psu.edu/). We illustrate the utility of the toolkit using annotation of a pairwise comparison of the mouse MHC class II and class III regions with orthologous human sequences and subsequently identify conserved, noncoding sequences that are DNase I hypersensitive sites in chromatin of mouse cells.


Subject(s)
Sequence Alignment/methods , Software , Animals , Conserved Sequence/genetics , DNA/genetics , Databases, Nucleic Acid , Humans , Mice , Repetitive Sequences, Nucleic Acid/genetics
20.
Curr Protoc Bioinformatics ; Chapter 10: Unit 10.2, 2003 Feb.
Article in English | MEDLINE | ID: mdl-18428692

ABSTRACT

PipMaker is a World-Wide Web site used to compare two long genomic sequences and identify conserved segments between them. This unit describes the use of the PipMaker server and explains the resulting output files. PipMaker provides an efficient method of aligning genomic sequences and returns a compact, but easy-to-interpret form of output, the percent identity plot (pip). For each aligning segment between two sequences the pip shows both the position relative to the first sequence and the degree of similarity. Optional annotations on the pip provide additional information to assist in the interpretation of the alignment. The default parameters of the underlying blastz alignment program are tuned for human-mouse alignments.
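The data behind a percent identity plot can be sketched in a few lines of Python: split a gapped pairwise alignment into gap-free segments and report, for each one, its position in the first sequence and its percent identity. This is a simplified illustration, not PipMaker's code. It assumes a single alignment given as two equal-length gapped strings with `-` for gaps, whereas the server works from local alignments produced by blastz; the function name and the 1-based inclusive coordinates are also assumptions.

```python
def pip_segments(aligned_a, aligned_b):
    """Return (start_in_a, end_in_a, percent_identity) per gap-free segment.

    Coordinates are 1-based and inclusive, counted along the ungapped
    first sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    segments = []
    pos_a = 0            # bases of the first sequence consumed so far
    seg_start = None
    matches = length = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            # A gap column closes the current segment, if any.
            if length:
                segments.append((seg_start, pos_a, 100.0 * matches / length))
                matches = length = 0
            if x != "-":
                pos_a += 1
            continue
        pos_a += 1
        if length == 0:
            seg_start = pos_a
        length += 1
        matches += x.upper() == y.upper()
    if length:
        segments.append((seg_start, pos_a, 100.0 * matches / length))
    return segments
```

Each returned tuple corresponds to one horizontal line in a pip: its horizontal extent is the span in the first sequence and its height is the percent identity.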


Subject(s)
Algorithms , Chromosome Mapping/methods , Internet , Programming Languages , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Artificial Intelligence , Pattern Recognition, Automated/methods