1.
Methods Mol Biol ; 537: 39-64, 2009.
Article in English | MEDLINE | ID: mdl-19378139

ABSTRACT

Multiple alignment of DNA sequences is an important step in various molecular biological analyses. As a large amount of sequence data is becoming available through genome and other large-scale sequencing projects, scalability, as well as accuracy, is currently required for a multiple sequence alignment (MSA) program. In this chapter, we outline the algorithms of an MSA program MAFFT and provide practical advice, focusing on several typical situations a biologist sometimes faces. For genome alignment, which is beyond the scope of MAFFT, we introduce two tools: TBA and MAUVE.


Subjects
Algorithms , Sequence Alignment/methods , DNA Sequence Analysis/methods , Software , Animals , Base Sequence , Computational Biology/methods , Humans , Molecular Sequence Data
2.
Nat Biotechnol ; 37(5): 555-560, 2019 05.
Article in English | MEDLINE | ID: mdl-30858580

ABSTRACT

Standardized benchmarking approaches are required to assess the accuracy of variants called from sequence data. Although variant-calling tools and the metrics used to assess their performance continue to improve, important challenges remain. Here, as part of the Global Alliance for Genomics and Health (GA4GH), we present a benchmarking framework for variant calling. We provide guidance on how to match variant calls with different representations, define standard performance metrics, and stratify performance by variant type and genome context. We describe limitations of high-confidence calls and regions that can be used as truth sets (for example, single-nucleotide variant concordance of two methods is 99.7% inside versus 76.5% outside high-confidence regions). Our web-based app enables comparison of variant calls against truth sets to obtain a standardized performance report. Our approach has been piloted in the PrecisionFDA variant-calling challenges to identify the best-in-class variant-calling methods within high-confidence regions. Finally, we recommend a set of best practices for using our tools and evaluating the results.
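
As a rough illustration of the performance metrics mentioned in this abstract, the sketch below computes precision, recall, and F1 from a truth set and a query call set. It assumes variants have already been normalized to a single representation and can be compared by exact match on a `(chrom, pos, ref, alt)` tuple; that simplification, the function name `benchmark_metrics`, and the example variants are all hypothetical and are not taken from the GA4GH tools.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def benchmark_metrics(truth: Set[Variant], query: Set[Variant]) -> Dict[str, float]:
    """Compare query calls against a truth set by exact match.

    Assumes both sets use the same variant representation; real
    benchmarking must first reconcile differing representations.
    """
    tp = len(truth & query)   # calls present in both sets
    fp = len(query - truth)   # calls absent from the truth set
    fn = len(truth - query)   # truth variants that were not called

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up example data, for illustration only.
truth_calls = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
               ("chr2", 50, "G", "A"), ("chr2", 75, "T", "C")}
query_calls = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
               ("chr2", 50, "G", "A"), ("chr3", 10, "A", "T")}

metrics = benchmark_metrics(truth_calls, query_calls)
```

Stratifying by variant type or genome context, as the abstract describes, would amount to running the same comparison on subsets of both sets.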


Subjects
Benchmarking , Exome/genetics , Human Genome/genetics , High-Throughput Nucleotide Sequencing , Algorithms , Genomics/trends , Germ Cells , Humans , Single Nucleotide Polymorphism/genetics , Software
3.
Nat Biotechnol ; 37(5): 567, 2019 05.
Article in English | MEDLINE | ID: mdl-30899106

ABSTRACT

In the version of this article initially published online, two pairs of headings were switched with each other in Table 4: "Recall (PCR free)" was switched with "Recall (with PCR)," and "Precision (PCR free)" was switched with "Precision (with PCR)." The error has been corrected in the print, PDF and HTML versions of this article.

4.
Sci Transl Med ; 8(335): 335ps10, 2016 04 20.
Article in English | MEDLINE | ID: mdl-27099173

ABSTRACT

Next-generation sequencing technologies are fueling a wave of new diagnostic tests. Progress on a key set of nine research challenge areas will help generate the knowledge required to advance effectively these diagnostics to the clinic.


Subjects
High-Throughput Nucleotide Sequencing/methods , Informatics/methods , Single Nucleotide Polymorphism/genetics , Precision Medicine/methods
5.
Genome Res ; 17(6): 760-74, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17567995

ABSTRACT

A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization.


Subjects
Molecular Evolution , Human Genome , Mammals/genetics , Open Reading Frames , Phylogeny , Sequence Alignment , Animals , Human Genome Project , Humans
6.
Genome Res ; 15(7): 901-13, 2005 Jul.
Article in English | MEDLINE | ID: mdl-15965027

ABSTRACT

Comparisons of orthologous genomic DNA sequences can be used to characterize regions that have been subject to purifying selection and are enriched for functional elements. We here present the results of such an analysis on an alignment of sequences from 29 mammalian species. The alignment captures approximately 3.9 neutral substitutions per site and spans approximately 1.9 Mbp of the human genome. We identify constrained elements from 3 bp to over 1 kbp in length, covering approximately 5.5% of the human locus. Our estimate for the total amount of nonexonic constraint experienced by this locus is roughly twice that for exonic constraint. Constrained elements tend to cluster, and we identify large constrained regions that correspond well with known functional elements. While constraint density inversely correlates with mobile element density, we also show the presence of unambiguously constrained elements overlapping mammalian ancestral repeats. In addition, we describe a number of elements in this region that have undergone intense purifying selection throughout mammalian evolution, and we show that these important elements are more numerous than previously thought. These results were obtained with Genomic Evolutionary Rate Profiling (GERP), a statistically rigorous and biologically transparent framework for constrained element identification. GERP identifies regions at high resolution that exhibit nucleotide substitution deficits, and measures these deficits as "rejected substitutions". Rejected substitutions reflect the intensity of past purifying selection and are used to rank and characterize constrained elements. We anticipate that GERP and the types of analyses it facilitates will provide further insights and improved annotation for the human genome as mammalian genome sequence data become richer.
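
The "rejected substitutions" idea in this abstract can be sketched as a per-site deficit: the number of substitutions expected under neutrality minus the number observed. The snippet below is only a minimal illustration under that reading. It assumes the expected and observed per-site values have already been estimated elsewhere (the estimation from a tree and alignment is not shown), and the function names and numbers are hypothetical rather than GERP's actual interface or data.

```python
from typing import List, Sequence


def rejected_substitutions(expected: Sequence[float],
                           observed: Sequence[float]) -> List[float]:
    """Per-site substitution deficit: expected (neutral) minus observed."""
    if len(expected) != len(observed):
        raise ValueError("expected and observed must have the same length")
    return [e - o for e, o in zip(expected, observed)]


def element_score(expected: Sequence[float],
                  observed: Sequence[float]) -> float:
    """Total deficit across the sites of one candidate element."""
    return sum(rejected_substitutions(expected, observed))


# Made-up values for a three-site element, for illustration only.
neutral = [3.9, 3.9, 3.9]
seen = [0.5, 3.9, 1.4]

per_site = rejected_substitutions(neutral, seen)
total = element_score(neutral, seen)
```

A larger total indicates a larger substitution deficit, which is the quantity the abstract says is used to rank constrained elements.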


Subjects
Molecular Evolution , Genome , Mammals/genetics , Animals , Base Sequence , Computational Biology/methods , Conserved Sequence , Cystic Fibrosis Transmembrane Conductance Regulator , Gene Components , Gene Expression Profiling/methods , Humans , Interspersed Repetitive Sequences , Molecular Sequence Data , Sensitivity and Specificity , Sequence Alignment , DNA Sequence Analysis , Nucleic Acid Sequence Homology