1.
Genome Biol ; 16: 207, 2015 Sep 24.
Article in English | MEDLINE | ID: mdl-26403281

ABSTRACT

Genome assembly projects typically run multiple algorithms in an attempt to find the single best assembly, although those assemblies often have complementary, if untapped, strengths and weaknesses. We present our metassembler algorithm that merges multiple assemblies of a genome into a single superior sequence. We apply it to the four genomes from the Assemblathon competitions and show it consistently and substantially improves the contiguity and quality of each assembly. We also develop guidelines for meta-assembly by systematically evaluating 120 permutations of merging the top 5 assemblies of the first Assemblathon competition. The software is open-source at http://metassembler.sourceforge.net .
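The figure of 120 matches the number of orderings of five items (5! = 120). A minimal sketch of enumerating those merge orders, assuming that the order in which assemblies are merged is what distinguishes one permutation from another; the labels below are hypothetical placeholders, not the names of the actual Assemblathon entries:

```python
from itertools import permutations

# Hypothetical labels standing in for the five top-ranked assemblies.
assemblies = ["A", "B", "C", "D", "E"]

# Each tuple is one candidate merge order; five items give 5! orderings.
merge_orders = list(permutations(assemblies))

print(len(merge_orders))
```

Each tuple could then be passed to a merging tool in sequence; how the tool itself consumes an ordering is not described in the abstract.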


Subjects
Genomics/methods; Software; Algorithms; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA
2.
Genome Biol ; 15(11): 506, 2014.
Article in English | MEDLINE | ID: mdl-25468217

ABSTRACT

BACKGROUND: The use of high throughput genome-sequencing technologies has uncovered a large extent of structural variation in eukaryotic genomes that makes important contributions to genomic diversity and phenotypic variation. When the genomes of different strains of a given organism are compared, whole genome resequencing data are typically aligned to an established reference sequence. However, when the reference differs in significant structural ways from the individuals under study, the analysis is often incomplete or inaccurate. RESULTS: Here, we use rice as a model to demonstrate how improvements in sequencing and assembly technology allow rapid and inexpensive de novo assembly of next generation sequence data into high-quality assemblies that can be directly compared using whole genome alignment to provide an unbiased assessment. Using this approach, we are able to accurately assess the "pan-genome" of three divergent rice varieties and document several megabases of each genome absent in the other two. CONCLUSIONS: Many of the genome-specific loci are annotated to contain genes, reflecting the potential for new biological properties that would be missed by standard reference-mapping approaches. We further provide a detailed analysis of several loci associated with agriculturally important traits, including the S5 hybrid sterility locus, the Sub1 submergence tolerance locus, the LRK gene cluster associated with improved yield, and the Pup1 cluster associated with phosphorus deficiency, illustrating the utility of our approach for biological discovery. All of the data and software are openly available to support further breeding and functional studies of rice and other species.


Subjects
Genetic Variation; Genome, Plant; Oryza/genetics; Quantitative Trait Loci/genetics; Breeding; Chromosome Mapping; High-Throughput Nucleotide Sequencing; Phenotype; Sequence Alignment
3.
Proc Natl Acad Sci U S A ; 108(37): 15294-9, 2011 Sep 13.
Article in English | MEDLINE | ID: mdl-21876154

ABSTRACT

We have entered the era of individual genomic sequencing, and can already see exponential progress in the field. It is of utmost importance to exclude false-positive variants from reported datasets. However, because of the nature of the used algorithms, this task has not been optimized to the required level of precision. This study presents a unique strategy for identifying SNPs, called COIN-VGH, that largely minimizes the presence of false-positives in the generated data. The algorithm was developed using the X-chromosome-specific regions from the previously sequenced genomes of Craig Venter and James Watson. The algorithm is based on the concept that a nucleotide can be individualized if it is analyzed in the context of its surrounding genomic sequence. COIN-VGH consists of defining the most comprehensive set of nucleotide strings of a defined length that map with 100% identity to a unique position within the human reference genome (HRG). Such set is used to retrieve sequence reads from a query genome (QG), allowing the production of a genomic landscape that represents a draft HRG-guided assembly of the QG. This landscape is analyzed for specific signatures that indicate the presence of SNPs. The fidelity of the variation signature was assessed using simulation experiments by virtually altering the HRG at defined positions. Finally, the signature regions identified in the HRG and in the QG reads are aligned and the precise nature and position of the corresponding SNPs are detected. The advantages of COIN-VGH over previous algorithms are discussed.
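The core idea described above, a set of fixed-length strings that each map to exactly one reference position and are then used to pull reads from a query genome, can be illustrated with a toy sketch. This is not the COIN-VGH implementation: the function names `unique_kmers` and `place_reads` are hypothetical, only the forward strand is considered, and only exact matches are used.

```python
from collections import Counter


def unique_kmers(reference, k):
    """Map every k-mer that occurs exactly once in `reference` to its start position."""
    starts = range(len(reference) - k + 1)
    counts = Counter(reference[i:i + k] for i in starts)
    return {
        reference[i:i + k]: i
        for i in starts
        if counts[reference[i:i + k]] == 1
    }


def place_reads(reads, anchors, k):
    """Place each read on the reference using its first exact match to a unique k-mer."""
    placements = []
    for read in reads:
        for j in range(len(read) - k + 1):
            pos = anchors.get(read[j:j + k])
            if pos is not None:
                # The anchor sits at offset j in the read, so the read starts at pos - j.
                placements.append((read, pos - j))
                break
    return placements


# Toy reference in which "ACGT" occurs twice and is therefore not a usable anchor.
reference = "ACGTACGTTGCA"
anchors = unique_kmers(reference, 4)
placements = place_reads(["GTTGCA", "ACGT"], anchors, 4)
print(placements)
```

In this toy case the read `ACGT` is left unplaced because its only 4-mer is repeated in the reference, which is the behaviour the uniqueness requirement is meant to produce. Detecting the SNP signatures mentioned in the abstract would require further steps that are not specified there.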


Subjects
Computer Simulation; Genome, Human/genetics; Nucleic Acid Hybridization/methods; Nucleotides/genetics; Polymorphism, Single Nucleotide/genetics; Chromosomes, Human, X/genetics; DNA Probes/metabolism; Humans; Reference Standards