Results 1 - 9 of 9
1.
Nature ; 513(7518): 375-381, 2014 Sep 18.
Article in English | MEDLINE | ID: mdl-25186727

ABSTRACT

Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low diversity; and four members of the East African lineage: Neolamprologus brichardi/pulcher (older radiation, Lake Tanganyika), Metriaclima zebra (recent radiation, Lake Malawi), Pundamilia nyererei (very recent radiation, Lake Victoria), and Astatotilapia burtoni (riverine species around Lake Tanganyika). We found an excess of gene duplications in the East African lineage compared to tilapia and other teleosts, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs. In addition, we analysed sequence data from sixty individuals representing six closely related species from Lake Victoria, and show genome-wide diversifying selection on coding and regulatory variants, some of which were recruited from ancient polymorphisms. We conclude that a number of molecular mechanisms shaped East African cichlid genomes, and that amassing of standing variation during periods of relaxed purifying selection may have been important in facilitating subsequent evolutionary diversification.


Subject(s)
Cichlids/classification , Cichlids/genetics , Molecular Evolution , Genetic Speciation , Genome/genetics , Eastern Africa , Animals , DNA Transposable Elements/genetics , Gene Duplication/genetics , Gene Expression Regulation/genetics , Genomics , Lakes , MicroRNAs/genetics , Phylogeny , Genetic Polymorphism/genetics
2.
Nature ; 496(7445): 311-6, 2013 Apr 18.
Article in English | MEDLINE | ID: mdl-23598338

ABSTRACT

The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.


Subject(s)
Biological Evolution , Fishes/classification , Fishes/genetics , Genome/genetics , Animals , Genetically Modified Animals , Chick Embryo , Conserved Sequence/genetics , Genetic Enhancer Elements/genetics , Molecular Evolution , Extremities/anatomy & histology , Extremities/growth & development , Fishes/anatomy & histology , Fishes/physiology , Homeobox Genes/genetics , Genomics , Immunoglobulin M/genetics , Mice , Molecular Sequence Annotation , Molecular Sequence Data , Phylogeny , Sequence Alignment , DNA Sequence Analysis , Vertebrates/anatomy & histology , Vertebrates/genetics , Vertebrates/physiology
3.
Genome Res ; 22(11): 2270-7, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22829535

ABSTRACT

Exceptionally accurate genome reference sequences have proven to be of great value to microbial researchers. Thus, to date, about 1800 bacterial genome assemblies have been "finished" at great expense with the aid of manual laboratory and computational processes that typically iterate over a period of months or even years. By applying a new laboratory design and new assembly algorithm to 16 samples, we demonstrate that assemblies exceeding finished quality can be obtained from whole-genome shotgun data and automated computation. Cost and time requirements are thus dramatically reduced.


Subject(s)
Bacteria/genetics , Bacterial Genome , Genomic Library , DNA Sequence Analysis/methods , Algorithms
4.
Genome Res ; 22(11): 2241-9, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22800726

ABSTRACT

Eliminating the bacterial cloning step has been a major factor in the vastly improved efficiency of massively parallel sequencing approaches. However, this also has made it a technical challenge to produce the modern equivalent of the Fosmid- or BAC-end sequences that were crucial for assembling and analyzing complex genomes during the Sanger-based sequencing era. To close this technology gap, we developed Fosill, a method for converting Fosmids to Illumina-compatible jumping libraries. We constructed Fosmid libraries in vectors with Illumina primer sequences and specific nicking sites flanking the cloning site. Our family of pFosill vectors allows multiplex Fosmid cloning of end-tagged genomic fragments without physical size selection and is compatible with standard and multiplex paired-end Illumina sequencing. To excise the bulk of each cloned insert, we introduced two nicks in the vector, translated them into the inserts, and cleaved them. Recircularization of the vector via coligation of insert termini followed by inverse PCR generates a jumping library for paired-end sequencing with 101-base reads. The yield of unique Fosmid-sized jumps is sufficiently high, and the background of short, incorrectly spaced and chimeric artifacts sufficiently low, to enable applications such as mapping of structural variation and scaffolding of de novo assemblies. We demonstrate the power of Fosill to map genome rearrangements in a cancer cell line and identified three fusion genes that were corroborated by RNA-seq data. Our Fosill-powered assembly of the mouse genome has an N50 scaffold length of 17.0 Mb, rivaling the connectivity (16.9 Mb) of the Sanger-sequencing based draft assembly.


Subject(s)
Escherichia coli/genetics , Genetic Vectors/genetics , Bacterial Genome , Fungal Genome , Genomic Library , Schizosaccharomyces/genetics , DNA Sequence Analysis/methods , Animals , Gene Rearrangement , Mice , Inbred C57BL Mice
5.
Proc Natl Acad Sci U S A ; 108(4): 1513-8, 2011 Jan 25.
Article in English | MEDLINE | ID: mdl-21187386

ABSTRACT

Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd.


Subject(s)
Algorithms , Genomics/methods , DNA Sequence Analysis/methods , Software , Animals , Genome/genetics , Humans , Internet , Mice , Reproducibility of Results
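
The abstract above reports scaffold contiguity as N50 sizes (11.5 Mb for human, 7.2 Mb for mouse). As a reading aid, here is a minimal sketch of how an N50 is usually computed from a list of scaffold lengths. It assumes the common definition (the length L such that scaffolds of length at least L hold at least half of the assembled bases); the function name `n50` and the toy lengths are illustrative and are not taken from the paper or from ALLPATHS-LG.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Assumed definition: the largest length L such that scaffolds of
    length >= L together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical toy assembly, lengths in bases (not data from the paper).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 70: the two longest scaffolds hold 150 of 300 bases
```

Because N50 depends only on the length distribution, it says nothing about base accuracy, which is why the abstract reports accuracy separately.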
6.
Nat Genet ; 46(12): 1350-5, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25326702

ABSTRACT

Complete knowledge of the genetic variation in individual human genomes is a crucial foundation for understanding the etiology of disease. Genetic variation is typically characterized by sequencing individual genomes and comparing reads to a reference. Existing methods do an excellent job of detecting variants in approximately 90% of the human genome; however, calling variants in the remaining 10% of the genome (largely low-complexity sequence and segmental duplications) is challenging. To improve variant calling, we developed a new algorithm, DISCOVAR, and examined its performance on improved, low-cost sequence data. Using a newly created reference set of variants from the finished sequence of 103 randomly chosen fosmids, we find that some standard variant call sets miss up to 25% of variants. We show that the combination of new methods and improved data increases sensitivity by several fold, with the greatest impact in challenging regions of the human genome.


Subject(s)
Genetic Variation , Human Genome , Algorithms , Base Sequence , Chromosome Mapping , Gene Frequency , Genome , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction , Single Nucleotide Polymorphism , Reproducibility of Results , Sensitivity and Specificity , Software
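
The abstract above measures call sets against a reference set of variants and reports that some miss up to 25% of them. The sketch below shows the underlying arithmetic for sensitivity under a strong simplifying assumption: a variant is a (chromosome, position, ref, alt) tuple and two variants match only if the tuples are identical. The function name `sensitivity` and all variants are hypothetical; this is not the DISCOVAR evaluation procedure.

```python
def sensitivity(truth, calls):
    """Fraction of truth-set variants that also appear in the call set."""
    truth = set(truth)
    if not truth:
        raise ValueError("truth set is empty")
    return len(truth & set(calls)) / len(truth)


# Hypothetical variants keyed as (chromosome, position, ref, alt).
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "GA"),
    ("chr2", 900, "T", "C"),
}
calls = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 900, "T", "C"),
    ("chr3", 5, "A", "C"),  # extra call; does not affect sensitivity
}
print(sensitivity(truth, calls))  # 0.75, i.e. 25% of truth variants missed
```

Exact tuple matching is the simplest possible comparison; a real evaluation would first have to reconcile equivalent representations of the same indel, which this sketch does not attempt.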
7.
Gigascience ; 2(1): 10, 2013 Jul 22.
Article in English | MEDLINE | ID: mdl-23870653

ABSTRACT

BACKGROUND: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. RESULTS: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. CONCLUSIONS: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.

8.
Genome Biol ; 10(10): R103, 2009.
Article in English | MEDLINE | ID: mdl-19796385

ABSTRACT

We demonstrate that genome sequences approaching finished quality can be generated from short paired reads. Using 36 base (fragment) and 26 base (jumping) reads from five microbial genomes of varied GC composition and sizes up to 40 Mb, ALLPATHS2 generated assemblies with long, accurate contigs and scaffolds. Velvet and EULER-SR were less accurate. For example, for Escherichia coli, the fraction of 10-kb stretches that were perfect was 99.8% (ALLPATHS2), 68.7% (Velvet), and 42.1% (EULER-SR).


Subject(s)
Bacteria/genetics , Fungi/genetics , Genome/genetics , Genomics/methods , Software , Base Pairing/genetics , Reproducibility of Results
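
The accuracy figure quoted in the abstract above is the fraction of 10-kb stretches that were perfect. The abstract does not say how stretches were chosen or how bases were compared with the reference, so the sketch below makes two assumptions: windows are non-overlapping, and a per-base correctness flag has already been produced by some alignment step. The function name `perfect_window_fraction` and the toy input are hypothetical.

```python
def perfect_window_fraction(base_ok, window):
    """Fraction of non-overlapping windows in which every base is correct.

    base_ok: sequence of booleans, one per assembled base (True = matches
             the reference). How these flags are obtained is out of scope.
    window:  window length in bases (10_000 for 10-kb stretches).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    n_windows = len(base_ok) // window  # a trailing partial window is ignored
    if n_windows == 0:
        raise ValueError("sequence shorter than one window")
    perfect = sum(
        all(base_ok[i * window:(i + 1) * window]) for i in range(n_windows)
    )
    return perfect / n_windows


# Hypothetical toy input: 4 windows of 5 bases, one error in the third window.
flags = [True] * 20
flags[12] = False
print(perfect_window_fraction(flags, window=5))  # 0.75
```

A single wrong base spoils its whole window, so this metric penalizes scattered errors far more heavily than a plain per-base accuracy would.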
9.
Genome Res ; 18(5): 810-20, 2008 May.
Article in English | MEDLINE | ID: mdl-18340039

ABSTRACT

New DNA sequencing technologies deliver data at dramatically lower costs but demand new analytical methods to take full advantage of the very short reads that they produce. We provide an initial, theoretical solution to the challenge of de novo assembly from whole-genome shotgun "microreads." For 11 genomes of sizes up to 39 Mb, we generated high-quality assemblies from 80x coverage by paired 30-base simulated reads modeled after real Illumina-Solexa reads. The bacterial genomes of Campylobacter jejuni and Escherichia coli assemble optimally, yielding single perfect contigs, and larger genomes yield assemblies that are highly connected and accurate. Assemblies are presented in a graph form that retains intrinsic ambiguities such as those arising from polymorphism, thereby providing information that has been absent from previous genome assemblies. For both C. jejuni and E. coli, this assembly graph is a single edge encompassing the entire genome. Larger genomes produce more complicated graphs, but the vast majority of the bases in their assemblies are present in long edges that are nearly always perfect. We describe a general method for genome assembly that can be applied to all types of DNA sequence data, not only short read data, but also conventional sequence reads.


Subject(s)
Algorithms , Computational Biology/methods , Bacterial Genome/genetics , DNA Sequence Analysis/methods , Campylobacter jejuni/genetics , Computer Simulation , Escherichia coli/genetics , Reproducibility of Results , DNA Sequence Analysis/standards
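
To give a sense of scale for the "80x coverage by paired 30-base" reads mentioned in the abstract above, the following sketch works out how many reads that implies. It assumes the usual definition of coverage (total read bases divided by genome size) and takes the abstract's upper bound of 39 Mb as exactly 39,000,000 bases; the function names are illustrative, not from the paper.

```python
def reads_needed(genome_size, coverage, read_length):
    """Reads required so that total read bases >= coverage * genome_size.

    All arguments are positive integers; the result is rounded up.
    """
    if min(genome_size, coverage, read_length) <= 0:
        raise ValueError("all arguments must be positive")
    total_bases = coverage * genome_size
    return (total_bases + read_length - 1) // read_length  # integer ceiling


def pairs_needed(genome_size, coverage, read_length):
    """Read pairs required, counting two reads per pair."""
    return (reads_needed(genome_size, coverage, read_length) + 1) // 2


# 39 Mb is the largest genome size named in the abstract; treated here as
# 39,000,000 bases for illustration.
print(reads_needed(39_000_000, 80, 30))  # 104000000 reads
print(pairs_needed(39_000_000, 80, 30))  # 52000000 pairs
```

The read count scales inversely with read length, so at a fixed coverage a 30-base protocol needs over three times as many reads as a 100-base one.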